I want to get: <module><java>Client.jar</java></module> out of something like this: <module><java>Other.jar</java></module><module><java>Client.jar</java></module><module><java>Other.jar</java></module>
I tried this regular expression: regexp="(<module(.)*?Client.jar(.)*?module>)"
But that gives me this: <module><java>Other.jar</java></module><module><java>Client.jar</java></module>
So the ending is good in that the reluctant quantifier stops after the first </module> is found. But what do I do at the beginning so it starts with the <module> before the client jar?
Note: This is from Ant regular expressions, but it should be similar enough to Java's.
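For anyone who wants to reproduce this outside Ant, here is a minimal Java sketch of the problem. The class and method names are made up for illustration, and the pattern is the one from the question with the redundant parens dropped and the dot escaped. It assumes Ant's regexp behaves like java.util.regex here, which may not hold for every Ant regexp implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Ant or of the original post.
class ReluctantStartDemo {
    static final String INPUT =
        "<module><java>Other.jar</java></module>"
      + "<module><java>Client.jar</java></module>"
      + "<module><java>Other.jar</java></module>";

    // Returns the first match of the regex in INPUT, or null if there is none.
    static String firstMatch(String regex) {
        Matcher m = Pattern.compile(regex).matcher(INPUT);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The engine tries the leftmost starting position first, so the match
        // begins at the very first <module>. The reluctant quantifier only
        // controls how far the match extends to the right.
        System.out.println(firstMatch("<module.*?Client\\.jar.*?module>"));
    }
}
```

The printed match starts at the first Other.jar module and ends at the </module> after Client.jar, which is the result described above.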
[edited to disable smilies] [ September 15, 2005: Message edited by: Jeanne Boyarsky ]
Is there some reason why you can't just do this:
BTW, never use (.)* in a regex; it's incredibly inefficient and doesn't do anything useful. Either put the asterisk inside the parens or get rid of the parens.
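A small Java sketch of what the parens placement changes, with invented class and method names. With the asterisk outside the parens the group is re-captured on every repetition, so only the last character survives. With the asterisk inside, the group holds the whole run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration only.
class GroupPlacementDemo {
    // Returns capture group 1 if the regex matches the whole input, else null.
    static String group1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(group1("(.)*", "abc")); // prints c
        System.out.println(group1("(.*)", "abc")); // prints abc
    }
}
```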
Alan, thanks for the note about the redundant parens. I can't do something that simple, because what I posted is a simplified version of what I am trying to match. The module tag has an attribute whose value is unknown to the code running the regular expression.
Ken, Thanks for the lead. It didn't work as is in Ant, but the language might be slightly different. The final regular expression that did work is:
It looks like the key is doing a greedy match before the expression.
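To illustrate the greedy-prefix idea in Java: the pattern below is an assumption for demonstration, not necessarily the exact expression that worked in Ant, and the class name is made up. The leading greedy .* first swallows the whole input and then backs off, so the group starts at the last <module that can still be followed by Client.jar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern is an assumed reconstruction.
class GreedyPrefixDemo {
    // Greedy prefix outside the group, reluctant quantifiers inside it.
    static final Pattern CLIENT_MODULE =
        Pattern.compile(".*(<module.*?Client\\.jar.*?module>)");

    // Returns the module element containing Client.jar, or null if none.
    static String clientModule(String input) {
        Matcher m = CLIENT_MODULE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input =
            "<module><java>Other.jar</java></module>"
          + "<module><java>Client.jar</java></module>"
          + "<module><java>Other.jar</java></module>";
        System.out.println(clientModule(input));
    }
}
```

If the input spans several lines, the dot would also need to match line terminators (Pattern.DOTALL in Java). If Client.jar appears in more than one module, this sketch picks the last one.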
Originally posted by Ken Blair: EDIT: Remove the Client.jar to get something more generic that you can use in a Pattern to find each one and not just one with a Client.jar in it.
Definitely! Luckily, I know how to do that part. I just didn't want to complicate the question with it. (That and it involves Ant variables and wouldn't belong in JiG.)
Here's a generic way to match a single XML element:
Just replace "module" with the name of the tag you want to find (in Ant, you should be able to replace it with a variable). If the element can be nested within itself, this will only find the innermost one.
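One way such a generic pattern could look in Java is sketched below. This is an assumed reconstruction, not necessarily the expression from the post above, and the class and method names are invented. It uses a negative lookahead so the body can never run across another opening tag of the same name, which is why a nested element yields only the innermost match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the pattern is an illustrative assumption.
class ElementMatcherDemo {
    // Builds a pattern for one <tag ...>...</tag> element with no nested <tag inside.
    static Pattern elementPattern(String tag) {
        String t = Pattern.quote(tag);
        return Pattern.compile(
            "<" + t + "\\b[^>]*>"          // opening tag, attributes allowed
          + "(?:(?!<" + t + "\\b).)*?"     // body: any char that does not start another opening tag
          + "</" + t + ">",                // closing tag
            Pattern.DOTALL);
    }

    // Collects every element of the given tag name found in the input.
    static List<String> findAll(String tag, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = elementPattern(tag).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input =
            "<module id=\"a\"><java>Other.jar</java></module>"
          + "<module id=\"b\"><java>Client.jar</java></module>";
        for (String element : findAll("module", input)) {
            if (element.contains("Client.jar")) {
                System.out.println(element);
            }
        }
    }
}
```

Filtering the matches with contains keeps the regex generic. The sketch assumes attribute values never contain a > character.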
Ken, I'm sorry to say your suggestion won't work in any regex flavor. The square brackets define a character class, which only matches one character. The initial caret means "complement of", so [^(module)] matches any single character that is not one of '(', 'm', 'o', 'd', 'u', 'l', 'e', or ')'.
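A quick Java check of the character-class behavior described above, using an invented class name. The class matches exactly one character, and only if that character is not one of the listed ones.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for illustration only.
class CharClassDemo {
    // True if the whole input is matched by the negated character class.
    static boolean matchesWhole(String input) {
        return Pattern.matches("[^(module)]", input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("x"));      // true: one character, not in the set
        System.out.println(matchesWhole("m"));      // false: 'm' is in the set
        System.out.println(matchesWhole("("));      // false: '(' is in the set too
        System.out.println(matchesWhole("xy"));     // false: the class matches one character only
        System.out.println(matchesWhole("module")); // false: not treated as a word
    }
}
```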