Anything wrong with this regular expression?

Zee Ho
Ranch Hand

Joined: Jul 20, 2004
Posts: 128
Rancher :
I want to use Apache RE to parse a string like

and get
XXX : YYY aaa : bbb as the result (XXX and YYY can be any characters, even Japanese or Chinese).

I wrote the regular expression like (<span>([^:]* : [^:]*)</span> *
but it only gives me aaa : bbb. This confuses me a lot. What is the right way to implement this?


SCJP 1.4, SCWCD 1.3, SCJD, SCBCD, IBM XML Cert in progress
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18826
    

Hard to tell what the Regex is -- since you forgot to turn off the smilies. But from what I can read, it looks like you are using a quantifier at the end.

Anyway... When a subgroup matches multiple times in a single match because of a quantifier, only the last match is retained. If you want all matches of a particular subgroup, I suggest you remove the quantifier and iterate through the input with the find() method.

[EDIT: Sorry, just realized you are not using the built-in Regex engine. Hopefully, Apache has an equivalent that allows the iteration.]
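
To illustrate the point about repeated groups, here is a minimal sketch using the built-in java.util.regex engine rather than Apache RE. The sample input, the class name SpanPairs, and the method names are assumptions for illustration, because the sample string in the first post did not survive. It shows a quantified group keeping only its last capture, and find() collecting every capture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are not from the original thread.
class SpanPairs {
    // One <span> element holding "key : value", with no quantifier on the outer group.
    private static final Pattern SPAN =
            Pattern.compile("<span>([^:]* : [^:]*)</span>");

    // Collects group 1 of every successive match found by find().
    static List<String> findAll(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher m = SPAN.matcher(input);
        while (m.find()) {
            pairs.add(m.group(1));
        }
        return pairs;
    }

    // Reproduces the symptom: a group inside a repeated group keeps only its final iteration.
    static String lastOnly(String input) {
        Matcher m = Pattern.compile("(?:<span>([^:]* : [^:]*)</span>)*").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input, modelled on the result described in the first post.
        String input = "<span>XXX : YYY</span><span>aaa : bbb</span>";
        System.out.println(lastOnly(input)); // aaa : bbb
        System.out.println(findAll(input));  // [XXX : YYY, aaa : bbb]
    }
}
```

Because [^:] excludes only the colon, the same pattern accepts Japanese or Chinese text on either side. Whether Apache RE offers an equivalent way to iterate is not something this sketch covers.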

Henry
[ May 11, 2006: Message edited by: Henry Wong ]

Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Zee Ho
Ranch Hand

Joined: Jul 20, 2004
Posts: 128


I still get only bbb : ccc.

How do I get both?
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18826
    

Please turn off the smilies -- it is a checkbox at the very bottom of the post page.

Henry
Zee Ho
Ranch Hand

Joined: Jul 20, 2004
Posts: 128
I resolved the problem with this regular expression:

[^<]* : [^>]*

Any better ideas?
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18826
    

Originally posted by Zee Ho:
I resolved the problem with this regular expression:

[^<]* : [^>]*

Any better ideas?


It would depend on how strict you want the search to be. For example, if you want exactly one ":", and it must be between "<span>" and "</span>", then you could do this...



If you also want to disallow "<" and ">" inside the span, while at the same time preventing a match where the ":" is missing, then you could do this...
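
The code samples from this reply did not survive, so the following is only a guess at patterns that fit the two descriptions, written for java.util.regex rather than Apache RE. The class name StrictSpanPairs, the constant names, and the test input are assumptions for illustration, not Henry's original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the patterns are guesses that fit the descriptions above.
class StrictSpanPairs {
    // Exactly one ":" between <span> and </span>; the text may still contain '<' or '>'.
    static final Pattern ONE_COLON =
            Pattern.compile("<span>([^:]* : [^:]*)</span>");

    // Same, but '<', '>' and ':' are all excluded from the text, so a match
    // cannot run across a neighbouring span that has no colon of its own.
    static final Pattern NO_TAGS =
            Pattern.compile("<span>([^<>:]* : [^<>:]*)</span>");

    // Returns group 1 of every match of the given pattern.
    static List<String> findAll(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input: the first span is missing its colon.
        String input = "<span>no colon</span><span>aaa : bbb</span>";
        System.out.println(findAll(ONE_COLON, input)); // [no colon</span><span>aaa : bbb]
        System.out.println(findAll(NO_TAGS, input));   // [aaa : bbb]
    }
}
```

The first pattern swallows the colon-less span because [^:] happily matches the tag characters; excluding "<" and ">" keeps each match inside a single span.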



Henry
 