Hard times with simple regex search mechanism

Michel Legris
Greenhorn

Joined: Feb 21, 2012
Posts: 19
1-
String Str= "abc xxxx def";
String Rgx="abc .*? def";
System.out.println("Result="+Str.matches(Rgx));
--> This returns true... this is exactly what I want, but it's NOT working all the time

2-
String Str= "[abc] xxxx def"; //the original string contains [ ]
String Rgx="[abc] .*? def";
System.out.println("Result="+Str.matches(Rgx));
--> *** THIS RETURNS FALSE INSTEAD OF TRUE AS EXPECTED ***

3-
String Str= "{abc xxxx def"; //the original string contains a { character
String Rgx="{abc .*? def";
System.out.println("Result="+Str.matches(Rgx));
--> *** THIS THROWS AN EXCEPTION *** : java.util.regex.PatternSyntaxException: Illegal repetition

I need a way to put wildcards inside a string and get the correct result, without wrong answers or exceptions.
(The string can be **anything**; the search expression is the same string with one or MORE parts replaced by wildcards, as in my examples.)
Does anybody know the proper way to do this kind of text search with regular expressions?

HELP!!!
Darryl Burke
Bartender

Joined: May 03, 2008
Posts: 4547
    

Don't guess at regex syntax; that just won't work. Here are a couple of good learning resources:
  • Oracle tutorial: Regular Expressions
  • Regular-Expressions.info


    luck, db
    There are no new questions, but there may be new answers.
    Michel Legris
    Greenhorn

    Joined: Feb 21, 2012
    Posts: 19
    Uh...

    The main problem is dealing with reserved regex words in my search expression. Is escaping all possible "special" characters the only option? There are tons of those characters and this approach makes no sense. The string is also provided by the user in my application and can be anything (abc, def and xxxx can be anything).

    So nobody can explain how this could be done? I have already spent several hours trying different things and analyzing the docs. Does nobody know what the expression should be, how to manipulate the string to prevent problems with special characters, or how to make my 3 examples work?
    Henry Wong
    author
    Sheriff

    Joined: Sep 28, 2004
    Posts: 18855
        

    Michel Legris wrote:
    The main problem is dealing with reserved regex words in my search expression. Is escaping all possible "special" characters the only option? There are tons of those characters and this approach makes no sense. The string is also provided by the user in my application and can be anything (abc, def and xxxx can be anything).

    So nobody can explain how this could be done? I have already spent several hours trying different things and analyzing the docs. Does nobody know what the expression should be, how to manipulate the string to prevent problems with special characters, or how to make my 3 examples work?


    Not sure of the issue... Of course you need to escape the special characters. How do you expect the regex engine to magically know which ones are supposed to be interpreted and which ones are supposed to be ignored?

    The regex engine does provide a method to turn off all special characters, the Pattern.quote() method, but there isn't one to turn off some but not others. If you want that, you will need to parse the expression and turn them off yourself (likely with the Pattern.quote() method on substrings).
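
A minimal sketch of that suggestion, applied to the three examples from the first post. The class and method names (QuoteLiteralParts, matchesWithGap) are made up for illustration; only Pattern.quote() and String.matches() come from the JDK. The literal text on each side of the wildcard is quoted, and ".*?" is the only part left for the engine to interpret.

```java
import java.util.regex.Pattern;

public class QuoteLiteralParts {

    // Hypothetical helper: quote both literal parts, leave ".*?" as live regex.
    static boolean matchesWithGap(String input, String before, String after) {
        String regex = Pattern.quote(before) + ".*?" + Pattern.quote(after);
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // The three cases from the first post
        System.out.println(matchesWithGap("abc xxxx def", "abc ", " def"));
        System.out.println(matchesWithGap("[abc] xxxx def", "[abc] ", " def"));
        System.out.println(matchesWithGap("{abc xxxx def", "{abc ", " def"));

        // Quoting the WHOLE expression also turns ".*?" into literal text,
        // so the wildcard stops being a wildcard.
        String everythingQuoted = Pattern.quote("[abc] .*? def");
        System.out.println("[abc] xxxx def".matches(everythingQuoted));
    }
}
```

The first three calls should print true and the last one false, since a fully quoted expression only matches the literal text "[abc] .*? def".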

    Henry

    Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
    Michel Legris
    Greenhorn

    Joined: Feb 21, 2012
    Posts: 19
    It finally works!!!

    I had already tried Pattern.quote, but the problem was that I was quoting the entire expression, wildcard included... duh... (makes no sense)

    The solution is :
    String expression=Pattern.quote(FirstPart)+".*" +Pattern.quote(SecondPart);

    The first and second parts are obtained by splitting on our homebrew wildcard {*} and taking substrings to extract the text before and after the wildcard.

    So Pattern.quote() was what I was looking for!
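
For templates with more than one {*}, the same idea might be generalized roughly as below. This is only a sketch, not the poster's actual code: the class name WildcardPattern and its methods are hypothetical, and it assumes the wildcard token is exactly the three characters {*}.

```java
import java.util.regex.Pattern;

public class WildcardPattern {

    // Assumed wildcard token, taken from the description in the post.
    static final String WILDCARD = "{*}";

    // Hypothetical helper: quote every literal chunk, join the chunks with ".*".
    static Pattern compile(String template) {
        StringBuilder regex = new StringBuilder();
        int start = 0;
        int idx;
        while ((idx = template.indexOf(WILDCARD, start)) != -1) {
            regex.append(Pattern.quote(template.substring(start, idx)));
            regex.append(".*");
            start = idx + WILDCARD.length();
        }
        // Trailing literal text (possibly empty) after the last wildcard
        regex.append(Pattern.quote(template.substring(start)));
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String template, String text) {
        return compile(template).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc] {*} def", "[abc] xxxx def"));
        System.out.println(matches("{abc {*} def", "{abc xxxx def"));
        System.out.println(matches("a.b{*}c+d{*}e", "a.bXXc+dYYe"));
        System.out.println(matches("a.b{*}c+d{*}e", "aXbXXc+dYYe"));
    }
}
```

Scanning with indexOf() rather than String.split() sidesteps the fact that split() itself takes a regex, in which {*} would need escaping. Note that "." does not match line terminators by default; if the text can span lines, Pattern.DOTALL could be passed to Pattern.compile().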

    A big big thanks to Henry Wong
    Darryl Burke
    Bartender

    Joined: May 03, 2008
    Posts: 4547
        

    Michel Legris wrote:The main problem is dealing with reserved regex words in my search expression.

    Why should that be a problem?

    Michel Legris wrote:Is escaping all possible "special" characters the only option?

    No, but you would have known that if you'd gone through the learning resources I gave you links for.

    Michel Legris wrote:There are tons of those characters

    No, there are only a few.

    Michel Legris wrote:and this approach makes no sense.

    So use whatever approach does make sense.

    Michel Legris wrote:The string is also provided by the user in my application and can be anything (abc, def and xxxx can be anything)

    The special sequences for start and end of quotation could be useful there.

    Michel Legris wrote:So nobody can explain how this could be done? I have already spent several hours trying different things and analyzing the docs.

    Let's see some of those attempts, improved by what you learn from those two tutorials and the Pattern API, and we can take it from there.

    Michel Legris wrote:Does nobody know what the expression should be, how to manipulate the string to prevent problems with special characters, or how to make my 3 examples work?

    I hope you're not implying that someone should just hand you a solution. You wouldn't learn anything that way, and would be back asking for more of the same the next time you bump into a situation that requires a regex solution. Don't you think it's more worthwhile to progress toward a better understanding and a solution that's all your own?
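
The "start and end of quotation" sequences mentioned in this reply are \Q and \E in java.util.regex. A small sketch of writing them by hand follows; the class and helper names are hypothetical, and the helper assumes the literal parts never contain "\E" themselves (Pattern.quote() is the safer choice for arbitrary user input).

```java
public class QuoteSequences {

    // Hypothetical helper: wrap each literal part in \Q...\E by hand.
    // Assumption: neither 'before' nor 'after' contains the two characters \E.
    static String gapRegex(String before, String after) {
        return "\\Q" + before + "\\E.*?\\Q" + after + "\\E";
    }

    public static void main(String[] args) {
        String regex = gapRegex("[abc] ", " def");
        System.out.println(regex);
        System.out.println("[abc] xxxx def".matches(regex));
        System.out.println("{abc xxxx def".matches(gapRegex("{abc ", " def")));
    }
}
```

Everything between \Q and \E is taken literally, so the brackets and the brace no longer act as a character class or a repetition.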
    Henry Wong
    author
    Sheriff

    Joined: Sep 28, 2004
    Posts: 18855
        

    Darryl Burke wrote:
    Michel Legris wrote:The string is also provided by the user in my application and can be anything (abc, def and xxxx can be anything)

    The special sequences for start and end of quotation could be useful there.


    As an FYI... This is actually what is done internally by the Pattern.quote() method. See this code....



    which generates this output ....

    .*?
    \Q.*?\E
    \Q\Q.*?\E\\E\Q\E
    \Q\Q\Q.*?\E\\E\Q\\E\\E\Q\Q\E\\E\Q\E

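
The snippet referred to above did not survive in this copy of the thread. The four output lines are consistent with applying Pattern.quote() zero, one, two and three times to ".*?", so a reconstruction might look like the sketch below. It is a guess at the shape of the code, not the original, and the names RepeatedQuote and quoteTimes are made up.

```java
import java.util.regex.Pattern;

public class RepeatedQuote {

    // Hypothetical helper: apply Pattern.quote() to its own output 'times' times.
    static String quoteTimes(String s, int times) {
        String result = s;
        for (int i = 0; i < times; i++) {
            result = Pattern.quote(result);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the unquoted regex, then the result of quoting it 1, 2 and 3 times
        for (int i = 0; i <= 3; i++) {
            System.out.println(quoteTimes(".*?", i));
        }
    }
}
```

Each quoted string, used as a regex, should match the previous string literally, which is one way to check that the messy-looking third result is still correct.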

    As you can see...

  • 1. All it does is add a "\Q" before and a "\E" after the regex.
  • 2. Unless the regex has an "\E", in which case it will use "\Q" and "\E" to quote the components around the "\E", and use the backslash to escape the "\E".
  • 3. Unless the regex has an escaped "\E", meaning an "\\E", in which case, it seems to fail miserably.


  • [EDIT: In retrospect, I think the third one is correct. It just looks like a mess.]

    Henry

    Michel Legris
    Greenhorn

    Joined: Feb 21, 2012
    Posts: 19
    Thanks again Henry for the precision.

    >How do you expect the regex engine to magically know which ones are supposed to be interpreted and which ones are supposed to be ignored?

    This is the only thing I needed to realise the mistake in my previous attempts to use quote(). It really makes sense, but it never came to my mind to split the expression in 3 parts and apply quote only to the non-regex parts.
    Stephan van Hulst
    Bartender

    Joined: Sep 20, 2010
    Posts: 3647
        

    I'm curious though, can you tell us why you're using regular expressions in the first place? In general, I find that people tend to overuse them once they gain an understanding of them, when plain old simple Java code is more appropriate and more readable.
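
For comparison, the single-wildcard case from the first post could be written without any regex, roughly as sketched here. The names are hypothetical, and this only covers one wildcard between a fixed prefix and a fixed suffix; several wildcards would need more bookkeeping.

```java
public class PlainWildcard {

    // Hypothetical helper: "before, then anything, then after" without a regex.
    static boolean matchesWithGap(String text, String before, String after) {
        // The length check stops the prefix and the suffix from overlapping.
        return text.length() >= before.length() + after.length()
                && text.startsWith(before)
                && text.endsWith(after);
    }

    public static void main(String[] args) {
        System.out.println(matchesWithGap("[abc] xxxx def", "[abc] ", " def"));
        System.out.println(matchesWithGap("{abc xxxx def", "{abc ", " def"));
        System.out.println(matchesWithGap("abc def", "abc ", " def"));
    }
}
```

Since no pattern is compiled, characters such as [ or { need no special treatment here.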
    Michel Legris
    Greenhorn

    Joined: Feb 21, 2012
    Posts: 19
    It's a very particular use... I'm creating an automated testing tool. In my application the user can override the title of the window, like "Notepad - {*}" or "Modify Client #{*} - {*} blablabla". This way users can adapt the control recognition by text to their needs. A plain Java solution cannot be used, and it's the only part where I need regular expressions.

    BTW I'm not a newbie in regular expressions; I worked for 10 years in Perl, mostly doing things based on regular expressions. But that was a long time ago and I have forgotten almost everything since.

    Good news is that everything works perfectly now!
     