
[regex help]Find EXACT characters in a String.

Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Ok, i got this so far ...



With that code I get a match, but I ONLY want a match if the String matches the EXACT characters in the Pattern ...

In this case I have 'w', 'a', 't', 'i'. 'w', 'a' and 't' are in "water", but 'i' is NOT, so I don't want a match!

I don't know how to accomplish this.

Hope you can help me, thanks in advance.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42371
What this regexp does is to check whether any of the characters w, a, t and i occurs anywhere in the string - which w, a and t do, so there's a match.

If you want to check that the complete string consists of nothing but those characters, then you need to a) specify that the complete string should be matched, and b) that you want to match more than a single character.

Read the java.util.regex.Pattern javadocs about the special characters "^", "$" and "*". (A little hint: Each of those is needed exactly once in the regexp.)
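
To make the difference concrete, here is a minimal sketch. It assumes the pattern in the first post was `[wati]`, tested against "water" (the original code did not survive in this copy of the thread, so that is inferred from the replies). The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class OnlyTheseChars {
    // Unanchored: find() succeeds if ANY one character of the class occurs anywhere.
    static final Pattern ANY = Pattern.compile("[wati]");
    // Anchored and repeated: the whole string must consist only of w, a, t, i.
    static final Pattern ONLY = Pattern.compile("^[wati]*$");

    static boolean containsAny(String s) {
        return ANY.matcher(s).find();
    }

    static boolean consistsOnlyOf(String s) {
        return ONLY.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAny("water"));    // true: w, a and t occur
        System.out.println(consistsOnlyOf("water")); // false: e and r are not in the class
        System.out.println(consistsOnlyOf("wat"));   // true
    }
}
```

The anchored version uses each of "^", "$" and "*" exactly once, which is one way to read the hint above.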


Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Yes, I know a little about regex, and I also read this some time ago:

http://java.sun.com/docs/books/tutorial/essential/regex/index.html

I'm asking here because I searched on Yahoo a lot and read some articles, and I still couldn't do it ... if you know how to do it, can you please tell me?

Thanks.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42371
I'd prefer if you tried to find out yourself; you'll learn more that way, and that's really what JavaRanch is about.

From what you've read, what do you understand about the 3 special characters I mentioned?
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

In general we don't just hand out answers on a plate.

Essentially any regex tutorial explains what the "[]" characters do in a regex--re-examining their purpose may help you understand why your regex isn't doing what you want.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
hi David Newton, i was just giving an example with [], so everyone who can help me will quickly understand my problem

i think i got it, thank you very much Ulf Dittmer



One more question ... is that how it has to be, or could it be improved?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42371
It had better get improved, as it doesn't work :-) That expression doesn't report "Found" for "wat", for example.

Removing the square brackets was a wrong move - you need those. But judging by how you arranged the "$" and "*" characters I don't think you understand yet what those do.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Hmm, it doesn't work quite right either.


That should be a positive match, because it has the 'r', 'e' and 't' of "water" ... this is so frustrating.
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
I don't know how useful this will be to you, but when I have trouble building a regular expression I draw its equivalent, a finite state automaton. Then I reduce it if possible and convert it to whatever language's regex syntax I need.

It is not very difficult and can produce regexs for some very complex patterns easily.


"Computer science is no more about computers than astronomy is about telescopes" - Edsger Dijkstra
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

Using a visual regex tool can help, too. I usually use regex coach, but that's out of habit--I'm sure there are a million others. I don't know if it handles Java regex syntax, but if it doesn't, it's probably close enough to work it out.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Ulf Dittmer wrote:It had better get improved, as it doesn't work :-) That expression doesn't report "Found" for "wat", for example.

Removing the square brackets was a wrong move - you need those. But judging by how you arranged the "$" and "*" characters I don't think you understand yet what those do.


Do you really have an answer ?

i asked the same question @ the official Sun forum, and ppl say things like this ...

"... there's no syntax in regular expressions to accomplish this. Try turning your string into a character array, sorting it, and then do a simple comparison ..."
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
Quiddo Quitch wrote:
Ulf Dittmer wrote:It had better get improved, as it doesn't work :-) That expression doesn't report "Found" for "wat", for example.

Removing the square brackets was a wrong move - you need those. But judging by how you arranged the "$" and "*" characters I don't think you understand yet what those do.


Do you really have an answer ?

i asked the same question @ the official Sun forum, and ppl say things like this ...

"... there's no syntax in regular expressions to accomplish this. Try turning your string into a character array, sorting it, and then do a simple comparison ..."


Whoever said that is wrong. You can take that suggestion, but it is an inefficient solution.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Yes, it's an inefficient solution for me too.

I want so badly to see it work ...
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42371
Yes, I have a solution. The regexp is 9 characters long, 6 of which you already had in your first post. The other 3 I mentioned in my reply.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Ulf Dittmer wrote:Yes, I have a solution. The regexp is 9 characters long, 6 of which you already had in your first post. The other 3 I mentioned in my reply.


lol! I hope you are not making fun of me ... did you try it? For example, does searching for "ret" give you a positive result?

I tried every possible combination of what you said.

I know what ^ and * mean, but I don't quite know what $ does; I searched on Yahoo and the information about $ was very poor.
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
Quiddo Quitch wrote:
Ulf Dittmer wrote:Yes, I have a solution. The regexp is 9 characters long, 6 of which you already had in your first post. The other 3 I mentioned in my reply.


lol! I hope you are not making fun of me ... did you try it? For example, does searching for "ret" give you a positive result?

I tried every possible combination of what you said.

I know what ^ and * mean, but I don't quite know what $ does; I searched on Yahoo and the information about $ was very poor.


Ulf is not making fun of you, he is trying to point you in the right direction. It is against the spirit of the message board to give out answers because you won't learn anything that way.

Have you gone through this?

I struggled with regular expressions until I studied automata theory. Knowing the theoretical background is very useful despite what some say.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42371
Quiddo Quitch wrote:For example, does searching for "ret" give you a positive result?

Well, that sounds like a different problem. What I had understood was that the string being matched should consist only of characters in the pattern; but apparently that's not what you're asking?

You're looking to match strings that include all characters in the pattern at least once, but can also contain other characters?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39478
No need to search Google or anything: just search the Java™ Tutorials.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
^ inside a [] means NOT

[^abc] = search everything EXCEPT a,b,c,

so it HAS to be here

^[]

* to find matches one or more time and it goes to the end of the RE

^[]*

$ I don't know quite well; does it match the end of something?
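
For reference, a small sketch of what these characters do in `java.util.regex`. Note that "^" has two different roles (negation inside "[]", start-of-input anchor outside), and that "*" means zero or more, not one or more. The helper name `found` is only for illustration.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: true if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Inside brackets, ^ negates: any character EXCEPT a, b, c.
        System.out.println(found("[^abc]", "abc"));  // false
        System.out.println(found("[^abc]", "abcd")); // true, 'd' qualifies

        // Outside brackets, ^ anchors to the start and $ to the end of the input.
        System.out.println(found("^b", "abc"));      // false, input starts with 'a'
        System.out.println(found("c$", "abc"));      // true, input ends with 'c'

        // * is "zero or more", + is "one or more".
        System.out.println(found("^a*$", ""));       // true
        System.out.println(found("^a+$", ""));       // false
    }
}
```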
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Campbell Ritchie wrote:No need to search Google or anything: just search the Java™ Tutorials.


I did; it's in my second post, no luck there ... I'll keep reading anyway ...
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
Quiddo Quitch wrote:
Campbell Ritchie wrote:No need to search Google or anything: just search the Java™ Tutorials.


I did; it's in my second post, no luck there ... I'll keep reading anyway ...


It talks specifically about $, what didn't you understand?
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Ulf Dittmer wrote:
Quiddo Quitch wrote:For example, does searching for "ret" give you a positive result?

Well, that sounds like a different problem. What I had understood was that the string being matched should consist only of characters in the pattern; but apparently that's not what you're asking?

You're looking to match strings that include all characters in the pattern at least once, but can also contain other characters?


yes ...

I want to match a String that contains all of those characters, regardless of order or amount, BUT if a character is not in the string there should be NO match ...

for example :

t match "water"
ret match "water"
retaw match "water"
wret match "water"
BUT
'i' DOESNT match "water" because 'i' is not in the string "water"
watzer DOESNT match "water" because 'z' is not in the string "water"

That is how the regex should work for what I need ...


Can it be done in regex?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39478
Quiddo Quitch wrote:I did; it's in my second post, no luck there ... I'll keep reading anyway ...
You will find $ in this section.
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
Yes it can be done.

Show what you have and explain what you think it means and maybe we can move you forward.
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Campbell Ritchie wrote:
Quiddo Quitch wrote:I did; it's in my second post, no luck there ... I'll keep reading anyway ...
You will find $ in this section.


Damn, I didn't see it, thanks!

By the way, any advice? Should I leave regex alone and do it in a simpler manner?
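
For comparison, a non-regex version is short. This is only a sketch of the sort-and-compare idea quoted from the Sun forum, assuming the goal is "every character of the candidate occurs in the target". Unlike a plain membership test, it also counts duplicates, so "ww" is rejected for "water". The names `containsAll`, `target` and `candidate` are invented here.

```java
import java.util.Arrays;

class SortedContainment {
    // True if every character of candidate occurs in target,
    // at least as many times as it occurs in candidate.
    static boolean containsAll(String target, String candidate) {
        char[] t = target.toCharArray();
        char[] c = candidate.toCharArray();
        Arrays.sort(t);
        Arrays.sort(c);
        int i = 0; // position in the sorted target
        for (char ch : c) {
            // Skip target characters that sort before the one we need.
            while (i < t.length && t[i] < ch) {
                i++;
            }
            if (i == t.length || t[i] != ch) {
                return false; // character missing (or used up)
            }
            i++; // consume the matched target character
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsAll("water", "ret"));    // true
        System.out.println(containsAll("water", "watzer")); // false, no 'z'
        System.out.println(containsAll("water", "ww"));     // false, only one 'w'
    }
}
```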
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Rusty Shackleford wrote:Yes it can be done.

Show what you have and explain what you think it means and maybe we can move you forward.


What I have is in the first post, and what I think is above too. If your knowledge of regex is advanced, please PLEASE help me a little ... I asked the same question in 5 different forums and nobody was even close, not even in the official forum.

I tried everything, I'm so tired.
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
It looks like you are blindly trying combinations, which will not work.

Write out a general algorithm of the problem and then apply it to your specific case

Read first character.

1. If character does not match one of these characters [ <insert whatever characters here> ] then stop with failure, else go to 2

2. If character does match one of the characters in step 1 then ???

3. Read next character, then ??? What if you read all characters and are in this state? From here you stop with success, or go to another state.
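
One possible way to fill in the "???" steps, written as a plain loop rather than a regex. This is a sketch under the earlier reading of the problem (the string may consist only of the allowed characters); the method and parameter names are hypothetical.

```java
class AllowedCharsScan {
    // Walks the input one character at a time, like a two-state automaton:
    // "still OK" (accepting) and "failed" (reject immediately).
    static boolean consistsOnlyOf(String input, String allowed) {
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            // Step 1: a character outside the allowed set stops with failure.
            if (allowed.indexOf(ch) < 0) {
                return false;
            }
            // Steps 2 and 3: otherwise stay in the accepting state
            // and read the next character.
        }
        // All characters were read without leaving the accepting state.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(consistsOnlyOf("wat", "wati"));   // true
        System.out.println(consistsOnlyOf("water", "wati")); // false, 'e' is not allowed
    }
}
```

The loop corresponds to a character class followed by "*" between the two anchors: one state, looping on every allowed character.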
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18916

Quiddo Quitch wrote:for example :

t match "water"
ret match "water"
retaw match "water"
wret match "water"
BUT
'i' DOESNT match "water" because 'i' is not in the string "water"
watzer DOESNT match "water" because 'z' is not in the string "water"


Basically, you want all the characters in the first string to appear in the second string.

Quiddo Quitch wrote:Do you really have an answer ?

i asked the same question @ the official Sun forum, and ppl say things like this ...

"... there's no syntax in regular expressions to accomplish this. Try turning your string into a character array, sorting it, and then do a simple comparison ..."


Regex doesn't do AND operations very well -- so I am not surprised if someone told you this can't be done with a single regex match. However, there is a (single regex) solution here... but, as Rusty mentioned...

Rusty Shackleford wrote:It looks like you are blindly trying combinations, which will not work.


The solution is not simple (it requires zero length look aheads). And based on what you have done so far, you probably need to read up on your regex a bit more to understand it. I can't even give you a hint in the right direction...

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Henry Wong wrote:
for example :

t match "water"
ret match "water"
retaw match "water"
wret match "water"
BUT
'i' DOESNT match "water" because 'i' is not in the string "water"
watzer DOESNT match "water" because 'z' is not in the string "water"


Basically, you want all the characters in the first string to appear in the second string.

Do you really have an answer ?

i asked the same question @ the official Sun forum, and ppl say things like this ...

"... there's no syntax in regular expressions to accomplish this. Try turning your string into a character array, sorting it, and then do a simple comparison ..."


Regex doesn't do AND operations very well -- so I am not surprised if someone told you this can't be done with a single regex match. However, there is a (single regex) solution here... but, as Rusty mentioned...

It looks like you are blindly trying combinations, which will not work.


The solution is not simple (it requires zero length look aheads). And based on what you have done so far, you probably need to read up on your regex a bit more to understand it. I can't even give you a hint in the right direction...

Henry


Why can't you give me a hint?

So, in summary, should I do it with character arrays?
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18916

Quiddo Quitch wrote:Why can't you give me a hint?


Because I don't know how to explain it... short of asking you to read up on regex. Anyway, here is your solution with a regex that works.



Henry
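
The code from this post did not survive in this copy of the thread. What follows is not Henry's code, only a sketch of the zero-length lookahead technique he names, built around the `(?=.*w)` fragment quoted in the reply below. It assumes one lookahead per character of the search string, applied to the target string; `Pattern.quote` is added so that characters such as "." are treated literally. All names are invented.

```java
import java.util.regex.Pattern;

class LookaheadAll {
    // Builds one lookahead per character, e.g. "ret" gives three groups
    // of the form (?=.*X), each asserting "X occurs somewhere ahead".
    static String buildRegex(String letters) {
        StringBuilder sb = new StringBuilder();
        for (char c : letters.toCharArray()) {
            sb.append("(?=.*")
              .append(Pattern.quote(String.valueOf(c)))
              .append(")");
        }
        return sb.toString();
    }

    // True if every character of letters occurs at least once in target.
    static boolean allLettersIn(String letters, String target) {
        // lookingAt() starts at index 0, so each lookahead scans the whole target.
        return Pattern.compile(buildRegex(letters)).matcher(target).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(allLettersIn("ret", "water"));    // true
        System.out.println(allLettersIn("watzer", "water")); // false, no 'z'
        System.out.println(allLettersIn("wwww", "water"));   // true, occurrences are not counted
    }
}
```

Because lookaheads consume no input, each one is checked independently from the same position, which is how a single regex can express an AND of several conditions.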
Quiddo Quitch
Ranch Hand

Joined: Apr 21, 2008
Posts: 38
Henry Wong wrote:
why you cant give me a hint ?


Because I don't know how to explain it... short of asking you to read up on regex. Anyway, here is your solution with a regex that works.



Henry


OMG !!! that is almost what i want ...

"(?=.*w)(?=.*w)(?=.*w)(?=.*w)" --> true, but it should be false because "water" has only ONE 'w'

anyway, thank you VERY much sir
tom mickey
Greenhorn

Joined: Aug 02, 2008
Posts: 9
Ulf Dittmer wrote:It had better get improved, as it doesn't work :-) That expression doesn't report "Found" for "wat", for example.

Removing the square brackets was a wrong move - you need those. But judging by how you arranged the "$" and "*" characters I don't think you understand yet what those do.


For the OP's original problem (not the second one), can we use the pattern:

I am fairly new to regex myself, but this seems to work without the brackets.
Is there some condition under which this pattern would fail?
Can you explain why not using the square brackets was a wrong move?
Thanks.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

That pattern would mean the string *must* start with "wat" precisely, which I don't believe is the requirement.
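
The pattern being discussed is missing from this copy of the thread, so the sketch below uses a hypothetical stand-in, `^wat`, to show the general point: without square brackets the letters form a literal sequence, while inside brackets they form a set from which each position may pick any member.

```java
import java.util.regex.Pattern;

class LiteralVsClass {
    // Hypothetical helper: true if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Literal sequence: the input must start with exactly "wat".
        System.out.println(found("^wat", "water"));    // true
        System.out.println(found("^wat", "taw"));      // false, same letters, wrong order

        // Character class: any mix of w, a, t, in any order.
        System.out.println(found("^[wat]*$", "taw"));   // true
        System.out.println(found("^[wat]*$", "water")); // false, 'e' and 'r' are not in the set
    }
}
```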
tom mickey
Greenhorn

Joined: Aug 02, 2008
Posts: 9
David Newton wrote:That pattern would mean the string *must* start with "wat" precisely, which I don't believe is the requirement.

Oops, sorry, I didn't read the requirements properly. :p
In that case, can we use this pattern without the brackets, like this:

Thanks.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

Did you try it and see? What will the first asterisk do?
tom mickey
Greenhorn

Joined: Aug 02, 2008
Posts: 9
David Newton wrote:Did you try it and see? What will the first asterisk do?

Yes, although I haven't tested it exhaustively. It finds strings that contain the exact sequence "wati".
The first asterisk obviously is a wildcard to accept any characters.
Is there any other way this can be written, using the square brackets?
Thanks.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

tom mickey wrote:The first asterisk obviously is a wildcard to accept any characters.

Here, by "obviously", I think you mean something other than its traditional meaning, unless you had a typo in your original regex.

If the first "*" accepts any characters, what's the ".*" part do?
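
The distinction behind this question: in a regex, "*" is a quantifier that repeats whatever precedes it, while "." is the element that stands for "any character". A short sketch with made-up patterns (the ones from the posts above are not preserved here):

```java
import java.util.regex.Pattern;

class StarVsDotStar {
    // Hypothetical helper: true if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // "a*" repeats the letter a: zero or more a's between w and t.
        System.out.println(found("^wa*t$", "wt"));    // true
        System.out.println(found("^wa*t$", "waaat")); // true
        System.out.println(found("^wa*t$", "wxt"));   // false, 'x' is not an 'a'

        // ".*" repeats "any character": anything at all between w and t.
        System.out.println(found("^w.*t$", "wxt"));   // true
    }
}
```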
tom mickey
Greenhorn

Joined: Aug 02, 2008
Posts: 9
David Newton wrote:
The first asterisk obviously is a wildcard to accept any characters.

Here, by "obviously", I think you mean something other than its traditional meaning, unless you had a typo in your original regex.

If the first "*" accepts any characters, what's the ".*" part do?

Well, is it wrong? I guess ".*" would be appropriate, with the pattern being:

But * alone seems to work for almost all the strings I have given as input so far.
Edit:
Anyway, my question is: is it possible to write it some other way, using square brackets?
Thanks.
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18916

As I already mentioned, you really need to understand regexes if you are going to use them. Otherwise, you won't be able to modify the regex code that you are using, or fix any bugs in it.

Quiddo Quitch wrote:OMG !!! that is almost what i want ...

"(?=.*w)(?=.*w)(?=.*w)(?=.*w)" --> true, but it should be false because "water" has only ONE 'w'


Yes, I made the assumption that it should only check for a letter (and not count them). Modifying the regex to count a particular letter isn't too difficult -- it is a relatively simple modification. If you read up on regexes, to the point that you understand how the regex that I provided works, you should have no problems modifying it to count particular letters.
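
The modification itself is not shown in the thread. One way it might look, offered only as a sketch: count how often each character occurs in the search string, then require that many occurrences with a repeated group inside the lookahead, for example `(?=(?:.*?w){2})` for two w's. The class and method names are invented for this example.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class LookaheadCount {
    // For each distinct character c occurring n times in letters,
    // emits a lookahead requiring at least n occurrences of c ahead.
    static String buildRegex(String letters) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : letters.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            sb.append("(?=(?:.*?")
              .append(Pattern.quote(String.valueOf(e.getKey())))
              .append("){")
              .append(e.getValue())
              .append("})");
        }
        return sb.toString();
    }

    // True if target contains every character of letters at least as often.
    static boolean matches(String letters, String target) {
        return Pattern.compile(buildRegex(letters)).matcher(target).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(matches("wwww", "water"));  // false, "water" has one 'w'
        System.out.println(matches("retaw", "water")); // true
        System.out.println(matches("ww", "wow"));      // true
    }
}
```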

Henry
 