
treating variable as regex

 
Ankit Chandrawat
Ranch Hand
Posts: 88
Hi,

I have a requirement where I need to check whether a string contains a word repeated three times consecutively (for example, "hello are are are you there" should be treated as "hello are you there"). Is there any way to treat the word "are" as a regex? What I mean is treating the value of a variable as the regex. Or is there a better way to implement this?

TIA,
Ankit
 
Harsha Smith
Ranch Hand
Posts: 287
is this okay?
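
The snippet posted with this reply is not preserved in the thread as shown here. Judging only from the follow-up, which mentions the constant "are are are", it was presumably along these lines. This is a reconstruction under that assumption, not the original code, and the class and method names are made up for illustration:

```java
class FixedTripleDemo {
    // Replaces the hard-coded run "are are are" with a single "are".
    static String collapse(String input) {
        return input.replaceAll("are are are", "are");
    }

    public static void main(String[] args) {
        System.out.println(collapse("hello are are are you there"));
        // prints: hello are you there
    }
}
```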
 
Ankit Chandrawat
Ranch Hand
Posts: 88
Hi Harsha,

Thanks for the response. But the string in your example is a constant. In my case the string will be fetched from a database, so it can be anything. How do I choose the regex ("are are are" in your case) dynamically?
 
Harsha Smith
Ranch Hand
Posts: 287
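Instead of hard-coding the string, build the regex from a variable, like this:

// wordFromDatabase is a placeholder for the value retrieved from the database
String s = wordFromDatabase;

// Start with one copy of the word, then append two more, separated by spaces
StringBuilder sb = new StringBuilder(s);
for (int i = 1; i < 3; i++) {
    sb.append(" ");
    sb.append(s);
}

// regex is now "s s s", i.e. the word three times in a row
String regex = sb.toString();

and s is the replacement.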
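
One thing the approach above does not cover is a database value that contains regex metacharacters such as "$" or ".". A minimal sketch of the same idea with the value quoted via Pattern.quote and Matcher.quoteReplacement follows; the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DynamicTripleDemo {
    // Collapses three consecutive, space-separated copies of `word` into one.
    static String collapseTriple(String input, String word) {
        // Pattern.quote makes the regex engine treat the word literally.
        String quoted = Pattern.quote(word);
        String regex = quoted + " " + quoted + " " + quoted;
        // quoteReplacement stops "$" or "\" in the word from being
        // interpreted as group references in the replacement string.
        return input.replaceAll(regex, Matcher.quoteReplacement(word));
    }

    public static void main(String[] args) {
        System.out.println(collapseTriple("hello are are are you there", "are"));
        // prints: hello are you there
        System.out.println(collapseTriple("cost $5 $5 $5 total", "$5"));
        // prints: cost $5 total
    }
}
```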
 
Rob Spoor
Sheriff
Posts: 20527
You need to use a capturing group, and then check if the contents of that group reappear. Allowing only whitespace between the words:
The (\\w+) part captures one single word. The \\s+ means one or more occurrences of whitespace. The \\1 means the exact same value as the captured word.
Replace the \\s+ with something else to also allow other characters; for instance, [\\s,]+ also allows commas.
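
The snippet these lines describe did not survive in the thread as shown here. A minimal sketch of the technique, assuming a pattern built from the three parts named above (capture, whitespace, backreference) and a replacement of "$1", might look like this; the class and method names are illustrative only:

```java
class CapturedTripleDemo {
    // A word, then the same word twice more, separated by whitespace only.
    static String collapseWhitespaceOnly(String input) {
        return input.replaceAll("(\\w+)\\s+\\1\\s+\\1", "$1");
    }

    // Same idea, but commas are also allowed between the repeated words.
    static String collapseWithCommas(String input) {
        return input.replaceAll("(\\w+)[\\s,]+\\1[\\s,]+\\1", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespaceOnly("Hi how are are are you"));
        // prints: Hi how are you
        System.out.println(collapseWithCommas("Hi How, are, are, are, you"));
        // prints: Hi How, are, you
    }
}
```

Because the word is captured rather than hard-coded, the pattern does not need to know in advance which word repeats.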
 
Ankit Chandrawat
Ranch Hand
Posts: 88
Hi Rob,

Thanks for your suggestion. That works absolutely fine when I have a string like "Hi how are are are you". But it doesn't return anything when I try strings like "Hi How, are, are, are, you". How do I handle such cases?


 
Harsha Smith
Ranch Hand
Posts: 287
Do you want to replace all the strings that occur 3 or more consecutive times or only the target string?
 
Ankit Chandrawat
Ranch Hand
Posts: 88
I want to replace all the strings that occur multiple times (more than once). For example:

i i am am am am here
should be rendered as i am here

i, i, i, am here
should be rendered as i, am here

i am here
should be rendered as i am here
 
Harsha Smith
Ranch Hand
Posts: 287
REGEX
 
Rob Spoor
Sheriff
Posts: 20527
If it's two or more, use a quantifier for that: "(\\w+)(\\s+\\1)+". The whitespace plus repetition is then required one or more times.
If you want the comma inside the match, add it to the \\w+ part: "(\\w+,?)". The ? makes the comma optional.

However, that will give problems with cases like "I, I, I am". The last "I" does not match the starting "I,", so replacing would give you "I, I am". Putting the comma with the whitespace (as I had already mentioned) will solve that; "I, I, I am" will become "I am", and "I, I, I, am" will become "I, am" because the last comma is not part of the match.
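
A minimal sketch of the "comma with the whitespace" variant, combining the "+" quantifier with the [\\s,]+ separator mentioned earlier in the thread. The class and method names are hypothetical:

```java
class RepeatedWordDemo {
    // A word followed by one or more repetitions of itself, where the
    // separator between copies may be whitespace and/or commas.
    static String collapse(String input) {
        return input.replaceAll("(\\w+)([\\s,]+\\1)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("I, I, I am"));
        // prints: I am
        System.out.println(collapse("I, I, I, am"));
        // prints: I, am
        System.out.println(collapse("i i am am am am here"));
        // prints: i am here
    }
}
```

One caveat with this sketch: the backreference is not anchored to a word boundary, so an input such as "the then" would collapse to "then". Appending \\b after \\1 is one way to avoid that.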
 
Winston Gutkowski
Bartender
Posts: 10417
Ankit Chandrawat wrote:I want to replace all the strings that occurs multiple times(more than once).For example:

i i am am am am here
should be rendered as i am here...

Yes, but the problem is your rules aren't complete. Is this only for space-delimited words?
For example, what would you want to do with:
i i amamamam here
?

Winston
 
Harsha Smith
Ranch Hand
Posts: 287
I need to search a string if it has a word repeated thrice consecutively

I want to replace all the strings that occurs multiple times(more than once)


In software development, the specs keep changing
 
Ankit Chandrawat
Ranch Hand
Posts: 88
Space is the delimiter. As for the rules, all it says is:

"  Words repeated multiple times consecutively should be considered as one"

now the definition of words can be:

am
,am
am,
,am,

The character "," is just an example of a special character. So let's just replace "word" with "string",

which turns the rule into:

Strings repeated multiple times consecutively should be considered as one.
 
Rob Spoor
Sheriff
Posts: 20527
Use "(\\S+)" as the first part. Where "\\s" means whitespace, "\\S" means anything but whitespace.
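
Put together with the repetition part from the earlier reply, that gives "(\\S+)(\\s+\\1)+". A minimal sketch, run against the three examples given earlier in the thread (the class and method names are illustrative only):

```java
class RepeatedTokenDemo {
    // Any run of non-whitespace characters, followed by one or more
    // whitespace-separated copies of the same run, becomes one copy.
    static String collapse(String input) {
        return input.replaceAll("(\\S+)(\\s+\\1)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("i i am am am am here"));
        // prints: i am here
        System.out.println(collapse("i, i, i, am here"));
        // prints: i, am here
        System.out.println(collapse("i am here"));
        // prints: i am here
    }
}
```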
 
Ankit Chandrawat
Ranch Hand
Posts: 88
Thanks Rob, that really worked. Just out of curiosity, is it possible to collapse the string to a single occurrence only if it appears, say, 4 times? That is, to put a limit on the multiplicity of the string.
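
On one reading of this question (collapse a group of exactly n consecutive copies), a bounded quantifier such as {3} in place of "+" would do it. This is a sketch under that assumption, not an answer given in the thread, and the class and method names are hypothetical:

```java
class BoundedRepeatDemo {
    // Collapses each group of exactly n consecutive copies of a token
    // into one copy. Assumes n >= 2.
    static String collapseRunsOf(String input, int n) {
        // The first copy is captured; {n-1} requires that many further copies.
        String regex = "(\\S+)(?:\\s+\\1){" + (n - 1) + "}";
        return input.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRunsOf("a a a a b", 4));
        // prints: a b
        System.out.println(collapseRunsOf("a a a b", 4));
        // prints: a a a b
    }
}
```

Note that a longer run is consumed in groups of n, so "a a a a a b" with n = 4 becomes "a a b": the first four copies collapse to one and the fifth is left alone.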
 
Harsha Smith
Ranch Hand
Posts: 287
Can you tell me if this works?
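
The code posted with this reply is not preserved in the thread as shown here. The later replies suggest it was ordinary Java with only a simple regex, so the following is a guess at that style and not the original: split on whitespace, then skip any token equal to the one before it. All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LoopDedupDemo {
    // Keeps a token only when it differs from the token just before it.
    static String collapse(String input) {
        // The only regex involved: split on runs of whitespace.
        String[] tokens = input.trim().split("\\s+");
        List<String> kept = new ArrayList<>();
        String previous = null;
        for (String token : tokens) {
            if (!token.equals(previous)) {
                kept.add(token);
            }
            previous = token;
        }
        // Side effect: whitespace between tokens is normalized to one space.
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(collapse("i i am am am am here"));
        // prints: i am here
        System.out.println(collapse("i, i, i, am here"));
        // prints: i, am here
    }
}
```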
 
Ankit Chandrawat
Ranch Hand
Posts: 88
Ya Harsha, this one worked really well. Thanks a lot.
 
Winston Gutkowski
Bartender
Posts: 10417
Ankit Chandrawat wrote:Ya Harsha, this one worked really well. Thanks a lot.

A good lesson. Regexes are great, but not for everything. Sometimes the simplest is the best.

Winston
 
Harsha Smith
Ranch Hand
Posts: 287
What is more understandable: a complex regex, or regular Java code with a simple regex? Which is easier to maintain?
 
Ankit Chandrawat
Ranch Hand
Posts: 88
I have always been a regular Java guy. Complicated regexes make me somewhat uncomfortable. But the great thing is that good solutions are available in both forms.
 