treating variable as regex

Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
Hi,

I have a requirement to search a string for a word repeated three times consecutively (for example, "hello are are are you there" should be treated as "hello are you there"). Is there any way I can treat the word "are" as a regex? What I mean is treating a variable as a regex. Or is there a better way to implement this?

TIA,
Ankit
Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
is this okay?
Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
Hi Harsha,

Thanks for the response. But the string in your example is a constant. In my case the String will be fetched from a database, so it can be anything. How do I choose the regex ("are are are" in your case) dynamically?
Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
Instead of hard-coding the string, use a variable, like:

String s = valueFromDatabase; // placeholder for the value retrieved from the database

// Build "s s s": the value repeated three times, separated by single spaces
StringBuilder sb = new StringBuilder(s);
for (int i = 1; i < 3; i++) {
    sb.append(" ");
    sb.append(s);
}

String regex = sb.toString();

and s is the replacement.
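One caveat with building the regex straight from the variable: if the value contains regex metacharacters (".", "+", "(" and so on), they are interpreted as pattern syntax. A minimal sketch of the same idea using Pattern.quote and Matcher.quoteReplacement, which treat the value literally; the class and method names here are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralRepeatDemo {
    // Replaces "word word word" (three copies separated by single spaces)
    // with a single copy of word. The word is matched literally.
    static String collapseTriple(String input, String word) {
        String quoted = Pattern.quote(word);              // escape regex metacharacters
        String regex = quoted + " " + quoted + " " + quoted;
        // quoteReplacement keeps '$' and '\' in word from being read as group references
        return input.replaceAll(regex, Matcher.quoteReplacement(word));
    }

    public static void main(String[] args) {
        // prints "hello are you there"
        System.out.println(collapseTriple("hello are are are you there", "are"));
    }
}
```

Note that this sketch does not check word boundaries, so the three copies may also match inside longer words.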
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19696

You need to use a capturing group, and then check if the contents of that group reappear. Allowing only whitespace between the words:
The (\\w+) part captures one single word. The \\s+ means one or more occurrences of whitespace. The \\1 means the exact same value as the captured word.
Replace the \\s+ with something else to also allow other characters; for instance, [\\s,]+ also allows commas.
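The code that went with this post is not preserved in this copy of the thread. A sketch consistent with the description, for a word that appears three times in a row, might look like the following. The \\b word boundaries and the class and method names are additions for illustration, not part of the original post:

```java
class BackreferenceDemo {
    // (\\w+) captures a word, \\s+ is the separator, \\1 is the captured word again.
    // \\b keeps the match from starting or ending in the middle of a longer word.
    static String collapseTripleWhitespace(String input) {
        return input.replaceAll("\\b(\\w+)\\s+\\1\\s+\\1\\b", "$1");
    }

    // Same pattern, but commas are allowed in the separator as well.
    static String collapseTripleWithCommas(String input) {
        return input.replaceAll("\\b(\\w+)[\\s,]+\\1[\\s,]+\\1\\b", "$1");
    }

    public static void main(String[] args) {
        // prints "hello are you there"
        System.out.println(collapseTripleWhitespace("hello are are are you there"));
        // prints "Hi How, are, you"
        System.out.println(collapseTripleWithCommas("Hi How, are, are, are, you"));
    }
}
```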


Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
Hi Rob,

Thanks for your suggestion; it works absolutely fine when I have a string like "Hi how are are are you". But it doesn't return anything when I try strings like "Hi How, are, are, are, you". How do I handle such cases?


Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
Do you want to replace all the strings that occur 3 or more consecutive times or only the target string?
Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
I want to replace all the strings that occur multiple times (more than once). For example:

i i am am am am here
should be rendered as i am here

i, i, i, am here
should be rendered as i, am here

i am here
should be rendered as i am here
Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
REGEX
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19696

If it's two or more, you use a quantifier for that: "(\\w+)(\\s+\\1)+". The whitespace + repetition is then required one or more times.
If you want the comma inside the match, add it to the \\w+: "(\\w+,?)". The ? makes the comma optional.

However, that will give problems with cases like "I, I, I am". The last "I" does not match the starting "I,", so replacing would give you "I, I am". Putting the comma with the whitespace (as I had already mentioned) will solve that; "I, I, I am" will become "I am", and "I, I, I, am" will become "I, am" because the last comma is not part of the match.
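The "comma with the whitespace" variant can be sketched as below. The \\b boundaries, the non-capturing group, and the names are assumptions added for the example; without the boundaries, input such as "this is" could match on the "is" inside "this".

```java
class RepeatedWordDemo {
    // One word, followed one or more times by (separator + the same word).
    // The separator is any run of whitespace and commas.
    static String collapse(String input) {
        return input.replaceAll("\\b(\\w+)(?:[\\s,]+\\1\\b)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("I, I, I am"));           // prints "I am"
        System.out.println(collapse("I, I, I, am"));          // prints "I, am"
        System.out.println(collapse("i i am am am am here")); // prints "i am here"
    }
}
```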
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7801

Ankit Chandrawat wrote: I want to replace all the strings that occurs multiple times (more than once). For example:

i i am am am am here
should be rendered as i am here...

Yes, but the problem is your rules aren't complete. Is this only for space-delimited words?
For example, what would you want to do with:
i i amamamam here
?

Winston


Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
I need to search a string if it has a word repeated thrice consecutively

I want to replace all the strings that occurs multiple times(more than once)


In software development, the specs keep changing
Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
Space is the delimiter. As for the rules, all the spec says is:

"Words repeated multiple times consecutively should be considered as one"

Now, the definition of a word can be:

am
,am
am,
,am,

The character "," is just an example of a special character. So let's just replace "word" with "string",

which converts the rule to:

Strings repeated multiple times consecutively should be considered as one.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19696

Use "(\\S+)" as the first part. Where "\\s" means whitespace, "\\S" means anything but whitespace.
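Put together with the earlier quantifier, this might look like the sketch below. Because \\b does not help when the token can start or end with punctuation, the example uses lookarounds to anchor the match at whitespace or the ends of the string; those lookarounds and the names are assumptions added here, not something stated in the thread.

```java
class RepeatedTokenDemo {
    // (\\S+) captures any run of non-whitespace; (?:\\s+\\1)+ requires it to repeat.
    // (?<!\\S) and (?!\\S) make sure the match covers whole tokens only.
    static String collapse(String input) {
        return input.replaceAll("(?<!\\S)(\\S+)(?:\\s+\\1)+(?!\\S)", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("i i am am am am here")); // prints "i am here"
        System.out.println(collapse(",am ,am ,am here"));     // prints ",am here"
        System.out.println(collapse("am, am, am, here"));     // prints "am, here"
    }
}
```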
Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
Thanks Rob, that really worked. Just out of curiosity, is it possible to consider the String only once if it appears, say, 4 times? Here we would be putting a limit on the multiplicity of the String.
Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
Can you tell me if this works?
Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
Ya Harsha, this one worked really well. Thanks a lot.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7801

Ankit Chandrawat wrote:Ya Harsha, this one worked really well. Thanks a lot.

A good lesson. Regexes are great, but not for everything. Sometimes the simplest is the best.

Winston
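The code Harsha posted is not preserved in this copy of the thread, so the following is not a reconstruction of it. It is only a sketch of what a plain-Java approach with a simple regex could look like, assuming tokens are separated by whitespace and that normalizing the separators to single spaces is acceptable:

```java
import java.util.ArrayList;
import java.util.List;

class SplitDedupDemo {
    // Splits on whitespace and keeps a token only if it differs from the
    // previously kept token, so consecutive repeats collapse to one.
    static String collapse(String input) {
        List<String> kept = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (kept.isEmpty() || !kept.get(kept.size() - 1).equals(token)) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(collapse("i i am am am am here")); // prints "i am here"
        System.out.println(collapse("i, i, i, am here"));     // prints "i, am here"
    }
}
```

The only regex left is the "\\s+" used for splitting; the comparison itself is ordinary String.equals.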
Harsha Smith
Ranch Hand

Joined: Jul 18, 2011
Posts: 287
What is more understandable: a complex regex, or regular Java code with a simple regex? Which is easier to maintain?
Ankit Chandrawat
Ranch Hand

Joined: Jan 03, 2008
Posts: 87
I have always been a regular Java guy. Complicated regexes make me sort of uncomfortable. But the great thing is that we have great solutions available in both forms.
 