This week's book giveaway is in the OCMJEA forum.
We're giving away four copies of OCM Java EE 6 Enterprise Architect Exam Guide and have Paul Allen & Joseph Bambara on-line!
See this thread for details.
The moose likes Java in General and the fly likes Regular expression to search for words not beginning with a single quote and with or without spaces Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of OCM Java EE 6 Enterprise Architect Exam Guide this week in the OCMJEA forum!
JavaRanch » Java Forums » Java » Java in General
Bookmark "Regular expression to search for words not beginning with a single quote and with or without spaces" Watch "Regular expression to search for words not beginning with a single quote and with or without spaces" New topic
Author

Regular expression to search for words not beginning with a single quote and with or without spaces

Rama Krishna
Ranch Hand

Joined: Oct 16, 2007
Posts: 110
In a sql where clause constructed without the alias names, the query would be something like

where name = '123' or name is null and name = 'name';


and I am trying to achieve the following,

where aliasObjectName.name = '123' or aliasObjectName.name is null and aliasObjectName.name = 'name';


whereas the following code fails the test case making it

where aliasObjectName.name = '123' or aliasObjectName.name is null and aliasObjectName.name = 'aliasObjectName.name';


in the last replacement which is wrong, I would appreciate if someone could help me with the right regular expression and the syntax reasoning.

I am assuming that (?') in the beginning of the regular expression should avoid looking for those words beginning with ' or single quote characters, i.e., searching for var(?!=) would match var in 'something=var' but not the var in 'var=something'

Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18765
    
  40

I am assuming that (?') in the beginning of the regular expression should avoid looking for those words beginning with ' or single quote characters,


A "(?!')" is a zero length negative look ahead. Basically, it is saying from the point in the match, look forwards, and make sure that the match doesn't happen. And BTW, it is zero-width, meaning it won't be counted toward the match.

i.e., searching for var(?!=) would match var in 'something=var' but not the var in 'var=something'


yes.

[EDIT: Sorry messed up the regex terms... fixed what I said.]

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18765
    
  40

The beginning of this pattern...



Doesn't make much sense. You are basically saying look ahead to make sure that the next character isn't a single quote, but from that point the next character is the first character of the match, which is a word character.

Perhaps you wanted a look behind instead?

Henry
Rama Krishna
Ranch Hand

Joined: Oct 16, 2007
Posts: 110
so it should be more like



suggesting negative look behind without ' single quotation marks, but this fails the test case too. I am missing something more here.>
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19684
    
  20

\\w{1,}\\s?(?==))(?<!')

That basically means:
1 or more word characters (please use \\w+ - that's what the + is for!)
followed by 1 or 0 whitespace characters
followed by a = but not preceded by a '

Try moving that lookbehind to the start. After all, you want the \\w+ to not have a ' in front of it: (?><!')(\\w+)\\s*(?==)
Notice how I use \\s* instead of \\s? - this allows more than 1 space should you have those by accident. Also, since the whitespace is part of your match, make sure to use group(1) to find the \\w+, and include a single space. That way you normalize it as well.>


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
How To Ask Questions How To Answer Questions
Rama Krishna
Ranch Hand

Joined: Oct 16, 2007
Posts: 110
Oops, fails all test cases



Result:

(name='asf', after replacement becomes, (name='asf'
( name = 'name' or name is null), after replacement becomes, ( name = 'name' or name is null)
(name is null), after replacement becomes, (name is null)
name = ' name = ', after replacement becomes, name = ' name = '
( name='' and name like '%name%'), after replacement becomes, ( name='' and name like '%name%')
(name = 'name='), after replacement becomes, (name = 'name=')


The regular expression (?<!')(\\w+)\\s*(?==) fails as well.>
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18765
    
  40

The regular expression (?><!')(\\w+)\\s*(?==) fails as well.>


Unfortunately, something in the posting system is adding ">" signs -- and messing up the regex. So, we can't tell what is the regex that you actually used. This includes the regexes posted by others, so hopefully, you didn't copy it blindly.

Henry
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19684
    
  20

That's because the start should be (?<!'), without the >.

That still misses things like "name is null" or "name like '%name'" but that's because you're simply not looking for "is" or like.
In the end I think that regular expressions are not the way to solve this issue, because it will be hard to find a regular expression for all solutions. I tried "(?<!')(\\w+)(\\s*)(?==|is|like|in)" but that still failed for cases "name = ' name = '" and "name = 'name='" - because the second occurrences in fact match as well. Your real requirement is "replace any field" which can be translated roughly as "replace anything that is not a string literal or operator".
Rama Krishna
Ranch Hand

Joined: Oct 16, 2007
Posts: 110
Henry, I think I knew what Rob was suggesting, when ever the regular expression (?<!') went into the CODE tags, it would become (?><!') there is no problem if you show that >< symbol as < in the CODE tags, I exactly see what I want to show and must be the same when others show too!

Rob Prime wrote:That's because the start should be (?<!'), without the >.


you meant lesser than sign right, meaning negative look behind because I don't see the greater than sign in your above quote.

That still misses things like "name is null" or "name like '%name'" but that's because you're simply not looking for "is" or like.
In the end I think that regular expressions are not the way to solve this issue, because it will be hard to find a regular expression for all solutions. I tried "(?<!')(\\w+)(\\s*)(?==|is|like|in)" but that still failed for cases "name = ' name = '" and "name = 'name='" - because the second occurrences in fact match as well. Your real requirement is "replace any field" which can be translated roughly as "replace anything that is not a string literal or operator".


This regular expression does not match even those that the regular expression (\w+\s*)(?==) matched Are you saying there is no solution to this problem via regular expressions?

I am able to search for words (may or may not have spaces) ending with '=' sign, all I want to add is that such words followed by '=' sign should not have a single quotation marks before (spaces could be allowed)

But the regular expression fails!

Any other solutions to this problem?>
Piet Verdriet
Ranch Hand

Joined: Feb 25, 2006
Posts: 266
Try:

Rama Krishna
Ranch Hand

Joined: Oct 16, 2007
Posts: 110
Great, I never knew you could replace a string so easily that way.

The following string fails the query constructed by = signs, my query contains only = sign.

String s = "where name = '123 =' or name is null and name = ' name=' ;";
s = s.replaceAll("[^\\s']++(?=\\s*+(?:=|\\bis\\b))", "aliasObjectName.$0");

I ended up using \w++(?=\s*=)(?=[^']*(?:'[^']*'[^']*)*$) from another forum:



Result:
(name='asf', after replacement, (name='asf'
( name = 'name' or name is null), after replacement, ( aliasObjectName.name = 'name' or name is null)
(name is null), after replacement, (name is null)
name = ' name = ', after replacement, aliasObjectName.name = ' name = '
( name='' and name like '%name%'), after replacement, ( name='' and name like '%name%')
(name = 'name='), after replacement, (aliasObjectName.name = 'name=')
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Regular expression to search for words not beginning with a single quote and with or without spaces