| Author |
Regular expression to search for words not beginning with a single quote and with or without spaces
|
Rama Krishna
Ranch Hand
Joined: Oct 16, 2007
Posts: 110
|
|
In a sql where clause constructed without the alias names, the query would be something like
where name = '123' or name is null and name = 'name';
and I am trying to achieve the following,
where aliasObjectName.name = '123' or aliasObjectName.name is null and aliasObjectName.name = 'name';
whereas the following code fails the test case making it
where aliasObjectName.name = '123' or aliasObjectName.name is null and aliasObjectName.name = 'aliasObjectName.name';
in the last replacement which is wrong, I would appreciate if someone could help me with the right regular expression and the syntax reasoning.
I am assuming that (?') in the beginning of the regular expression should avoid looking for those words beginning with ' or single quote characters, i.e., searching for var(?!=) would match var in 'something=var' but not the var in 'var=something'
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16811
|
|
I am assuming that (?') in the beginning of the regular expression should avoid looking for those words beginning with ' or single quote characters,
A "(?!')" is a zero length negative look ahead. Basically, it is saying from the point in the match, look forwards, and make sure that the match doesn't happen. And BTW, it is zero-width, meaning it won't be counted toward the match.
i.e., searching for var(?!=) would match var in 'something=var' but not the var in 'var=something'
yes.
[EDIT: Sorry messed up the regex terms... fixed what I said.]
Henry
|
Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16811
|
|
The beginning of this pattern...
Doesn't make much sense. You are basically saying look ahead to make sure that the next character isn't a single quote, but from that point the next character is the first character of the match, which is a word character.
Perhaps you wanted a look behind instead?
Henry
|
 |
Rama Krishna
Ranch Hand
Joined: Oct 16, 2007
Posts: 110
|
|
so it should be more like
suggesting negative look behind without ' single quotation marks, but this fails the test case too. I am missing something more here.>
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
\\w{1,}\\s?(?==))(?<!')
That basically means:
1 or more word characters (please use \\w+ - that's what the + is for!)
followed by 1 or 0 whitespace characters
followed by a = but not preceded by a '
Try moving that lookbehind to the start. After all, you want the \\w+ to not have a ' in front of it: (?><!')(\\w+)\\s*(?==)
Notice how I use \\s* instead of \\s? - this allows more than 1 space should you have those by accident. Also, since the whitespace is part of your match, make sure to use group(1) to find the \\w+, and include a single space. That way you normalize it as well.>
|
SCJP 1.4 - SCJP 6 - SCWCD 5
How To Ask Questions How To Answer Questions
|
 |
Rama Krishna
Ranch Hand
Joined: Oct 16, 2007
Posts: 110
|
|
Oops, fails all test cases
Result:
(name='asf', after replacement becomes, (name='asf'
( name = 'name' or name is null), after replacement becomes, ( name = 'name' or name is null)
(name is null), after replacement becomes, (name is null)
name = ' name = ', after replacement becomes, name = ' name = '
( name='' and name like '%name%'), after replacement becomes, ( name='' and name like '%name%')
(name = 'name='), after replacement becomes, (name = 'name=')
The regular expression (?<!')(\\w+)\\s*(?==) fails as well.>
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16811
|
|
The regular expression (?><!')(\\w+)\\s*(?==) fails as well.>
Unfortunately, something in the posting system is adding ">" signs -- and messing up the regex. So, we can't tell what is the regex that you actually used. This includes the regexes posted by others, so hopefully, you didn't copy it blindly.
Henry
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
That's because the start should be (?<!'), without the >.
That still misses things like "name is null" or "name like '%name'" but that's because you're simply not looking for "is" or like.
In the end I think that regular expressions are not the way to solve this issue, because it will be hard to find a regular expression for all solutions. I tried "(?<!')(\\w+)(\\s*)(?==|is|like|in)" but that still failed for cases "name = ' name = '" and "name = 'name='" - because the second occurrences in fact match as well. Your real requirement is "replace any field" which can be translated roughly as "replace anything that is not a string literal or operator".
|
 |
Rama Krishna
Ranch Hand
Joined: Oct 16, 2007
Posts: 110
|
|
Henry, I think I knew what Rob was suggesting, when ever the regular expression (?<!') went into the CODE tags, it would become (?><!') there is no problem if you show that >< symbol as < in the CODE tags, I exactly see what I want to show and must be the same when others show too!
Rob Prime wrote:That's because the start should be (?<!'), without the >.
you meant lesser than sign right, meaning negative look behind because I don't see the greater than sign in your above quote.
That still misses things like "name is null" or "name like '%name'" but that's because you're simply not looking for "is" or like.
In the end I think that regular expressions are not the way to solve this issue, because it will be hard to find a regular expression for all solutions. I tried "(?<!')(\\w+)(\\s*)(?==|is|like|in)" but that still failed for cases "name = ' name = '" and "name = 'name='" - because the second occurrences in fact match as well. Your real requirement is "replace any field" which can be translated roughly as "replace anything that is not a string literal or operator".
This regular expression does not match even those that the regular expression (\w+\s*)(?==) matched Are you saying there is no solution to this problem via regular expressions?
I am able to search for words (may or may not have spaces) ending with '=' sign, all I want to add is that such words followed by '=' sign should not have a single quotation marks before (spaces could be allowed)
But the regular expression fails!
Any other solutions to this problem?>
|
 |
Piet Verdriet
Ranch Hand
Joined: Feb 25, 2006
Posts: 266
|
|
Try:
|
 |
Rama Krishna
Ranch Hand
Joined: Oct 16, 2007
Posts: 110
|
|
Great, I never knew you could replace a string so easily that way.
The following string fails the query constructed by = signs, my query contains only = sign.
String s = "where name = '123 =' or name is null and name = ' name=' ;";
s = s.replaceAll("[^\\s']++(?=\\s*+(?:=|\\bis\\b))", "aliasObjectName.$0");
I ended up using \w++(?=\s*=)(?=[^']*(?:'[^']*'[^']*)*$) from another forum:
Result:
(name='asf', after replacement, (name='asf'
( name = 'name' or name is null), after replacement, ( aliasObjectName.name = 'name' or name is null)
(name is null), after replacement, (name is null)
name = ' name = ', after replacement, aliasObjectName.name = ' name = '
( name='' and name like '%name%'), after replacement, ( name='' and name like '%name%')
(name = 'name='), after replacement, (aliasObjectName.name = 'name=')
|
 |
 |
|
|
subject: Regular expression to search for words not beginning with a single quote and with or without spaces
|
|
|