Win a copy of Design for the Mind this week in the Design forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Regular expression to search for words not beginning with a single quote and with or without spaces

 
Rama Krishna
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
In a sql where clause constructed without the alias names, the query would be something like

where name = '123' or name is null and name = 'name';


and I am trying to achieve the following,

where aliasObjectName.name = '123' or aliasObjectName.name is null and aliasObjectName.name = 'name';


whereas the following code fails the test case making it

where aliasObjectName.name = '123' or aliasObjectName.name is null and aliasObjectName.name = 'aliasObjectName.name';


in the last replacement which is wrong, I would appreciate if someone could help me with the right regular expression and the syntax reasoning.

I am assuming that (?') in the beginning of the regular expression should avoid looking for those words beginning with ' or single quote characters, i.e., searching for var(?!=) would match var in 'something=var' but not the var in 'var=something'

 
Henry Wong
author
Marshal
Pie
Posts: 20997
76
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am assuming that (?') in the beginning of the regular expression should avoid looking for those words beginning with ' or single quote characters,


A "(?!')" is a zero length negative look ahead. Basically, it is saying from the point in the match, look forwards, and make sure that the match doesn't happen. And BTW, it is zero-width, meaning it won't be counted toward the match.

i.e., searching for var(?!=) would match var in 'something=var' but not the var in 'var=something'


yes.

[EDIT: Sorry messed up the regex terms... fixed what I said.]

Henry
 
Henry Wong
author
Marshal
Pie
Posts: 20997
76
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The beginning of this pattern...



Doesn't make much sense. You are basically saying look ahead to make sure that the next character isn't a single quote, but from that point the next character is the first character of the match, which is a word character.

Perhaps you wanted a look behind instead?

Henry
 
Rama Krishna
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
so it should be more like



suggesting negative look behind without ' single quotation marks, but this fails the test case too. I am missing something more here.>
 
Rob Spoor
Sheriff
Pie
Posts: 20511
54
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
\\w{1,}\\s?(?==))(?<!')

That basically means:
1 or more word characters (please use \\w+ - that's what the + is for!)
followed by 1 or 0 whitespace characters
followed by a = but not preceded by a '

Try moving that lookbehind to the start. After all, you want the \\w+ to not have a ' in front of it: (?><!')(\\w+)\\s*(?==)
Notice how I use \\s* instead of \\s? - this allows more than 1 space should you have those by accident. Also, since the whitespace is part of your match, make sure to use group(1) to find the \\w+, and include a single space. That way you normalize it as well.>
 
Rama Krishna
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Oops, fails all test cases



Result:

(name='asf', after replacement becomes, (name='asf'
( name = 'name' or name is null), after replacement becomes, ( name = 'name' or name is null)
(name is null), after replacement becomes, (name is null)
name = ' name = ', after replacement becomes, name = ' name = '
( name='' and name like '%name%'), after replacement becomes, ( name='' and name like '%name%')
(name = 'name='), after replacement becomes, (name = 'name=')


The regular expression (?<!')(\\w+)\\s*(?==) fails as well.>
 
Henry Wong
author
Marshal
Pie
Posts: 20997
76
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The regular expression (?><!')(\\w+)\\s*(?==) fails as well.>


Unfortunately, something in the posting system is adding ">" signs -- and messing up the regex. So, we can't tell what is the regex that you actually used. This includes the regexes posted by others, so hopefully, you didn't copy it blindly.

Henry
 
Rob Spoor
Sheriff
Pie
Posts: 20511
54
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
That's because the start should be (?<!'), without the >.

That still misses things like "name is null" or "name like '%name'" but that's because you're simply not looking for "is" or like.
In the end I think that regular expressions are not the way to solve this issue, because it will be hard to find a regular expression for all solutions. I tried "(?<!')(\\w+)(\\s*)(?==|is|like|in)" but that still failed for cases "name = ' name = '" and "name = 'name='" - because the second occurrences in fact match as well. Your real requirement is "replace any field" which can be translated roughly as "replace anything that is not a string literal or operator".
 
Rama Krishna
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Henry, I think I knew what Rob was suggesting, when ever the regular expression (?<!') went into the CODE tags, it would become (?><!') there is no problem if you show that >< symbol as < in the CODE tags, I exactly see what I want to show and must be the same when others show too!

Rob Prime wrote:That's because the start should be (?<!'), without the >.


you meant lesser than sign right, meaning negative look behind because I don't see the greater than sign in your above quote.

That still misses things like "name is null" or "name like '%name'" but that's because you're simply not looking for "is" or like.
In the end I think that regular expressions are not the way to solve this issue, because it will be hard to find a regular expression for all solutions. I tried "(?<!')(\\w+)(\\s*)(?==|is|like|in)" but that still failed for cases "name = ' name = '" and "name = 'name='" - because the second occurrences in fact match as well. Your real requirement is "replace any field" which can be translated roughly as "replace anything that is not a string literal or operator".


This regular expression does not match even those that the regular expression (\w+\s*)(?==) matched Are you saying there is no solution to this problem via regular expressions?

I am able to search for words (may or may not have spaces) ending with '=' sign, all I want to add is that such words followed by '=' sign should not have a single quotation marks before (spaces could be allowed)

But the regular expression fails!

Any other solutions to this problem?>
 
Piet Verdriet
Ranch Hand
Posts: 266
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Try:

 
Rama Krishna
Ranch Hand
Posts: 110
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Great, I never knew you could replace a string so easily that way.

The following string fails the query constructed by = signs, my query contains only = sign.

String s = "where name = '123 =' or name is null and name = ' name=' ;";
s = s.replaceAll("[^\\s']++(?=\\s*+(?:=|\\bis\\b))", "aliasObjectName.$0");

I ended up using \w++(?=\s*=)(?=[^']*(?:'[^']*'[^']*)*$) from another forum:



Result:
(name='asf', after replacement, (name='asf'
( name = 'name' or name is null), after replacement, ( aliasObjectName.name = 'name' or name is null)
(name is null), after replacement, (name is null)
name = ' name = ', after replacement, aliasObjectName.name = ' name = '
( name='' and name like '%name%'), after replacement, ( name='' and name like '%name%')
(name = 'name='), after replacement, (aliasObjectName.name = 'name=')
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic