
Regex question

 
trupti nigam
Ranch Hand
Posts: 626
I need to perform the following check on an entered string:
No special characters other than colon, hyphen, period, and underscore are entered.
To achieve this I use the code below, but the expression does not work for the hyphen (-). What do I need to change to include the hyphen in the ignore list?



In the above pattern, if I include -, it does not work.

Also, how do I achieve the following?

The string should not contain a sequence of multiple consecutive special characters. How do I check this?


thanks
Trupti
 
Henry Wong
author
Marshal
Pie
Posts: 21021
78
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows

Put the hyphen last (first also works). Anywhere else, the regex engine will try to treat it as a range of characters. BTW, you can also escape it with a backslash.
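
To make the range behaviour concrete, here is a minimal Java sketch. It assumes the character class [^a-zA-Z0-9:_.-] that is quoted later in this thread; the class name HyphenInClass, the constant names, and the helper hasDisallowed are made up for illustration and are not the original poster's code.

```java
import java.util.regex.Pattern;

class HyphenInClass {
    // Hyphen placed last: it is a literal, so ':', '_', '.' and '-' are all allowed.
    static final Pattern HYPHEN_LAST = Pattern.compile("[^a-zA-Z0-9:_.-]");
    // Hyphen escaped with a backslash: also a literal, wherever it sits in the class.
    static final Pattern HYPHEN_ESCAPED = Pattern.compile("[^a-zA-Z0-9:\\-_.]");
    // Hyphen unescaped in the middle: ":-_" is read as the range ':' (0x3A) to '_' (0x5F).
    static final Pattern HYPHEN_RANGE = Pattern.compile("[^a-zA-Z0-9:-_.]");

    // Returns true if the input contains at least one character outside the allowed set.
    static boolean hasDisallowed(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasDisallowed(HYPHEN_LAST, "alex-zang"));    // false: '-' is allowed
        System.out.println(hasDisallowed(HYPHEN_ESCAPED, "alex-zang")); // false: '-' is allowed
        System.out.println(hasDisallowed(HYPHEN_RANGE, "alex-zang"));   // true: '-' is not in the range
        System.out.println(hasDisallowed(HYPHEN_RANGE, "alex@zang"));   // false: '@' falls inside ':'..'_'
    }
}
```

The last line shows the less obvious side effect of an accidental range: characters such as '@' are silently accepted.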

Henry
 
Richard Tookey
Bartender
Posts: 1166
17
Java Linux Netbeans IDE
1) Since '-' is a metacharacter inside a character set, it must be the first or last member of the set to have its literal meaning.

2) You can refer to the content of a previous group using "\n", where n is the group number. So, assuming you have only the one capturing group, two identical consecutive characters are detected using "(.)\1".
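
As a rough illustration of point 2, the backreference can be tried out in Java as follows. The class name RepeatedChar and the helper hasRepeatedChar are hypothetical; note that inside a Java string literal the backslash must be doubled.

```java
import java.util.regex.Pattern;

class RepeatedChar {
    // (.) captures any one character; \1 then requires the same character again.
    static final Pattern SAME_TWICE = Pattern.compile("(.)\\1");

    // Returns true if any character is immediately followed by an identical one.
    static boolean hasRepeatedChar(String s) {
        return SAME_TWICE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedChar("alex..zang")); // true: ".."
        System.out.println(hasRepeatedChar("alex.zang"));  // false
        System.out.println(hasRepeatedChar("alex._zang")); // false: '.' and '_' differ
        System.out.println(hasRepeatedChar("aalex"));      // true: letters count too
    }
}
```

Because "." matches letters as well, this pattern on its own flags any doubled character, not only doubled special characters.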
 
trupti nigam
Ranch Hand
Posts: 626


Can you explain 2) further with an example?

thanks
Pradnya
 
Richard Tookey
Bartender
Posts: 1166
17
Java Linux Netbeans IDE

Err ... Assuming you are referring to my second point - I have given an example!
 
trupti nigam
Ranch Hand
Posts: 626


So does that mean I need to do the following?
 
Richard Tookey
Bartender
Posts: 1166
17
Java Linux Netbeans IDE
trupti nigam wrote:

So does that mean I need to do the following?



That will only match "a new line followed by a non-special character followed by two identical characters", which is probably not what you want (though I can't be certain, since you have provided only small fragments of a specification and little or no context). I think you need to spend some time with http://docs.oracle.com/javase/tutorial/essential/regex/ and http://www.regular-expressions.info/tutorial.html.
 
Winston Gutkowski
Bartender
Pie
Posts: 10273
60
Eclipse IDE Hibernate Ubuntu
trupti nigam wrote:...The String should not have sequence of multiple consecutive special chars. How do I check this?

It seems you're getting good advice, so I won't try to repeat it.

What I will say is: don't try to do this all in one regex.

Winston
 
Richard Tookey
Bartender
Posts: 1166
17
Java Linux Netbeans IDE
Winston Gutkowski wrote:
What I will say is: don't try to do this all in one regex.


The devil is in the context, but the OP has not provided one. Reading between the lines (i.e. guessing what the OP wants), this should be simple to do in one regex using an "or", so one regex is probably OK, but we will see when the context is posted.
 
trupti nigam
Ranch Hand
Posts: 626


I am not sure what you mean when you say I have not provided the context. Let me try again.

String name = "alex:zang%^";

Now the regex should detect that, in the above string, the portion after ":" contains consecutive special characters (those matched by [^a-zA-Z0-9:_.-]), and it should reject the string.
But if the string is like "alex:zang" or "AlexZang", it should pass the test.
However, if the second portion of the string (i.e. zang) contains even a single special character, it fails with my previous line of code, like below.



Have I made it clear?
 
Henry Wong
author
Marshal
Pie
Posts: 21021
78
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows


The OP definitely needs to give full details, as a lot of context is missing. First, the character class in the regex is a negated one, which I assume means that if the regex matches, the validation fails: the code is looking for invalid characters. Second, the follow-up request, which looks for consecutive characters, likely means consecutive valid characters, and there is no way to match both at the same time. (Never mind this last point; thinking about it some more, I guess it is possible to merge those two cases with the alternation operator.)

Yes, it is possible to do both at the same time, but you need to change the logic (as you need the regex to find valid patterns). Then you need to make consecutive special characters invalid.

Henry
 
Henry Wong
author
Marshal
Pie
Posts: 21021
78
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
trupti nigam wrote:
I am not sure when you say I have not provided the context. Let me try again.


Not providing enough context means that you haven't explained it correctly. For example, your last subdiscussion with Richard is moot, because what Richard interpreted (which was also what I interpreted) is not what you described next.


trupti nigam wrote:
String name = "alex:zang%^";

Now the regex should detect that, in the above string, the portion after ":" contains consecutive special characters (those matched by [^a-zA-Z0-9:_.-]), and it should reject the string.
But if the string is like "alex:zang" or "AlexZang", it should pass the test.
However, if the second portion of the string (i.e. zang) contains even a single special character, it fails with my previous line of code, like below.



Have I made it clear?


So, you want the string to be considered as having a special character only if two or more of them occur consecutively? Then your regex should be "[^a-zA-Z0-9:_.-]{2,}"


Also, this regex doesn't have the concept of "the second portion of the string after ":"", meaning it will also trigger if two consecutive special characters occur before the ":". The fix for this issue isn't very difficult, but I am not a fan of using something that you don't understand. I really recommend starting again, with a tutorial on regular expressions.
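
A minimal sketch of how that quantified class behaves with Matcher.find(), including the caveat about matches before the colon. The class name ConsecutiveSpecials and the helper hasSpecialRun are illustrative names only, not code from this thread.

```java
import java.util.regex.Pattern;

class ConsecutiveSpecials {
    // Two or more consecutive characters outside letters, digits, ':', '_', '.' and '-'.
    static final Pattern SPECIAL_RUN = Pattern.compile("[^a-zA-Z0-9:_.-]{2,}");

    // Returns true if the input contains a run of at least two special characters.
    static boolean hasSpecialRun(String s) {
        return SPECIAL_RUN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSpecialRun("alex:zang%^")); // true: "%^" after the colon
        System.out.println(hasSpecialRun("alex:zang"));   // false
        System.out.println(hasSpecialRun("alex:za%ng"));  // false: a single '%' is not a run
        System.out.println(hasSpecialRun("al%^ex:zang")); // true: also triggers before the colon
    }
}
```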

Henry
 
trupti nigam
Ranch Hand
Posts: 626


Henry Wong wrote:
So, you want the string to be considered as having a special character only if two or more of them occur consecutively? Then your regex should be "[^a-zA-Z0-9:_.-]{2,}"

Henry


OK, let me rephrase the above.
1. No special characters other than colon, hyphen, period, and underscore are entered.
2. The string does not contain a sequence of multiple consecutive special characters.

So the check should give the following results:

alexzang ==> pass
alex:zang ==> pass
alex%zang ==> fail
alex.zang ==> pass
alex..zang ==> fail
alex.*zang ==> fail
alex::zang ==> fail
alex$%^zang ==> fail
alex._zang ==> fail


This is what I was told.
 
Henry Wong
author
Marshal
Pie
Posts: 21021
78
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
trupti nigam wrote:
OK, let me rephrase the above.
1. No special characters other than colon, hyphen, period, and underscore are entered.
2. The string does not contain a sequence of multiple consecutive special characters.

This is what I was told.



This description is completely different from your previous post, and it goes back to the interpretation that Richard and I originally had. I seriously recommend that you get clarification from your instructor, because what you are saying here and what you said in your previous post do not match.

Henry
 
Richard Tookey
Bartender
Posts: 1166
17
Java Linux Netbeans IDE
Also, you are going to seriously upset people outside of the US and UK (which is most of the world) who use characters other than those in [A-Za-z]. You should most definitely seek clarification on the specification details.
 
Winston Gutkowski
Bartender
Pie
Posts: 10273
60
Eclipse IDE Hibernate Ubuntu
trupti nigam wrote:1. No special characters other than colon, hyphen, period, and underscore are entered.
2. The string does not contain a sequence of multiple consecutive special characters.

Oddly enough I did get it right, and I repeat: don't try to do both those tests in one regex; it will be horrible.

I also strongly suggest that you make your pattern:
Pattern p = Pattern.compile("[^a-zA-Z0-9:_.-]+");
which will find the longest sequence of characters that match (your current one only matches one character), and use a Matcher to run the logic you need.
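
One possible shape for that two-step approach is sketched below. The first pattern is the one suggested above; the second pattern ("[:_.-]{2,}"), the class name TwoStepCheck, and both method names are assumptions added for illustration, not something posted in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepCheck {
    // Step 1 pattern from the post above: a maximal run of disallowed characters.
    static final Pattern DISALLOWED_RUN = Pattern.compile("[^a-zA-Z0-9:_.-]+");
    // Step 2 pattern (an assumption): two or more allowed special characters in a row.
    static final Pattern ADJACENT_ALLOWED = Pattern.compile("[:_.-]{2,}");

    // Collects every run of disallowed characters found in the input.
    static List<String> disallowedRuns(String s) {
        List<String> runs = new ArrayList<>();
        Matcher m = DISALLOWED_RUN.matcher(s);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    // Valid only if there are no disallowed characters and no adjacent allowed specials.
    static boolean isValid(String s) {
        return disallowedRuns(s).isEmpty() && !ADJACENT_ALLOWED.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(disallowedRuns("alex$%^zang")); // [$%^]
        System.out.println(isValid("alex.zang"));          // true
        System.out.println(isValid("alex._zang"));         // false
    }
}
```

Keeping the two checks separate makes it easy to report which rule failed, at the cost of scanning the string twice.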

Winston
 
trupti nigam
Ranch Hand
Posts: 626
I was able to satisfy both conditions using the code below.

 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
trupti nigam wrote:I am able to achieve both the conditions using below code.



Your second one would be simpler as:


Although it doesn't mean the same thing as yours in isolation, you're using these regexes together, and failure occurs if EITHER you have any single character outside the first range OR you have two allowed special characters in a row. In that context, the result is the same. That is, if your original second regex matched on something other than the allowed special characters, your first one would have matched anyway and already indicated failure.

I guess it's a matter of personal opinion which approach is easier to understand overall.
 
Richard Tookey
Bartender
Posts: 1166
17
Java Linux Netbeans IDE
trupti nigam wrote:
OK, let me rephrase the above.
1. No special characters other than colon, hyphen, period, and underscore are entered.
2. The string does not contain a sequence of multiple consecutive special characters.

So the check should give the following results:

alexzang ==> pass
alex:zang ==> pass
alex%zang ==> fail
alex.zang ==> pass
alex..zang ==> fail
alex.*zang ==> fail
alex::zang ==> fail
alex$%^zang ==> fail
alex._zang ==> fail


This is what I was told.


This is straightforward and is met by using Matcher.find() with the regex "[^a-zA-Z:._-]|[:._-]{2}". No need for two separate regexes.
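
A quick way to check that regex against the sample table is sketched here. The class name SingleRegexCheck and the method isValid are hypothetical. One thing to be aware of: as written, the character class has no 0-9, so digits are rejected too; whether that is intended depends on the specification.

```java
import java.util.regex.Pattern;

class SingleRegexCheck {
    // Either one disallowed character, or two allowed special characters in a row.
    static final Pattern INVALID = Pattern.compile("[^a-zA-Z:._-]|[:._-]{2}");

    // The string passes only if the "invalid" pattern is found nowhere in it.
    static boolean isValid(String s) {
        return !INVALID.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "alexzang", "alex:zang", "alex%zang", "alex.zang", "alex..zang",
            "alex.*zang", "alex::zang", "alex$%^zang", "alex._zang"
        };
        for (String s : samples) {
            System.out.println(s + " ==> " + (isValid(s) ? "pass" : "fail"));
        }
        // Digits are not in the allowed class of this regex, so this prints false.
        System.out.println(isValid("alex1"));
    }
}
```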
 