Regular expression help with unicode

 
Sri Ponnapalli
Greenhorn
Posts: 5
Hi,

I'm looking for a regular expression that matches a String consisting only of a mixture of Unicode letters and numerical digits, with at least one of each.

The closest I've got is this Java String: "^[\\p{L}\\p{N}]+$". The problem is that it still matches when the string is entirely digits or entirely letters, so strings like "1234567" or "abcdef" still match.

Does anyone know how to fix this?

Thanks,
Sri
 
Jeff Verdegan
Bartender
Posts: 6109
This seems to work. There's probably a cleaner way with a single regex, and, honestly, it would be clearer just to use 2 different regexes--one that says "contains at least one letter" and one that says "contains at least one digit", and then test that it matches regex1 AND matches regex2.



Breaking it apart:


The idea is that if there's at least one digit and at least one letter, and all characters are either digit or letter, then somewhere we must have either a digit followed by a letter or a letter followed by a digit. That pair could be at the beginning, the middle, or the end, so the "either/or" character classes on each side of it need the zero-or-more quantifier. That is, writing L for a letter and N for a digit: zero characters and then LN or NL, or some Ls and/or Ns followed by LN or NL, and then zero or more Ls and/or Ns.
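
The code that originally accompanied this post is not preserved here, so what follows is only a sketch rebuilt from the description above, not necessarily the exact pattern that was posted. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical class name; the pattern is reconstructed from the prose
// description, not copied from the original post.
class MixedLettersDigits {
    // Zero or more letters/digits, then a letter-digit or digit-letter pair,
    // then zero or more letters/digits.
    static final Pattern MIXED = Pattern.compile(
        "[\\p{L}\\p{N}]*(?:\\p{L}\\p{N}|\\p{N}\\p{L})[\\p{L}\\p{N}]*");

    static boolean isMixed(String s) {
        // matches() tests the whole input, so no ^ or $ is needed.
        return MIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMixed("abc123"));   // true
        System.out.println(isMixed("1234567"));  // false: digits only
        System.out.println(isMixed("abcdef"));   // false: letters only
        System.out.println(isMixed("abc 123"));  // false: space is neither
    }
}
```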

Also note that if you're using String.matches() or Matcher.matches(), you don't need the ^ and $, since matches() attempts to match against the entire input anyway.
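
To illustrate the point about anchors, here is a small sketch using the character class from the original question. The class name and the sample input are invented for the demo. The difference only shows up with find(), which looks for a match anywhere in the input.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing matches() and find().
class AnchorDemo {
    static final Pattern BARE = Pattern.compile("[\\p{L}\\p{N}]+");
    static final Pattern ANCHORED = Pattern.compile("^[\\p{L}\\p{N}]+$");

    public static void main(String[] args) {
        String input = "abc123!";
        // matches() must consume the entire input, so the '!' makes it fail
        // even though the pattern has no anchors.
        System.out.println(input.matches("[\\p{L}\\p{N}]+"));  // false
        System.out.println(BARE.matcher(input).matches());     // false
        // find() is satisfied by the "abc123" prefix unless anchors are added.
        System.out.println(BARE.matcher(input).find());        // true
        System.out.println(ANCHORED.matcher(input).find());    // false
    }
}
```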

 
Sri Ponnapalli
Greenhorn
Posts: 5
Great, this works like magic!! Thank you very much Jeff, for a very quick response!

Sri
 
Jeff Verdegan
Bartender
Posts: 6109
You're very welcome!

Be warned, though, I expect it runs pretty slowly. If you're doing a large number of tests in a row, or testing against a very large input, it might be a problem. In that case, you'll have to either find somebody who's better at regex than I am, or just break it into two separate tests (or perhaps 3) like I suggested.
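
For what it's worth, the separate-tests approach might look roughly like the sketch below. It uses three checks: only letters and digits, at least one letter, at least one digit. The class, field, and method names are invented for illustration, and I haven't measured whether it is actually faster.

```java
import java.util.regex.Pattern;

// Hypothetical class showing the check split into three simple tests.
class SplitChecks {
    static final Pattern ONLY_LETTERS_DIGITS = Pattern.compile("[\\p{L}\\p{N}]+");
    static final Pattern HAS_LETTER = Pattern.compile("\\p{L}");
    static final Pattern HAS_DIGIT = Pattern.compile("\\p{N}");

    static boolean isMixed(String s) {
        return ONLY_LETTERS_DIGITS.matcher(s).matches()  // whole input is letters/digits
            && HAS_LETTER.matcher(s).find()              // at least one letter somewhere
            && HAS_DIGIT.matcher(s).find();              // at least one digit somewhere
    }

    public static void main(String[] args) {
        System.out.println(isMixed("abc123"));   // true
        System.out.println(isMixed("1234567"));  // false
        System.out.println(isMixed("abcdef"));   // false
    }
}
```

Each pattern is compiled once and reused, which matters more than the pattern shape if the check runs many times in a row.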
 
Sri Ponnapalli
Greenhorn
Posts: 5
Got it. It is not a very high-volume scenario, so this should be good. One other question: since this solution is internationalized, I'm trying to test with Unicode characters. Do you know any site where I can get a good Unicode character set to test this with?

Thanks,
Sri
 
Jeff Verdegan
Bartender
Posts: 6109

Sri Ponnapalli wrote: Do you know any site where I can get good unicode character set to test this with?



There's a pretty decent sampling at http://en.wikipedia.org/wiki/List_of_Unicode_characters.

Oh, and welcome to the Ranch!
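
As a starting point for that kind of testing, here is a small sketch with a handful of non-ASCII samples. The sample strings are my own picks rather than anything from an official test set, and the class name is hypothetical. Unicode escapes are used so the source compiles regardless of file encoding. The pattern is the "only letters and digits" class plus separate letter and digit checks.

```java
import java.util.regex.Pattern;

// Hypothetical test harness with hand-picked Unicode samples.
class UnicodeSamples {
    static final Pattern ALLOWED = Pattern.compile("[\\p{L}\\p{N}]+");
    static final Pattern LETTER = Pattern.compile("\\p{L}");
    static final Pattern DIGIT = Pattern.compile("\\p{N}");

    static boolean isMixed(String s) {
        return ALLOWED.matcher(s).matches()
            && LETTER.matcher(s).find()
            && DIGIT.matcher(s).find();
    }

    static final String[] SHOULD_MATCH = {
        "\u00E9t\u00E97",        // Latin letters with accents, ASCII digit
        "\u65E5\u672C\uFF11",    // CJK ideographs, fullwidth digit one
        "abc\u0663",             // ASCII letters, Arabic-Indic digit three
    };

    static final String[] SHOULD_FAIL = {
        "\u65E5\u672C",          // letters only
        "\u0661\u0662",          // digits only
        "e\u03011",              // combining acute accent is a mark, not a letter
    };

    public static void main(String[] args) {
        for (String s : SHOULD_MATCH) System.out.println(s + " -> " + isMixed(s));
        for (String s : SHOULD_FAIL) System.out.println(s + " -> " + isMixed(s));
    }
}
```

The last failing sample is worth a look: text stored in decomposed form carries combining marks (category M), which neither \p{L} nor \p{N} covers, so whether such input should pass is a decision to make up front.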
 
Bartender
Posts: 5167
Sri, please BeForthrightWhenCrossPostingToOtherSites
http://www.java-forums.org/new-java/54512-help-unicode-regular-expression.html
https://forums.oracle.com/forums/thread.jspa?threadID=2337867
 
Sri Ponnapalli
Greenhorn
Posts: 5
Sorry Darryl, I wasn't aware of the cross-posting rules. Will certainly play by the rules in future.

Thanks again Jeff, your answer DID help me a lot! (I didn't hear on any of the other forums)

Sri
 
Jeff Verdegan
Bartender
Posts: 6109

Sri Ponnapalli wrote: Sorry Darryl, I wasn't aware of the cross-posting rules. Will certainly play by the rules in future.

Thanks again Jeff, your answer DID help me a lot! (I didn't hear on any of the other forums)

Sri



You're welcome! Glad I could help!

Please go back to the other forums and let them know that it's been answered (so people don't waste their time) and provide a link to this one in case anybody is interested in the solution, or wants to improve on it.
 
Sri Ponnapalli
Greenhorn
Posts: 5
I checked, and Darryl had already done it.

Sri