Regex matcher question

 
Ranch Hand
Posts: 51
Hello, I have an issue where I am almost there but not quite. I have the code below


which outputs
john
do

but I want it to handle any and all special characters (such as !@#$%^&*()_+), so I'd like to see the output below from the original string.

"john"
"do"

How would I do this?
 
Bartender
Posts: 689
Your code is matching against "[\w]", which is a character class containing word characters. It's equivalent to [a-zA-Z_0-9].

When you say you want to match special characters too, I assume you mean all characters except whitespace. If so, this can be done with the following character class: "[\S]".

Note that is a capital letter 'S'.

Edit to add: that character class is equivalent to [^\s] (lower case 's'), which is arguably clearer, if any regular expression can ever be said to be clear.

The Oracle tutorial will explain more: http://docs.oracle.com/javase/tutorial/essential/regex/index.html
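
To make the difference between the two character classes concrete, here is a minimal sketch. It is not the original poster's code (that code is not shown in the thread); the class name NonWhitespaceTokens, the helper findAll, and the sample input are assumptions made for illustration. It uses only java.util.regex.Pattern and Matcher, applying each class with a + quantifier so that whole runs of characters are returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; none of these names come from the thread.
final class NonWhitespaceTokens {

    // Returns every non-overlapping match of the given regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed sample input: two quoted words separated by one space.
        String input = "\"john\" \"do\"";

        // \w+ matches runs of word characters only, so the quotes are dropped.
        System.out.println(findAll("\\w+", input)); // prints [john, do]

        // \S+ matches runs of anything except whitespace, so the quotes survive.
        System.out.println(findAll("\\S+", input)); // prints ["john", "do"]
    }
}
```

Note that in a Java string literal the backslash itself has to be escaped, which is why the patterns are written as "\\w+" and "\\S+".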
 
Ranch Hand
Posts: 574
If you really want to grok regular expressions, get Jeffrey Friedl's book Mastering Regular Expressions. It even has a chapter on Java's regex engine.
 
author
Posts: 23951

First of all, it looks like you want to deal with nested quotes. In my opinion, regular expressions are not really good for that. Handling one level of nesting can get complicated, handling two levels can get really complicated, and handling an unlimited number of levels is likely impossible with a regex.

Second, how should the quotes be interpreted? For example, is the second quote a nested quote, or does it close the first quote? How about the third quote, or the fourth? Before you can create a regex for this, you probably need to define the format more precisely.

Henry
 
steve kelly
Ranch Hand
Posts: 51
Maybe regex is not what I want. Basically, my program can receive a string such as " "John" "Doe"" or "@John Doe#". The only thing I know is that a blank space will always separate the two parts.
I want the two parts split into an array, so the examples above would look like this:
"John"
"Doe"

and...

@John
Doe#
 
Mike. J. Thompson
Bartender
Posts: 689
If you can guarantee that format then you can do the following:

1) Split the string on the space, resulting in an array containing the two parts.

2) Remove the first character from the first String.

3) Remove the last character from the second String.

You will need to validate that the string is in the correct form, though, for example by ensuring that the array contains exactly two strings and that the two characters you're removing are quotes.
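
The three steps and the validation above could be sketched roughly as follows. This is only an illustration under stated assumptions: the received string starts and ends with a double quote, and exactly one space separates the two parts. The class name SplitQuotedPair and the method splitPair are hypothetical and do not come from the thread.

```java
import java.util.Arrays;

// Hypothetical helper; the names are illustrative only.
final class SplitQuotedPair {

    // Assumes the form: opening quote, first part, one space, second part, closing quote.
    static String[] splitPair(String input) {
        // Step 1: split on the space.
        String[] parts = input.split(" ");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected exactly two parts: " + input);
        }
        String first = parts[0];
        String second = parts[1];

        // Validate that the characters about to be removed really are quotes.
        if (first.isEmpty() || first.charAt(0) != '"') {
            throw new IllegalArgumentException("First part must start with a quote: " + input);
        }
        if (second.isEmpty() || second.charAt(second.length() - 1) != '"') {
            throw new IllegalArgumentException("Second part must end with a quote: " + input);
        }

        // Steps 2 and 3: drop the leading quote and the trailing quote.
        return new String[] {
            first.substring(1),
            second.substring(0, second.length() - 1)
        };
    }

    public static void main(String[] args) {
        // The outer quotes are part of the received string in both samples.
        System.out.println(Arrays.toString(splitPair("\"@John Doe#\"")));        // prints [@John, Doe#]
        System.out.println(Arrays.toString(splitPair("\"\"John\" \"Doe\"\""))); // prints ["John", "Doe"]
    }
}
```

If the outer quotes are not actually part of the received string, steps 2 and 3 would strip real data, so it is worth confirming the exact input format before relying on this approach.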
 