Regular Expression for Detecting Emoticons

Chia-you Chai
Greenhorn

Joined: Jul 22, 2006
Posts: 13
Hello, I have a question on detecting the emoticons in a string.

If I have a sentence "

How can I use a regular expression to extract the emoticons :) and :D in this sentence?

Thank you for helping!
Nicola Garofalo
Ranch Hand

Joined: Apr 10, 2010
Posts: 308
Unfortunately I am not so good with regular expressions, since I don't use them often and can't practice them as much as I would like, but I would like to try to give you an answer because I think regular expressions are a really expressive language.

Then I arrived at this one:



If I am not wrong, this expression recognises any character (zero or more times), followed by a :) or :D sequence, followed by any character (zero or more times).

But, as I told you, I am not so good with it, and you will surely find better answers.


Bye,
Nicola
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19685
    

That's indeed not such a good regex ;)

It says "any character any number of times" followed by a : or ) (because the [] denote a character class, or a choice of characters) or a : or D, followed again by any character any number of times.

The regular expression needed is simple in this case:
- a :. Use the literal.
- a ) or D. Use either a character class or a | for this.

So you'd get ":[)D]" or ":(\\)|D)". Note that in the second example you need to escape the ). The first form is easier, but it can only be extended with other two-character smilies like :(, in which case the regex becomes ":[)D(]". The second form allows any number of characters after the :. With the additions :(, :'( and :wink: the regex would become ":(\\)|D|\\(|'\\(|wink:)". For a given set of smilies, the regex can be generated automatically using Pattern.quote (with the : taken inside the parentheses, so you can also add smilies like ;) that don't start with :).
This creates a regex in which each smiley is wrapped in \Q and \E. Don't be put off by those; \Q means to treat everything until the next \E as literals instead of meta characters. See also the Javadoc of java.util.regex.Pattern.
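
The Pattern.quote approach described above could look roughly like the sketch below. It is not the code originally posted in this thread; the class name SmileyRegex, the helper buildRegex and the list of smilies are made up for illustration. Sorting the longer smilies first is an extra precaution, since alternation tries the options from left to right.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SmileyRegex {

    // Hypothetical helper: turns a list of literal smilies into one alternation.
    // Pattern.quote wraps each smiley in \Q...\E so that ( ) | etc. are taken literally.
    public static String buildRegex(List<String> smilies) {
        return smilies.stream()
                // Longer smilies first, so a short one never shadows a longer one
                // that starts with the same characters.
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "(", ")"));
    }

    public static void main(String[] args) {
        // Example set only; use whatever smilies you need to detect.
        List<String> smilies = Arrays.asList(":)", ":D", ":(", ":'(", ":wink:", ";)");
        System.out.println(buildRegex(smilies));
    }
}
```

Because the whole smiley is quoted, including its first character, smilies such as ;) fit in without any special handling.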


Nicola Garofalo
Ranch Hand

Joined: Apr 10, 2010
Posts: 308
I was sure of it
Thank you for the explanation
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19685
    

You're welcome.
Chia-you Chai
Greenhorn

Joined: Jul 22, 2006
Posts: 13
Thanks for the reply! That's a really nice post for improving my knowledge of regular expressions.
However, when I try to apply this regular expression to detect emoticons, I only get the first emoticon in the sentence.


Is there any approach that lets me get all the emoticons in the sentence?

Thanks again
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19685
    

Changing the if into a while would be a start.
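
A minimal sketch of what that could look like, assuming the ":[)D]" regex from earlier in the thread. The class name EmoticonFinder, the method findAll and the sample sentence are invented for this example and are not the poster's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmoticonFinder {

    // The two-smiley regex discussed earlier: a ':' followed by ')' or 'D'.
    private static final Pattern SMILEY = Pattern.compile(":[)D]");

    // Hypothetical helper that collects every emoticon found in the text.
    public static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = SMILEY.matcher(text);
        // With "if", find() runs once and you only see the first match.
        // Each further call to find() continues after the previous match,
        // so looping until it returns false visits all of them.
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample sentence.
        System.out.println(findAll("Hello :) how are you :D"));
        // prints [:), :D]
    }
}
```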
Chia-you Chai
Greenhorn

Joined: Jul 22, 2006
Posts: 13
Yes! I changed the if to a while and it works! Thanks for helping!
 