
problem in writing Regular Expression

 
Dawood Mohammed
Greenhorn
Posts: 7
Hi,
I have a string that I want to split into substrings separated by spaces. A substring enclosed in double quotes may itself contain spaces and must be kept together.

For example: business political international "sports news" "local news" editorial

The result of parsing the above string with a regular expression must be:
business
political
international
sports news
local news
editorial

please help me in solving this problem, this is pretty urgent.

Thanks and regards,
Dawood.
 
Paul Sturrock
Bartender
Posts: 10336
Eclipse IDE Hibernate Java
What have you tried so far?
 
Dawood Mohammed
Greenhorn
Posts: 7
I have tried using lookahead and lookbehind in regular expressions to solve this problem, but I couldn't work out how lookahead and lookbehind work.
 
Joanne Neal
Rancher
Posts: 3742
16
Originally posted by Dawood Mohammed:
please help me in solving this problem, this is pretty urgent.


Is it as urgent as your other problem? I just need to know which one to concentrate all my efforts on first.

Ease Up
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Pie
Posts: 24208
35
Chrome Eclipse IDE Mac OS X
Neither of your two "regular expressions" problems is really very well-suited to regular expressions. They're both simple state-machine-type parsing problems: simple enough to solve with an ad-hoc lexical analyzer, but each non-trivial enough to justify using something like ANTLR instead if you wanted to.
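
For readers wondering what an "ad-hoc lexical analyzer" might look like for the first problem, here is a minimal sketch. It is not code from this thread; the class name QuoteAwareSplitter and its split method are made up for illustration. It walks the string once and keeps a single piece of state: whether the current character is inside double quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the thread: a two-state scanner.
class QuoteAwareSplitter {

    // Splits on whitespace, treating text inside double quotes as one token.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // state: are we between a pair of quotes?
        boolean hasToken = false; // true once the current token has started

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                // A quote flips the state and is not copied to the token.
                inQuotes = !inQuotes;
                hasToken = true;
            } else if (Character.isWhitespace(c) && !inQuotes) {
                // Whitespace outside quotes ends the current token, if any.
                if (hasToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        // Flush the last token (also covers an unterminated quote).
        if (hasToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String s = "business political international \"sports news\" \"local news\" editorial";
        for (String token : split(s)) {
            System.out.println(token);
        }
    }
}
```

This sketch assumes there are no escaped quotes inside a quoted string, and it treats an unterminated quote as running to the end of the input.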
 
Dawood Mohammed
Greenhorn
Posts: 7
Thanks for your quick response. Actually, both problems are very urgent for me, as they are interrelated in my project.
 
Dawood Mohammed
Greenhorn
Posts: 7
Friedman,
Thanks for your response. How do we use ANTLR or state machines in a Java program? Can you please give an example, as I don't have any idea about ANTLR?

Thanks in advance
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Pie
Posts: 24208
35
Chrome Eclipse IDE Mac OS X
Here is a tutorial on doing a very simple parsing problem with ANTLR.
 
Kevin Davies
Greenhorn
Posts: 14
This is an easy regex problem; don't waste your time trying to write your own lexer:
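
The code posted with this reply is not preserved in the thread as shown here. As a stand-in, here is a minimal sketch of one regex approach that produces the requested output; the pattern, the class name QuotedWordRegex, and the method name tokens are assumptions, not necessarily what was originally posted. The idea is to match either a double-quoted run (capturing its contents) or a run of non-whitespace characters, and to loop with Matcher.find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reconstruction, not the code originally posted.
class QuotedWordRegex {

    // Alternative 1: "..." with the text between the quotes in group 1.
    // Alternative 2: a run of non-whitespace characters (no group set).
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|\\S+");

    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // group(1) is null when the unquoted alternative matched.
            result.add(m.group(1) != null ? m.group(1) : m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "business political international \"sports news\" \"local news\" editorial";
        for (String token : tokens(s)) {
            System.out.println(token);
        }
    }
}
```

Note that no lookahead or lookbehind is needed: because the quoted alternative comes first, a quoted phrase is consumed whole before the plain-word alternative gets a chance at it. The sketch assumes quotes are balanced and always start a token.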
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Pie
Posts: 24208
35
Chrome Eclipse IDE Mac OS X
Nice. I stand corrected.
 
Dawood Mohammed
Greenhorn
Posts: 7
Thanks a lot for your solution, Kevin Davies.

Can you please give me a solution for my other question?

For the example below, can I write a regular expression to extract the key/value pairs?

Example: ((abc def) (ghi jkl) (a ((b c) (d e))) (mno pqr) (a ((abc def))))

In the above example:
abc is the key and def is the value
ghi is the key and jkl is the value
a is the key and ((b c) (d e)) is the value
and so on.


Thanks in advance
 
Alan Moore
Ranch Hand
Posts: 262

If you plug that into Kevin's code, you'll be able to retrieve the keys via group(1) and the values via group(2). This works on the one line of sample data you provided, but YMMV. For instance, if you decide you need another level of parenthesis nesting, the regex will become about twice as ugly. I really think you're better off using a dedicated parser for this one.
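
The regex itself is not preserved in the thread as shown here. The following is only a sketch of a pattern with the properties described (key in group(1), value in group(2), one extra level of nesting in the value); the pattern and the names KeyValueRegex and pairs are assumptions, not the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reconstruction, not the regex originally posted.
class KeyValueRegex {

    // \( (\w+) \s+ ( value ) \)
    // value is either a bare word, or "(" followed by one or more flat
    // "(...)" groups and a closing ")". Deeper nesting is NOT handled.
    private static final Pattern PAIR = Pattern.compile(
            "\\((\\w+)\\s+(\\w+|\\((?:\\s*\\([^()]*\\))+\\s*\\))\\)");

    // Returns {key, value} arrays in the order they appear in the input.
    static List<String[]> pairs(String input) {
        List<String[]> result = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            result.add(new String[] { m.group(1), m.group(2) });
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "((abc def) (ghi jkl) (a ((b c) (d e))) (mno pqr) (a ((abc def))))";
        for (String[] kv : pairs(s)) {
            System.out.println(kv[0] + " => " + kv[1]);
        }
    }
}
```

Supporting one more level would mean replacing [^()]* with another copy of the nested alternative, which is the growth in ugliness mentioned above.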
 
Layne Lund
Ranch Hand
Posts: 3061
Regular expressions are not well suited to parsing nested parentheses. You will need a slightly more complicated parser that can count how many "levels" of nesting there are. If you want to investigate this further on your own, you should google for "context-free language", "context-free grammar", or "pushdown automata", as these are the theoretical foundations for this kind of parsing.
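
As an illustration of the level-counting idea, here is a minimal sketch for the key/value example from earlier in the thread. It is not code from this thread; the names NestedPairParser, topLevelElements, and pairs are made up. A depth counter goes up on "(" and down on ")", and an element ends whenever the counter is back at zero, so any depth of nesting in the value is handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the thread.
class NestedPairParser {

    // Given "( e1 e2 ... )", returns e1, e2, ... where each element is
    // either a bare word or a balanced "(...)" group, kept as raw text.
    static List<String> topLevelElements(String list) {
        String s = list.trim();
        if (s.length() < 2 || s.charAt(0) != '(' || s.charAt(s.length() - 1) != ')') {
            throw new IllegalArgumentException("not a parenthesised list: " + list);
        }
        List<String> elements = new ArrayList<>();
        int depth = 0;  // nesting level relative to the outer list
        int start = -1; // start index of the element being read, or -1

        // Scan only the interior, skipping the outer pair of parentheses.
        for (int i = 1; i < s.length() - 1; i++) {
            char c = s.charAt(i);
            if (c == '(') {
                if (depth == 0 && start < 0) start = i;
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) throw new IllegalArgumentException("unbalanced ')' at " + i);
                if (depth == 0) {
                    elements.add(s.substring(start, i + 1));
                    start = -1;
                }
            } else if (Character.isWhitespace(c)) {
                // Whitespace ends a bare word only at the top level.
                if (depth == 0 && start >= 0) {
                    elements.add(s.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        if (depth != 0) throw new IllegalArgumentException("unbalanced '(' in: " + list);
        if (start >= 0) elements.add(s.substring(start, s.length() - 1));
        return elements;
    }

    // Treats the input as a list of (key value) pairs.
    static List<String[]> pairs(String input) {
        List<String[]> result = new ArrayList<>();
        for (String element : topLevelElements(input)) {
            List<String> kv = topLevelElements(element);
            if (kv.size() != 2) {
                throw new IllegalArgumentException("expected (key value): " + element);
            }
            result.add(new String[] { kv.get(0), kv.get(1) });
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "((abc def) (ghi jkl) (a ((b c) (d e))) (mno pqr) (a ((abc def))))";
        for (String[] kv : pairs(s)) {
            System.out.println(kv[0] + " => " + kv[1]);
        }
    }
}
```

Values are returned as raw text; to descend into a nested value such as ((b c) (d e)), call pairs on it again.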

HTH

Layne
 