Java Pattern tokenize

 
I have a String with the pattern KEYWORD ARG1="x" ARG2="test test" that I need to tokenize. I tried using Pattern. It returns the first 2 groups and then throws an exception. Any help is appreciated. Thanks.

 


I get this output:

true
CLICK
name="a"
Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 3
at java.util.regex.Matcher.group(Matcher.java:481)
at com.fepoc.fepdirect.shakeout.main.util.CommandParser.parseCommand(CommandParser.java:43)
at com.fepoc.fepdirect.shakeout.main.util.CommandParser.main(CommandParser.java:32)

 
Your pattern only has two capturing groups. You can't capture something that isn't defined in your regex pattern.

Or, to look at it another way: the group number is determined by where the group appears in the pattern, not by the order in which text is matched. There could be a thousand ARGs in your string, and group 2 will only contain the last one.
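To illustrate the point about group numbering, here is a minimal sketch. The pattern in the original post is not shown, so the regex below is an assumption: one keyword group followed by a repeated argument group. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Assumed pattern (not the original poster's): group 1 is the keyword,
    // group 2 is a repeated group that matches each argument in turn.
    static final Pattern P =
            Pattern.compile("([A-Z]+)(\\s+\\w+=\"[^\"]*\")*");

    // Returns {group 1, group 2}, or null if the input does not match.
    static String[] groups(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {
            return null;
        }
        // groupCount() is fixed by the pattern (2 here), no matter how many
        // arguments the input contains, so group(3) would throw.
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("CLICK name=\"a\" value=\"test test\"");
        System.out.println(g[0]); // CLICK
        System.out.println(g[1]); // only the last repetition survives
    }
}
```

With this input, group 2 holds only the final argument (including its leading whitespace); the earlier `name="a"` repetition is overwritten.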

Henry
 
Srikanth Madasu
Then I suppose there is no way I can do it using regex.

I think I can come up with my own parser: first split the string on the first space, and then split the rest on a quote followed by a space.
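A rough sketch of that split-based idea, assuming exactly the format from the one example given (a keyword, then NAME="value" pairs separated by whitespace). The class and method names are hypothetical. The second split uses a lookbehind so the closing quote stays attached to its value, which is a small variation on splitting on the quote itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitParserDemo {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        // Split on the first space only: keyword, then everything else.
        String[] head = input.trim().split(" ", 2);
        tokens.add(head[0]);
        if (head.length == 2) {
            // Split on whitespace that directly follows a double quote.
            // Assumption: a value never starts with a space (e.g. =" x"),
            // because that would be split in the wrong place.
            tokens.addAll(Arrays.asList(head[1].split("(?<=\")\\s+")));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("KEYWORD ARG1=\"x\" ARG2=\"test test\""));
    }
}
```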

Can you think of any other elegant way to do it?

And thanks for your time!
 
I wouldn't advise you to limit yourself to "elegant" solutions. Really, you're looking for something that works.

And make sure you have the correct specs for the strings you're trying to parse. We only have one example of a valid string, which isn't nearly enough to start writing code for.

For example: the regex which you tried in your original post restricts the "KEYWORD" part to being upper-case Latin letters only. Is that really the spec? You can't have "TOTAL2014INCOME" as a keyword, for example? Or "TotalIncome"?

Same goes for the other parts. In other specs (like XML, for example) where you have attribute/value pairs and the value is delimited by quotes, there's often a feature where the value can contain a quote itself, so there's an escape character (or some other tool) to prevent that quote from being used as a delimiter. Does your input not have something like that?

And does it matter if there are extra spaces here and there? Like two spaces (or a tab character) between the keyword and the first attribute, or between the attribute name and the "=" character which follows it?

Make sure you have a good understanding of the spec before you start writing code. If you look at the code for an XML parser, for example, you'll be looking at something which could never be described as "elegant".
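To show how much the answers to those questions change the code, here is a sketch of an argument pattern under a purely hypothetical spec: optional whitespace around "=", and a backslash as the escape character inside quoted values. None of this is confirmed by the original poster; the class name, method name and escape convention are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LenientArgsDemo {
    // Hypothetical spec: NAME, optional whitespace, "=", optional whitespace,
    // then a quoted value in which \" (or any backslash escape) is allowed.
    static final Pattern ARG =
            Pattern.compile("(\\w+)\\s*=\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    static Map<String, String> parseArgs(String input) {
        Map<String, String> args = new LinkedHashMap<>();
        Matcher m = ARG.matcher(input);
        while (m.find()) {
            // The value is stored raw; unescaping is left out of this sketch.
            args.put(m.group(1), m.group(2));
        }
        return args;
    }

    public static void main(String[] args) {
        System.out.println(
                parseArgs("TotalIncome2014  label = \"say \\\"hi\\\"\"\tn=\"1\""));
    }
}
```

Compare this with a pattern that assumes single spaces and no escapes: each relaxed rule adds noise to the regex, which is why the spec needs to be settled first.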
 
Henry Wong

Srikanth Madasu wrote:
Can you think of any other elegant way to do it?



Well, one option, since you are already using the find() method, is to use a loop and only get one term at a time. So, instead of this ...


... which gets the first and last term. You can do this...


... which gets only one term.... then place it in a loop to get one term at a time.

Of course, since you don't want the whitespace, and you are actually relying on the find() method to skip over it, perhaps you need to modify it slightly to...


And... in a loop, group 1 will only be valid for the first iteration, while group 2 will be valid for the rest of the iterations, as long as the find() method returns true.
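The patterns from this post did not survive, so the following is only a sketch of the loop approach under an assumed pattern: group 1 matches the keyword at the start of the input, group 2 matches one NAME="value" term. The `^` anchor is my addition, so that an upper-case argument name such as ARG1 is not mistaken for a keyword. Class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopDemo {
    // Assumed pattern: keyword at the start (group 1) OR one argument (group 2).
    static final Pattern TOKEN =
            Pattern.compile("^([A-Z]+)|(\\w+=\"[^\"]*\")");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // Each find() call matches one term and skips the whitespace between terms.
        while (m.find()) {
            // Group 1 is non-null only on the first iteration (the keyword);
            // on later iterations group 2 holds the current argument.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [CLICK, name="a", value="test test"]
        System.out.println(tokenize("CLICK name=\"a\" value=\"test test\""));
    }
}
```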

Henry
 