regular expression

 
Greenhorn
Posts: 7
I want a regular expression to split a String on whitespace except not when the whitespace is within brackets.

So for example "The cat [in the] hat".split(regularExpression) should evaluate to "The","cat","[in the]","hat".

If the regularExpression were "\\s+" it works for whitespace alone. How can I change this to not split when the whitespace is in brackets?

Tips welcome. Thanks!

Brian
 
Sheriff
Posts: 7023
Moving this to the Intermediate forum...
 
author
Posts: 3252
Use a zero-width (negative) look-behind (?<!) to verify that the space isn't preceded by a "[" without balancing "]":

(?<!\[[^\]]*)\s+

Be warned that I didn't try this, but it should be close. Regular expressions: the most useful write-only medium in existence. Don't forget to escape the backslashes if you put this in a String literal.

- Peter
 
Wanderer
Posts: 18671
Mmmmm, I don't think that will work well, Peter. (Welcome back, BTW!) I get:

java.util.regex only does lookbehind if it can determine a maximum length for what it's looking for. Other regex packages may only do it if the expression has a single fixed length. Friedl discusses this, but I don't have my copy with me right now.

It may be possible to adapt this pattern for the problem at hand, but I don't see an easy way to do it. I think it will be easier to write a pattern that matches the tokens themselves (a run of non-whitespace, or anything enclosed in brackets) instead of the separators. E.g.:
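
A minimal sketch of that token-matching idea, not necessarily the pattern from the original post. The class name `BracketTokenizer`, the method `tokenize`, and the exact regex are assumptions made for illustration. It tries a bracketed group first and falls back to a run of non-whitespace, collecting matches with `Matcher.find()` rather than calling `split`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects tokens instead of splitting on separators.
public class BracketTokenizer {
    // First alternative: "[", any characters except "]", then "]".
    // Second alternative: a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\\[[^\\]]*\\]|\\S+");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [The, cat, [in the], hat]
        System.out.println(tokenize("The cat [in the] hat"));
    }
}
```

This sketch assumes a bracketed group starts at a token boundary. In input like "foo[a b]" the second alternative wins at "foo[a", so the group is broken apart; nested brackets are not handled either.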

[ June 05, 2004: Message edited by: Jim Yingst ]
 
town drunk
( and author)
Posts: 4118
For the problem, as stated, the following should work.



However, it doesn't work with nested structures.

What we're saying here is the following: a space that is followed by any sequence of characters other than an open or closed bracket, ending with a closed bracket, sits inside brackets, so it is not a split point.
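
One way to sketch the description above is a negative lookahead. The regex, the class name `BracketSplit`, and the method name `split` are assumptions for illustration and may differ from the pattern originally posted:

```java
import java.util.Arrays;

// Hypothetical helper built around String.split with a negative lookahead.
public class BracketSplit {
    // Whitespace that is NOT followed by "non-bracket characters, then ]".
    // Such whitespace would be inside a bracketed group, so it is skipped.
    static final String SPLIT_REGEX = "\\s+(?![^\\[\\]]*\\])";

    public static String[] split(String input) {
        return input.split(SPLIT_REGEX);
    }

    public static void main(String[] args) {
        // prints [The, cat, [in the], hat]
        System.out.println(Arrays.toString(split("The cat [in the] hat")));
    }
}
```

With nested brackets the lookahead runs into an inner "[" before it reaches a "]", so whitespace just before a nested group is treated as a split point. That is the nested-structure limitation mentioned above.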


HTH,
M
 
Brian Buckley
Greenhorn
Posts: 7
My god that works. Regular expressions freak me out.

I have to sit down and study this...

Brian
 
Max Habibi
town drunk
( and author)
Posts: 4118
I hear some idiot recently wrote a book on regex & Java.

M
 
Peter den Haan
author
Posts: 3252
And a good book on regex & Java to boot. What was his name again? Something that sounded a bit like a cool holiday destination. M... Max... Max... Oh, I don't know.

- Peter
 