regular expression

Brian Buckley
Greenhorn

Joined: Jan 12, 2003
Posts: 7
I want a regular expression to split a String on whitespace except not when the whitespace is within brackets.

So for example "The cat [in the] hat".split(regularExpression) should evaluate to "The","cat","[in the]","hat".

If the regularExpression were "\\s+" it works for whitespace alone. How can I change this to not split when the whitespace is in brackets?

Tips welcome. Thanks!

Brian
Dirk Schreckmann
Sheriff

Joined: Dec 10, 2001
Posts: 7023
Moving this to the Intermediate forum...


Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Use a zero-width (negative) look-behind (?<!) to verify that the space isn't preceded by a "[" without balancing "]":

(?<!\[[^\]]*)\s+

Be warned that I didn't try this, but it should be close. Regular expressions: the most useful write-only medium in existence. Don't forget to escape the backslashes if you put this in a String literal.

- Peter
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Mmmmm, I don't think that will work well, Peter. (Welcome back, BTW!) I get:

java.util.regex only does lookbehind if it can determine a maximum length for what it's looking for. Other regex packages may only do it if the expression has a single fixed length. Friedl discusses this, but I don't have my copy with me right now.
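
To illustrate the maximum-length point, here is a minimal sketch that caps the look-behind with a bounded quantifier. The cap of 100 characters, the class name and the method name are assumptions made for this example. Bracketed text longer than the cap would be split incorrectly, so this is a workaround rather than a general fix.

```java
import java.util.regex.Pattern;

class BoundedLookbehindSplit {
    // Whitespace that is NOT preceded by "[" plus up to 100 non-"]" characters.
    // The {0,100} bound is an arbitrary assumption; it gives the look-behind
    // a maximum length that java.util.regex can determine.
    // Backslashes are doubled because the pattern lives in a String literal.
    static final Pattern SPLITTER = Pattern.compile("(?<!\\[[^\\]]{0,100})\\s+");

    static String[] split(String input) {
        return SPLITTER.split(input);
    }

    public static void main(String[] args) {
        for (String token : split("The cat [in the] hat")) {
            System.out.println(token);
        }
    }
}
```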

It may be possible to adapt this pattern for the problem at hand, but I don't see an easy way to do it. I think it will be easier to write a pattern to match the non-whitespace (or anything enclosed in brackets) instead. E.g.:
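
One way to sketch the match-instead-of-split idea (an illustration only, not necessarily the pattern originally posted): match either a whole bracketed group or a run of non-whitespace, and collect the matches with a Matcher. The class and method names are hypothetical, and the sketch assumes brackets are not nested and that a bracketed group starts its own token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketTokenizer {
    // First alternative: "[", any characters except "]", then "]".
    // Second alternative: a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\\[[^\\]]*\\]|\\S+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("The cat [in the] hat"));
    }
}
```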

[ June 05, 2004: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Max Habibi
town drunk
( and author)
Sheriff

Joined: Jun 27, 2002
Posts: 4118
For the problem, as stated, the following should work.



However, it doesn't work with nested structures.

What we're saying here is the following: a space that is followed by any sequence of characters, so long as those characters are anything other than an open or closed bracket, but which ends with a closed bracket.
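
A minimal sketch of a pattern fitting this description, assuming a negative look-ahead is used to exclude such spaces from the split (this is not necessarily the exact expression posted above, and the class and method names are hypothetical). As noted, it does not handle nested brackets.

```java
import java.util.regex.Pattern;

class LookaheadSplit {
    // Split on whitespace UNLESS it is followed by non-bracket characters
    // and then a closing "]", i.e. unless it sits inside a bracketed group.
    // Backslashes are doubled because the pattern lives in a String literal.
    private static final Pattern SPLITTER = Pattern.compile("\\s+(?![^\\[\\]]*\\])");

    static String[] split(String input) {
        return SPLITTER.split(input);
    }

    public static void main(String[] args) {
        for (String token : split("The cat [in the] hat")) {
            System.out.println(token);
        }
    }
}
```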


HTH,
M


Java Regular Expressions
Brian Buckley
Greenhorn

Joined: Jan 12, 2003
Posts: 7
My god that works. Regular expressions freak me out.

I have to sit down and study this...

Brian
Max Habibi
town drunk
( and author)
Sheriff

Joined: Jun 27, 2002
Posts: 4118
I hear some idiot recently wrote a book on regex & Java.

M
Peter den Haan
author
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
And a good book on regex & Java to boot. What was his name again? Something that sounded a bit like a cool holiday destination. M... Max... Max... Oh, I don't know.

- Peter