Regular Expressions in String.split()

 
Ranch Hand
Posts: 82
Hi,

I want to split a String using the character | as the split token. However, when this token is escaped with '\', i.e. '\|', the split should not take place at that point.
For example:

will split the string into 4 parts.
But how do I prevent the second "bla" from being split from the third "bla"?

The result should look like this:


I've tried all sorts of regexp combinations. None of them did what I expected.
I've spent hours on this problem and I'm about to give up.
I would really appreciate any help.

thank you in advance- Carcophan
 
Ranch Hand
Posts: 142
Maybe it's not as efficient as using regexps, but why don't you write your own simple splitter in a few minutes?
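For anyone wondering what such a hand-written splitter might look like, here is a minimal sketch. It assumes that '\' escapes the character that follows it and that escape sequences should stay in the output unchanged; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class EscapeAwareSplitter {

    /** Splits on '|' unless it is escaped by '\'. Escape sequences are kept verbatim. */
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false; // true when the previous character was an escaping backslash
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                // Whatever follows an escaping backslash is plain text.
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                current.append(c);
                escaped = true;
            } else if (c == '|') {
                // Unescaped pipe: close the current segment.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("bla|bla\\|bla|bla")); // [bla, bla\|bla, bla]
    }
}
```

Unlike String.split(), this sketch keeps trailing empty segments, and because it tracks the escape state it also handles an escaped backslash directly in front of a pipe.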
 
Sheriff
Posts: 22815
First of all, I do not believe you if you say that using "|" will work. The vertical bar is a metacharacter in regular expressions. Also, your two lines of code won't even compile, as the \ should be escaped. Post Real Code.
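A small sketch of the metacharacter point, using only String.split() and Pattern.quote(). The class and method names are hypothetical, and the sample input is just an assumption.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class PipeMetacharacterDemo {

    // A bare "|" is an alternation of two empty patterns, so it matches
    // the empty string at every position instead of the pipe itself.
    static String[] splitBare(String s) {
        return s.split("|");
    }

    // "\\|" in Java source is the regex \| which matches a literal pipe.
    // Writing "\|" in Java source would not compile (invalid escape).
    static String[] splitEscaped(String s) {
        return s.split("\\|");
    }

    // Pattern.quote() builds a literal pattern without manual escaping.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitBare("bla|bla")));    // not [bla, bla]
        System.out.println(Arrays.toString(splitEscaped("bla|bla"))); // [bla, bla]
        System.out.println(Arrays.toString(splitQuoted("bla|bla")));  // [bla, bla]
    }
}
```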

In the JavaDoc of java.util.regex.Pattern, do a search for lookbehind and you should find a solution. Of course this solution will again break if you do want to split at \\|, then again not at \\\| etc. That's going to be quite a bit harder.
 
Ranch Hand
Posts: 1183
I came up with this but for some reason instead of returning

[bla, bla\|bla, bla]

it returns

[bl, bla\|bl, bla]

which is close but where are the two a letters?
 
author
Posts: 23958

which is close but where are the two a letters?



The two a letters were used as part of the delimiters.

Try...




BTW, to the original poster... I would be hesitant to use any of the solutions posted in this topic. From your question, it looks like you are a beginner with regexes and are unlikely to understand the solutions posted. And it is never a good idea to use something that you don't understand.

Henry
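The difference between a delimiter that consumes a character and a zero-width lookbehind can be sketched as follows. Both patterns are guesses for illustration: the first is assumed to resemble the earlier attempt (it reproduces the reported output), and the second is one way to apply the negative lookbehind described in the java.util.regex.Pattern documentation.

```java
import java.util.Arrays;

final class LookbehindSplitDemo {

    // regex: [^\\]\|  -> "any character except a backslash, then a pipe".
    // The character in front of the pipe becomes part of the delimiter and is lost.
    static final String CONSUMING = "[^\\\\]\\|";

    // regex: (?<!\\)\|  -> "a pipe not preceded by a backslash".
    // The lookbehind is zero-width, so only the pipe itself is the delimiter.
    static final String LOOKBEHIND = "(?<!\\\\)\\|";

    static String[] splitConsuming(String input) {
        return input.split(CONSUMING);
    }

    static String[] splitLookbehind(String input) {
        return input.split(LOOKBEHIND);
    }

    public static void main(String[] args) {
        String input = "bla|bla\\|bla|bla"; // the text bla|bla\|bla|bla
        System.out.println(Arrays.toString(splitConsuming(input)));  // [bl, bla\|bl, bla]
        System.out.println(Arrays.toString(splitLookbehind(input))); // [bla, bla\|bla, bla]
    }
}
```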
 
Sebastian Janisch
Ranch Hand
Posts: 1183

Henry Wong wrote:

which is close but where are the two a letters?



The two a letters were used as part of the delimiters.

Try...



Henry




@Joe Carco ... This is what you want. ..
 
Joe carco
Ranch Hand
Posts: 82

Rob Prime wrote: First of all, I do not believe you if you say that using "|" will work. The vertical bar is a metacharacter in regular expressions. Also, your two lines of code won't even compile, as the \ should be escaped. Post Real Code.

In the JavaDoc of java.util.regex.Pattern, do a search for lookbehind and you should find a solution. Of course this solution will again break if you do want to split at \\|, then again not at \\\| etc. That's going to be quite a bit harder.



OK, I admit it's not real code, but the problem is real. The input I'm parsing does in fact have the pipe "|" as a segment marker that needs to be split on, and a "\" as an escape character. I decided not to copy/paste any code but just to post the code from memory.

@Henry, thank you so much for your help. I didn't want to write my own String splitter because I was certain that it could be done with regular expressions.
 
Sebastian Janisch
Ranch Hand
Posts: 1183

@Henry, thank you so much for your help. I didn't want to write my own String splitter because I was certain that it could be done with regular expressions.



Note though that your custom splitter could be faster than employing the heavy regex engine.
 
Henry Wong
author
Posts: 23958

Sebastian Janisch wrote:

@Henry, thank you so much for your help. I didn't want to write my own String splitter because I was certain that it could be done with regular expressions.



Note though that your custom splitter could be faster than employing the heavy regex engine.



Also, it could have been done in 15 minutes. Instead, you "spent hours on this problem", gave up, got the solution here, and now have a solution that you don't understand. Is that really better?

Henry
 
Joe carco
Ranch Hand
Posts: 82
Yes, it is. Writing a custom splitter would have been a boring task. Now I had the chance to delve into regex a bit. Learnt something new today! Cheers
 
Ranch Hand
Posts: 266

Sebastian Janisch wrote:

Henry Wong wrote:

which is close but where are the two a letters?



The two a letters were used as part of the delimiters.

Try...



Henry




@Joe Carco ... This is what you want. ..



It will not work if a backslash is part of the text and comes just before the pipe character:

Of course, this might never occur in the OP's input... But if it is possible, the OP should devise a different solution (or a bit more tricky split(...) regex).
 
Henry Wong
author
Posts: 23958

It will not work if a backslash is part of the text and comes just before the pipe character:



Not exactly sure what you mean. Isn't this what the OP wanted? To not split when the pipe character is escaped?

Henry
 
Piet Verdriet
Ranch Hand
Posts: 266

Henry Wong wrote:

It will not work if a backslash is part of the text and comes just before the pipe character:



Not exactly sure what you mean. Isn't this what the OP wanted? To not split when the pipe character is escaped?

Henry



Say the OP wants to split on the unescaped pipe:

and

But what if the text can contain a backslash that is not used to escape the pipe symbol? A natural choice would be to escape that backslash like this:

The solution(s) proposed in this thread will not split on the pipe before 'c', because a backslash sits directly in front of it, while a split at that point might well be the OP's intention.
But, like I said: this might very well not occur in the OP's input, but if it can occur, I thought I'd just mention it.

In short: the OP might be looking for a way to split on the pipe only if it has an even number of backslashes (zero included) before it, since an odd number means the pipe itself is escaped.
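To make the backslash-counting point concrete, here is a small sketch. The three inputs are assumptions chosen for illustration, and the pattern is the simple negative lookbehind, which only looks at the single character in front of the pipe.

```java
import java.util.Arrays;

final class BackslashParityDemo {

    // regex: (?<!\\)\|  -> a pipe not directly preceded by a backslash.
    static String[] naiveSplit(String input) {
        return input.split("(?<!\\\\)\\|");
    }

    public static void main(String[] args) {
        // Text a|b: no backslash, so the pipe is a delimiter.
        System.out.println(Arrays.toString(naiveSplit("a|b")));      // [a, b]
        // Text a\|b: one backslash, the pipe is escaped and must stay.
        System.out.println(Arrays.toString(naiveSplit("a\\|b")));    // [a\|b]
        // Text a\\|c: the backslash itself is escaped, so the pipe is really
        // a delimiter, but the one-character lookbehind cannot tell.
        System.out.println(Arrays.toString(naiveSplit("a\\\\|c")));  // [a\\|c]
    }
}
```

The last case is the one at issue: two backslashes form an escaped backslash, leaving the pipe unescaped, yet no split happens.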
 
Henry Wong
author
Posts: 23958

In short: the OP might be looking for a way to split on the pipe only if it has an even number of backslashes (zero included) before it, since an odd number means the pipe itself is escaped.



Interesting. I never even saw it. Good catch.

Henry
 
Rob Spoor
Sheriff
Posts: 22815
Which I already mentioned:

Rob Prime wrote:In the JavaDoc of java.util.regex.Pattern, do a search for lookbehind and you should find a solution. Of course this solution will again break if you do want to split at \\|, then again not at \\\| etc. That's going to be quite a bit harder.

 
Piet Verdriet
Ranch Hand
Posts: 266

Rob Prime wrote:Which I already mentioned:

Rob Prime wrote:In the JavaDoc of java.util.regex.Pattern, do a search for lookbehind and you should find a solution. Of course this solution will again break if you do want to split at \\|, then again not at \\\| etc. That's going to be quite a bit harder.



Indeed, missed your response!
 
Henry Wong
author
Posts: 23958
Hmmm.... How about this for checking odd number of backslashes?



Unfortunately, this can only check for an odd number of backslashes up to 2001 backslashes, then it breaks...
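One way such an odd-count check might be written is sketched below. This is an assumption, not necessarily the pattern originally posted: it nests a lookbehind inside a bounded lookbehind, and the bound of 1000 pairs is chosen only because it gives the 2001-backslash limit mentioned above. Java needs a lookbehind to have a bounded maximum length, which is why an open-ended quantifier is not used.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class OddBackslashLookbehind {

    // regex: (?<!(?<!\\)(?:\\\\){0,1000}\\)\|
    // Reads as: a pipe NOT preceded by a run of backslashes that
    //   - starts where no further backslash precedes it,
    //   - contains 0 to 1000 pairs of backslashes,
    //   - plus one more backslash (so the run has odd length, at most 2001).
    static final Pattern UNESCAPED_PIPE =
            Pattern.compile("(?<!(?<!\\\\)(?:\\\\\\\\){0,1000}\\\\)\\|");

    static String[] split(String input) {
        return UNESCAPED_PIPE.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("a|c")));         // [a, c]
        System.out.println(Arrays.toString(split("a\\|c")));       // [a\|c]
        System.out.println(Arrays.toString(split("a\\\\|c")));     // [a\\, c]
        System.out.println(Arrays.toString(split("a\\\\\\|c")));   // [a\\\|c]
    }
}
```

Compiling the pattern once and reusing it avoids recompiling on every call, which String.split() would otherwise do for a pattern like this.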


To Joe Carco, don't you wish you had written your own "boring" splitter instead, now? ...

Henry
 
Piet Verdriet
Ranch Hand
Posts: 266

Henry Wong wrote:Hmmm.... How about this for checking odd number of backslashes?



Unfortunately, this can only check for an odd number of backslashes up to 2001 backslashes, then it breaks...


To Joe Carco, don't you wish you had written your own "boring" splitter instead, now? ...

Henry





You could always replace 1000 with Integer.MAX_VALUE
 
Rob Spoor
Sheriff
Posts: 22815
Make that Long.MAX_VALUE just to be sure.
 
Joe carco
Ranch Hand
Posts: 82



FREAK!
 