Splitting A Line Into Two or More Effectively

 
Greenhorn
Posts: 6
I have one line formed as a continuous String as in:

.

I am not sure of the best way to split it into <arg3>(line)</arg3> pairs. I have experimented with indexOf, but found that it was not very effective. Someone also suggested using a regex to split it, but I am not sure how effective that would be on a long string like the one above.

If a regex is suitable, what would the right one be for the string above?

Thanks.
 
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
What exactly do you mean by "effective"?

What did you try (show us your code)? Did it do what you expected, or not? What exactly do you expect?
 
Bartender
Posts: 1166
17
Netbeans IDE Java Linux
And what do you mean by 'split'? That is, what do you want to do with the text between the <arg3>(line)</arg3> chunks?
 
Melvin Mah
Greenhorn
Posts: 6
Ah okay. I think I wasn't clear enough.

Instead of the one line I am getting now, I want to break it into pairs (each starting with <arg3> and ending with </arg3>). The pairs will be stored in a single String array, one pair per element.

The problem right now is that I still cannot split it up.
 
Richard Tookey
Bartender
Posts: 1166
17
Netbeans IDE Java Linux
And discard anything in between? For example, in </arg3><#INS#><arg3>, discard the <#INS#>?
 
Melvin Mah
Greenhorn
Posts: 6

Yes, <#INS#> can be discarded. It is merely a separator.
 
Richard Tookey
Bartender
Posts: 1166
17
Netbeans IDE Java Linux
I can immediately see two approaches:

1) Loop using indexOf(), first to find "<arg3>" and then to find "</arg3>", where the start point of each indexOf() call is the result of the last successful indexOf(). Use substring() to extract the part you want. Break out of the loop when either indexOf() fails.
2) Write a regular expression for split() that matches when it looks behind to find "</arg3>", then reluctantly matches anything, and then looks ahead to find "<arg3>".

I prefer the second option (it only takes one line), but if you are new to regular expressions you may find it difficult.
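
The first option can be sketched roughly as follows. The sample input is an assumption, since the original string was not preserved in the thread, and the class and method names (IndexOfSplitter, extractChunks) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfSplitter {
    private static final String OPEN = "<arg3>";
    private static final String CLOSE = "</arg3>";

    // Collects every "<arg3>...</arg3>" chunk and ignores whatever lies between chunks.
    static List<String> extractChunks(String line) {
        List<String> chunks = new ArrayList<>();
        int from = 0;
        while (true) {
            // Search for the next opening tag, starting where the last chunk ended.
            int start = line.indexOf(OPEN, from);
            if (start < 0) {
                break; // no more opening tags
            }
            // Search for the matching closing tag after the opening tag.
            int end = line.indexOf(CLOSE, start + OPEN.length());
            if (end < 0) {
                break; // unterminated chunk: stop rather than guess
            }
            end += CLOSE.length(); // include the closing tag in the chunk
            chunks.add(line.substring(start, end));
            from = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Assumed sample input; the real string from the question is unknown.
        String line = "<arg3>first</arg3><#INS#><arg3>second</arg3>";
        String[] result = extractChunks(line).toArray(new String[0]);
        for (String chunk : result) {
            System.out.println(chunk);
        }
    }
}
```

Because each search starts at the end of the previous chunk, the line is scanned only once, and the <#INS#> separators are skipped without being handled explicitly.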
 
Melvin Mah
Greenhorn
Posts: 6




You are right, the second option is easier. I tried ([^\\;\\]*[^;<]) and it is the closest I have come to what I am looking for. I first replaced <#INS#> with ";", which is an easier delimiter. I am not sure whether that is a good idea.
 
Richard Tookey
Bartender
Posts: 1166
17
Netbeans IDE Java Linux
That doesn't look at all right.
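
For comparison, a split() pattern built from look-arounds, of the kind suggested earlier in the thread, might look like the sketch below. The exact pattern, the sample input, and the names RegexSplitter and splitChunks are assumptions, not something posted by the participants.

```java
import java.util.Arrays;

class RegexSplitter {
    // Matches the text between chunks: preceded by "</arg3>" (lookbehind),
    // then as little as possible (reluctant .*?), followed by "<arg3>" (lookahead).
    // The look-arounds consume nothing, so both tags stay in the resulting pieces.
    static final String BETWEEN_CHUNKS = "(?<=</arg3>).*?(?=<arg3>)";

    static String[] splitChunks(String line) {
        return line.split(BETWEEN_CHUNKS);
    }

    public static void main(String[] args) {
        // Assumed sample input; the real string from the question is unknown.
        String line = "<arg3>first</arg3><#INS#><arg3>second</arg3><#INS#><arg3>third</arg3>";
        System.out.println(Arrays.toString(splitChunks(line)));
        // prints [<arg3>first</arg3>, <arg3>second</arg3>, <arg3>third</arg3>]
    }
}
```

Note that any text before the first <arg3> or after the last </arg3> would remain attached to the first or last element, and that "." does not match line terminators unless the pattern is compiled with DOTALL.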
 
Rancher
Posts: 2759
32
Eclipse IDE Spring Tomcat Server
Is the <#INS#> separator guaranteed to be there in your input stream? You could just separate by the separator... hence the name separator.
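
If the separator is always present, splitting on it directly could be as simple as the sketch below. The sample input and the names SeparatorSplitter and splitOnSeparator are hypothetical; Pattern.quote is used so that the separator is treated as literal text rather than as a regular expression.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SeparatorSplitter {
    static final String SEPARATOR = "<#INS#>";

    // Splits the line on the literal separator; the chunks keep their <arg3> tags.
    static String[] splitOnSeparator(String line) {
        return line.split(Pattern.quote(SEPARATOR));
    }

    public static void main(String[] args) {
        // Assumed sample input; the real string from the question is unknown.
        String line = "<arg3>first</arg3><#INS#><arg3>second</arg3>";
        System.out.println(Arrays.toString(splitOnSeparator(line)));
        // prints [<arg3>first</arg3>, <arg3>second</arg3>]
    }
}
```

This only works when every pair of chunks is separated by <#INS#>; if the separator is missing somewhere, two chunks end up in the same array element.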