How to split a String into variable length tokens ?

Kumar Raja
Ranch Hand

Joined: Mar 18, 2010
Posts: 518
    

Hi All,

Is there a built-in library or API, provided by Sun or by an open source project, that would let me split a String into variable-length tokens?

For example, say I have a string like the one below.



I want to split this string into variable-length tokens by providing the sizes in some sort of regex (I could not find the right words for this).

To be more specific, I should be able to split this string based on a copybook structure defined in COBOL, like:

01 STRING1 PIC X(3)
02 STRING2 PIC X(5)
etc

I can write my own user-defined function, but I am wondering whether something already exists.
Thanks



Regards
KumarRaja

Darryl Burke
Bartender

Joined: May 03, 2008
Posts: 4523
    

If you know the lengths, just use substring(...)
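
A minimal sketch of the substring(...) approach, assuming the field lengths are known up front. The class name FixedWidthSplitter, the method split and the sample record "ABCDEFGH" are made up for illustration; only String#substring(int, int) is the real API here.

```java
import java.util.ArrayList;
import java.util.List;

class FixedWidthSplitter {

    // Cuts the input into consecutive fields; widths[i] is the length of field i.
    static List<String> split(String input, int... widths) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            int end = start + width;
            if (end > input.length()) {
                throw new IllegalArgumentException(
                        "Input too short for field ending at index " + end);
            }
            // substring(start, end) includes start and excludes end.
            tokens.add(input.substring(start, end));
            start = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical record: a 3-character field followed by a 5-character field.
        System.out.println(split("ABCDEFGH", 3, 5)); // prints [ABC, DEFGH]
    }
}
```

Any characters left over after the last field are simply ignored in this sketch; whether that should be an error depends on the record format.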


luck, db
There are no new questions, but there may be new answers.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 37976
    
If you want to find a particular token to split on, you can use String#split(java.lang.String).
Or try one of the overloads of String#indexOf() to find where to start your substrings.
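
For illustration, a small sketch of both calls, assuming comma-delimited data (the original question is about fixed widths, so this only applies when a delimiter exists). The class DelimiterSplitDemo, its helper methods and the sample strings are hypothetical; String#split(String), String#indexOf(int) and String#substring(int, int) are the standard methods.

```java
import java.util.Arrays;

class DelimiterSplitDemo {

    // split() takes a regular expression; a plain comma needs no escaping.
    static String[] splitOnComma(String input) {
        return input.split(",");
    }

    // Returns the text before the first delimiter, or the whole input if there is none.
    static String firstField(String input, char delimiter) {
        int pos = input.indexOf(delimiter);
        return pos < 0 ? input : input.substring(0, pos);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnComma("abc,defgh,ij"))); // prints [abc, defgh, ij]
        System.out.println(firstField("abc,defgh", ','));                  // prints abc
    }
}
```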
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 11154
    

a regex will also let you specify how many characters to match:

[a-z]\{4,8\}

will match 4,5,6,7, or 8 lowercase letters.


There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

fred rosenberger wrote:a regex will also let you specify how many characters to match:

[a-z]\{4,8\}

will match 4,5,6,7, or 8 lowercase letters.

Actually, it'd be:

[a-z]{4,8}

(No backslash before the braces. I always get confused about what to backslash. It's different among Java, sed, egrep, and vi.)
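
As a rough illustration of the unescaped quantifier in Java, here is a sketch using java.util.regex. The class QuantifierDemo, the method findRuns and the sample text are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {

    // In Java the braces of a quantifier are written without backslashes.
    private static final Pattern RUN = Pattern.compile("[a-z]{4,8}");

    // Collects every run of 4 to 8 lowercase letters found in the input.
    static List<String> findRuns(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // "abc" is too short; the 12-letter word is consumed greedily as 8 letters, then 4.
        System.out.println(findRuns("abc defgh ijklmnopqrst")); // prints [defgh, ijklmnop, qrst]
    }
}
```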
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7536
    

Kumar Raja wrote: I can write my own user-defined function, but I am wondering whether something already exists.

Seems to me you have 2 tasks:
1. Splitting your string based on predefined lengths.
2. Getting those predefined lengths from a piece of COBOL source.
And of the two, number 2 is going to be a lot harder (the first, as Darryl said, can be done with simple substring()s).
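
To make the two tasks concrete, here is a very rough sketch that handles only lines of the form shown in the question (level number, name, PIC X(n)). Everything else a real copybook can contain, such as numeric pictures, groups, OCCURS or REDEFINES, is ignored, which is why task 2 is the hard part. The class CopybookSketch, its methods and the record "ABCDEFGH" are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CopybookSketch {

    // Assumed line shape: level number, field name, PIC X(n), optional trailing period.
    private static final Pattern FIELD = Pattern.compile(
            "^\\s*\\d+\\s+(\\S+)\\s+PIC\\s+X\\((\\d+)\\)\\s*\\.?\\s*$",
            Pattern.CASE_INSENSITIVE);

    // Task 2: read field names and widths, keeping declaration order.
    static Map<String, Integer> parseWidths(List<String> lines) {
        Map<String, Integer> widths = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = FIELD.matcher(line);
            if (m.matches()) { // lines that do not fit the assumed shape are skipped
                widths.put(m.group(1), Integer.parseInt(m.group(2)));
            }
        }
        return widths;
    }

    // Task 1: cut the record into fields with plain substring() calls.
    static Map<String, String> splitRecord(String record, Map<String, Integer> widths) {
        Map<String, String> fields = new LinkedHashMap<>();
        int start = 0;
        for (Map.Entry<String, Integer> e : widths.entrySet()) {
            int end = start + e.getValue();
            fields.put(e.getKey(), record.substring(start, end));
            start = end;
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> copybook = Arrays.asList("01 STRING1 PIC X(3)", "02 STRING2 PIC X(5)");
        Map<String, Integer> widths = parseWidths(copybook);
        System.out.println(splitRecord("ABCDEFGH", widths)); // prints {STRING1=ABC, STRING2=DEFGH}
    }
}
```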

Winston


Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 11154
    

Jeff Verdegan wrote:
Actually, it'd be:

[a-z]{4,8}

(No backslash before the braces. I always get confused about what to backslash. It's different among Java, sed, egrep, and vi.)

Thanks...yeah, I do a lot of trial and error with that myself...
Kumar Raja
Ranch Hand

Joined: Mar 18, 2010
Posts: 518
    

Thank you all for the suggestions. To me, the regex approach seems easier. Thanks.
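
One possible shape for the regex approach, offered only as a sketch: build one capturing group of the form (.{n}) per field and read the groups back. The class RegexFieldDemo, its methods and the sample record are assumptions for illustration, not something posted in this thread.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexFieldDemo {

    // Widths 3 and 5 produce the pattern "(.{3})(.{5})".
    static Pattern buildPattern(int... widths) {
        StringBuilder sb = new StringBuilder();
        for (int w : widths) {
            sb.append("(.{").append(w).append("})");
        }
        // DOTALL lets '.' match line terminators that may appear inside a record.
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    static String[] split(String input, int... widths) {
        Matcher m = buildPattern(widths).matcher(input);
        // lookingAt() anchors at the start but allows trailing characters.
        if (!m.lookingAt()) {
            throw new IllegalArgumentException("Input shorter than total field width");
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // group 0 is the whole match
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("ABCDEFGH", 3, 5))); // prints [ABC, DEFGH]
    }
}
```

For fixed widths this does the same work as a loop of substring() calls, so the choice is mostly a matter of taste.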
 