Regular Expression

Abi Raj
Greenhorn

Joined: Jan 28, 2005
Posts: 28
I want to split the following string into an array of strings:

1,3,6,"1,243", "1,097,937", 89

The array should look like this:
1
3
6
1,243
1,097,937
89

Can anyone give me a regular expression to use with the split function? Or is there some other way to do this?

Thanks
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18529

Regular expressions are actually not very good for CSV, mainly because of the nesting nature of quotes.

However, if you only want quotes to be one level deep, you could try this...
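The snippet posted at this point is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of a positive-matching approach (not necessarily Henry's code): instead of splitting on commas, it matches each field with `java.util.regex.Pattern` and `Matcher`. The class name `CsvMatch` and the method `fields` are made up for illustration. It assumes quotes are only one level deep, trims whitespace around unquoted fields, and silently drops empty unquoted fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CsvMatch {
    // Group 1: text inside a double-quoted field (may contain commas).
    // Group 2: a run of characters that are neither commas nor quotes.
    private static final Pattern FIELD =
            Pattern.compile("\"([^\"]*)\"|([^,\"]+)");

    // Hypothetical helper: collects every field the pattern finds in the line.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            if (m.group(1) != null) {
                out.add(m.group(1));            // quoted field, kept verbatim
            } else {
                String f = m.group(2).trim();   // unquoted field, trimmed
                if (!f.isEmpty()) {             // skip whitespace-only runs
                    out.add(f);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [1, 3, 6, 1,243, 1,097,937, 89]
        System.out.println(fields("1,3,6,\"1,243\", \"1,097,937\", 89"));
    }
}
```

Note that this never reports an error: anything the pattern cannot match is simply skipped.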



Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18529

Sorry, I don't think I answered the question. Hopefully, I can be proven wrong on this, but ...

I don't think it is possible to split using a regular expression. The quoting nature will require some combination of look-ahead (and look-behinds). And I don't think variable length look-aheads and look-behinds are supported with the split operation.

Henry
Alan Moore
Ranch Hand

Joined: May 06, 2004
Posts: 262
You're right, Henry; split() is virtually useless when it comes to CSV data. A positive matching approach like the one you suggested will work, but it can't be counted on to fail when the input is malformed--it will just return garbage results. For CSV data, you're much better off using a dedicated CSV parser/tokenizer. There are several good, free ones floating around.
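To illustrate the difference Alan describes, here is a minimal sketch of a hand-written tokenizer that rejects malformed input instead of returning garbage. It is not one of the free parsers he mentions; the class name `CsvTokenizer` and its rules (one level of quotes, no escape sequences, whitespace trimmed around unquoted fields) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvTokenizer {
    // Hypothetical helper: splits one line into fields, throwing on bad input.
    static List<String> parse(String line) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;   // currently between an opening and closing quote
        boolean wasQuoted = false;  // the current field was written in quotes

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    inQuotes = false;          // closing quote
                } else {
                    cur.append(c);             // commas are plain text here
                }
            } else if (c == '"') {
                // A quote may only open a field, after optional whitespace.
                if (wasQuoted || !cur.toString().trim().isEmpty()) {
                    throw new IllegalArgumentException("unexpected quote at index " + i);
                }
                cur.setLength(0);              // discard leading whitespace
                inQuotes = true;
                wasQuoted = true;
            } else if (c == ',') {
                out.add(wasQuoted ? cur.toString() : cur.toString().trim());
                cur.setLength(0);
                wasQuoted = false;
            } else if (wasQuoted) {
                // Only whitespace may follow a closing quote before the comma.
                if (!Character.isWhitespace(c)) {
                    throw new IllegalArgumentException("text after closing quote at index " + i);
                }
            } else {
                cur.append(c);
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("unterminated quote");
        }
        out.add(wasQuoted ? cur.toString() : cur.toString().trim());
        return out;
    }
}
```

With this sketch, `CsvTokenizer.parse("1,\"2")` throws an `IllegalArgumentException` rather than quietly producing a field list.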
Abi Raj
Greenhorn

Joined: Jan 28, 2005
Posts: 28
Hi Henry & Alan,

Thanks for your reply. The solution given by Henry works fine, but I have another problem: if a quoted field contains an escaped double quote, it is not parsed properly.

For example, given
1,3,6,"1,\"243", "1,097,937", 89

the array should look like this:
1
3
6
1,"243
1,097,937
89

Can anyone help me?

Thanks in Advance.
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18529

Abi,

Did you pay attention to the conversation between Alan and me? To summarize, regular expressions are not good for CSV data. Even with quotes only one level deep (and no error checking), it can get ridiculously complex.

I really suggest that you get a CSV parser... but... I will give you one last enhancement.
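The enhanced snippet posted here is also not preserved in this copy of the thread. As a stand-in, here is a minimal sketch (not necessarily Henry's code) of how the matching pattern can be extended so that a backslash escapes the next character inside a quoted field. It assumes the data really contains a literal backslash before the inner quote, as in Abi's example; many CSV dialects double the quote instead, which this sketch does not handle. The class name `CsvMatchEscaped` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CsvMatchEscaped {
    // Group 1: quoted field body, where each item is either a character that is
    //          not a quote or backslash, or a backslash followed by any character.
    // Group 2: a run of characters that are neither commas nor quotes.
    private static final Pattern FIELD =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"|([^,\"]+)");

    // Hypothetical helper: collects fields and removes the escaping backslashes.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            if (m.group(1) != null) {
                // Turn each backslash-escaped character into the character itself.
                out.add(m.group(1).replaceAll("\\\\(.)", "$1"));
            } else {
                String f = m.group(2).trim();
                if (!f.isEmpty()) {
                    out.add(f);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The Java literal below encodes the text: 1,3,6,"1,\"243", "1,097,937", 89
        // prints [1, 3, 6, 1,"243, 1,097,937, 89]
        System.out.println(fields("1,3,6,\"1,\\\"243\", \"1,097,937\", 89"));
    }
}
```

Like the simpler pattern, this still does no error checking, which is why a real CSV parser remains the better choice.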



Henry
Abi Raj
Greenhorn

Joined: Jan 28, 2005
Posts: 28
Hi Henry,

I noted your point, but I thought regular expressions would be enough to solve this problem. Thanks for the solution.

Thanks,
Abi
 