I have a CSV file which I have to read through my code. I read the file line by line and break each line into fields using String.split(","), but in some places the data itself contains ",", so the split doesn't work properly.
Joined: Nov 20, 2003
The CSV file itself needs to have a way to distinguish a comma used as a delimiter from a comma in the data. In my CSV the variables are enclosed by quotes: "variable","Variable2","variable3".
Now, using regex with all its capabilities (meaning I don't use it much, but I know it has lots of capabilities), you should be able to isolate your strings.
Treat the first occurrence of " as the opening boundary, then use "," as the boundary between fields until the last ".
I'm no expert in regex unless my back is up against the wall.
Hopefully this is helpful.
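Here is a rough sketch of that idea, assuming every field on the line is wrapped in quotes and no field contains the sequence "," or an escaped quote. The class and method names (QuotedCsvSplit, splitFullyQuoted) are made up for illustration. It strips the first and last quote, then splits on the three-character boundary ",".

```java
import java.util.Arrays;
import java.util.List;

public class QuotedCsvSplit {

    // Assumes the whole line looks like "a","b","c" (every field quoted).
    public static List<String> splitFullyQuoted(String line) {
        String trimmed = line.trim();
        if (trimmed.length() < 2 || !trimmed.startsWith("\"") || !trimmed.endsWith("\"")) {
            throw new IllegalArgumentException("Line is not fully quoted: " + line);
        }
        // Drop the opening and closing quote.
        String inner = trimmed.substring(1, trimmed.length() - 1);
        // Split on the quote-comma-quote boundary; limit -1 keeps trailing empty fields.
        return Arrays.asList(inner.split("\",\"", -1));
    }

    public static void main(String[] args) {
        // Sample line invented for this example.
        for (String field : splitFullyQuoted("\"variable\",\"Variable2\",\"has, a comma\"")) {
            System.out.println("[" + field + "]");
        }
    }
}
```

A comma inside a field survives because the split only fires on the full "," boundary. It will not cope with files where only some fields are quoted.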
Joined: Jan 29, 2003
Depending on the source, your CSV file may have quotes around all strings and no quotes around numbers, or quotes around strings that contain commas. Try to create some with quotes in the values, too, just to see how they are handled. Here's something I got from Excel:
It did not quote the number. It did not quote a simple string. It quoted a string with a comma in it. It quoted the string with quotes in it, doubled the quotes I entered, and warned me that it might not be DOS CSV compatible. Maybe you'll get lucky and won't have to implement that one!
So without the escaped quotes:
With the escaped quotes you'd take up to the next quote, skip it, and then check whether the next character is another quote (an escaped quote, so keep taking) or a comma (the end of the field). Lemme know if it's not clear how "skip" and "take" translate into Java substringing or array copying.
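In case it helps, here is one way the "take" and "skip" steps might look in Java, written as a character-by-character scan rather than substring calls. This is only a sketch: the names CsvLineParser and parseLine are hypothetical, the sample line in main is invented to match the quoting behaviour described above (it is not the actual Excel output), and it assumes a field never spans more than one line.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineParser {

    // Splits one CSV line, honouring quoted fields and doubled ("") quotes.
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote: keep one, skip the other
                        i++;
                    } else {
                        inQuotes = false;  // closing quote: skip it
                    }
                } else {
                    field.append(c);       // "take" everything else, commas included
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote: skip it
            } else if (c == ',') {
                fields.add(field.toString()); // unquoted comma ends the field
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String sample = "1,simple,\"with, comma\",\"with \"\"quotes\"\"\"";
        for (String f : parseLine(sample)) {
            System.out.println("[" + f + "]");
        }
    }
}
```

The same loop handles the case without escaped quotes, since the doubled-quote branch simply never fires.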
Joined: Sep 10, 2002
I had several problems like this with the split function. Try using StringTokenizer instead. I found it more stable than split in these situations.
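Before switching, it may be worth running a quick check like the one below against your own data. As far as I know, StringTokenizer with "," as the delimiter also breaks on commas inside quotes, and it silently skips empty fields, so whether it helps depends on the file. The class name TokenizerCheck and the sample line are made up for the test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerCheck {

    // Collects the tokens StringTokenizer produces for a comma delimiter.
    public static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Three logical fields: a, an empty one, and a quoted "b, c".
        for (String t : tokenize("a,,\"b, c\"")) {
            System.out.println("[" + t + "]");
        }
    }
}
```

If the tokens do not line up with the columns you expect, a quote-aware scan of the line is probably the safer route.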