Reading A CSV File

 
Greenhorn
Posts: 4
Hello,

I have a CSV file that I have to read in my code. If I read the file line by line and store each line as a String[] using String.split(","), I run into a problem: in some places the data itself contains ",", so the split doesn't work properly.
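To make the problem concrete, here is a minimal sketch. The sample line and the class name NaiveSplitDemo are made up for illustration; they are not from the actual file.

```java
import java.util.Arrays;

class NaiveSplitDemo {
    // Splits on every comma, including commas inside quoted values.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical sample line: three fields, the second one contains a comma.
        String line = "1,\"Smith, John\",London";
        String[] parts = naiveSplit(line);
        System.out.println(parts.length + " parts: " + Arrays.toString(parts));
    }
}
```

The line has three fields, but the split returns four pieces because the quoted name is cut in two.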

Any Suggestions???

Shirish
 
Ranch Hand
Posts: 97
Hi,

The CSV file itself needs a way to distinguish a comma used as a delimiter from a comma in the data. In my CSV the values are enclosed in quotes: "variable","Variable2","variable3".

Using regex with all its capabilities (I don't use it much, but I know it has plenty), you should be able to isolate your strings.

Treat the first " as the opening boundary, then use "," as the boundary between fields until the last ".

I'm not an expert in regex unless my back is up against the wall.


Hopefully this is helpful.
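One way the regex idea could look, as a rough sketch rather than a tested solution: split only on commas that are followed by an even number of double quotes, which means the comma sits outside any quoted field. The class name QuoteAwareSplit and the sample line are assumptions for illustration. It assumes quotes are balanced and that no field contains a line break.

```java
import java.util.Arrays;

class QuoteAwareSplit {
    // A comma followed by an even number of double quotes up to the end of the line,
    // i.e. a comma that is not inside a quoted field.
    private static final String DELIMITER = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static String[] split(String line) {
        String[] fields = line.split(DELIMITER, -1); // -1 keeps trailing empty fields
        for (int i = 0; i < fields.length; i++) {
            String f = fields[i];
            if (f.length() >= 2 && f.startsWith("\"") && f.endsWith("\"")) {
                // Drop the enclosing quotes and collapse doubled quotes into one.
                f = f.substring(1, f.length() - 1).replace("\"\"", "\"");
            }
            fields[i] = f;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical line in the all-quoted style described above.
        String line = "\"variable\",\"Var, 2\",\"variable3\"";
        System.out.println(Arrays.toString(split(line)));
    }
}
```

The lookahead rescans the rest of the line for every comma, so this is fine for ordinary rows but may be slow on very long lines.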
 
(instanceof Sidekick)
Posts: 8791
Depending on the source, your CSV file may have quotes around all strings and no quotes around numbers, or quotes around strings that contain commas. Try to create some with quotes in the values, too, just to see how they are handled. Here's something I got from Excel:

It did not quote the number. It did not quote a simple string. It quoted a string with a comma in it. It quoted the string with quotes in it, doubled the quotes I entered, and warned me that it might not be DOS CSV compatible. Maybe you'll get lucky and won't have to implement that one!

So without the escaped quotes:

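With the escaped quotes, you'd take everything up to the next quote, skip that quote, and check what comes next: another quote means an escaped quote that belongs to the value, while a comma means the field has ended. Let me know if it's not clear how "skip" and "take" translate into Java substringing or array copying.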
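For what it's worth, here is one way "take" and "skip" might translate into Java. It is a sketch of a character-by-character scan, assuming Excel-style quoting (quotes around fields that need them, a quote inside a value written as two quotes) and one record per line. The class name CsvLineParser and the sample line are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLineParser {
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote: take one, skip the other
                        i++;
                    } else {
                        inQuotes = false;    // closing quote: skip it
                    }
                } else {
                    current.append(c);       // take everything else, commas included
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote: skip it
            } else if (c == ',') {
                fields.add(current.toString()); // a delimiter ends the field
                current.setLength(0);
            } else {
                current.append(c);           // take an unquoted character
            }
        }
        fields.add(current.toString());      // the last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical line: a number, a plain string, a string with a comma,
        // and a string with doubled quotes inside.
        String line = "1,abc,\"d,e\",\"say \"\"hi\"\"\"";
        for (String field : parse(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

A StringBuilder stands in for the substringing here; the same scan works with substring calls if you track the start index of each field instead.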
 
Ranch Hand
Posts: 127
I had several problems like this with the split function.
Try using StringTokenizer instead.
I found it more stable than split in these situations.
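Before switching, it may be worth checking how StringTokenizer behaves on your data. The sketch below uses made-up sample lines and a hypothetical class name, TokenizerDemo. With a plain "," delimiter, the tokenizer still breaks on commas inside quotes, and it also drops empty fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects every token StringTokenizer yields for a comma delimiter.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical lines: one with an empty middle field, one with a quoted comma.
        System.out.println(tokens("a,,c").size());          // empty field is skipped
        System.out.println(tokens("\"Smith, John\"").size()); // quoted comma still splits
    }
}
```

Both calls report two tokens, so the first line loses its empty field and the second has its single quoted value cut in half.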
 