StringTokenizer to read CSV file

 
Ahamed Ali
Greenhorn
Posts: 19
Hi,

I am trying to convert a CSV file to XML. I am using StringTokenizer to
read the CSV file, but it is not clear to me how to do this properly.

---------------------------CSV File-------------------------------------
[Header]
'devon','add','"LR","UOP"','test'
[Identification]
'northdevon.ac.uk','123456','',''
[Personal]
'Smith','West','Carol','Mrs','31/12/1977','F','156 Old Mill Lane, Worcester','WO34 5GG','ER234557H','','Favourite food?'
------------------------------------------------------------------

Can anyone suggest how to implement this, or point me to links with similar examples?

I tried StringTokenizer, but couldn't get the proper results.
-------------------------------------------------------------------------
public int transformMethod(String csvFileName, String xmlFileName) {
    int rowsCount = -1;
    try {
        Document newDoc = domBuilder.newDocument();
        Element rootElement = newDoc.createElement("learnerinformation");
        newDoc.appendChild(rootElement);

        // Read the comma-separated file
        BufferedReader csvReader = new BufferedReader(new FileReader(csvFileName));
        int fieldCount = 0;
        String[] csvFields = null;
        StringTokenizer stringTokenizer = null;

        // Assumption: the first line in the CSV file holds the column/field names.
        // The column names are used to name the elements in the XML file, so
        // avoid spaces or any other characters not suitable for XML element names.
        String curLine = csvReader.readLine();
        if (curLine != null) {
            stringTokenizer = new StringTokenizer(curLine, ",");
            fieldCount = stringTokenizer.countTokens();
            if (fieldCount > 0) {
                csvFields = new String[fieldCount];
                int i = 0;
                while (stringTokenizer.hasMoreElements())
                    csvFields[i++] = String.valueOf(stringTokenizer.nextElement());
            }
        }

        // Now we know the columns; read the data rows
        while ((curLine = csvReader.readLine()) != null) {
            stringTokenizer = new StringTokenizer(curLine, ",");
            fieldCount = stringTokenizer.countTokens();

            if (fieldCount > 0) {
                Element rowElement = newDoc.createElement("Contenentype");
                //Element rowElement1 = newDoc.createElement("referential");

                int i = 0;
                while (stringTokenizer.hasMoreElements()) {
                    try {
                        String curValue = String.valueOf(stringTokenizer.nextElement());
                        Element curElement = newDoc.createElement(csvFields[i++]);
                        curElement.appendChild(newDoc.createTextNode(curValue));
                        rowElement.appendChild(curElement);
                        // rowElement1.appendChild(curElement);
                    } catch (Exception exp) {
                        // Report the problem instead of silently ignoring it, e.g. a row
                        // with more values than column names or an invalid element name
                        System.err.println("Skipping field: " + exp);
                    }
                }
                rootElement.appendChild(rowElement);
                rowsCount++;
            }
        }
        csvReader.close();
        // The posted code stops here. The catch and return below are assumed,
        // added only so that the method compiles.
    } catch (Exception exp) {
        exp.printStackTrace();
    }
    return rowsCount;
}
----------------------------------------------------------------------

Thanks in advance.


Fyrose
 
Ryan McGuire
Ranch Hand
Posts: 1057
This is crossposted in the I/O And Streams forum.
 
Alan Moore
Ranch Hand
Posts: 262
StringTokenizer is useless for parsing CSV data, as is String#split(). Google for "Java CSV parser".
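
To make the point concrete, here is a minimal sketch that runs StringTokenizer over the [Header] line from the first post. The class name TokenizerPitfalls and the unquoted sample lines "a,,c" and "a,b,," are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerPitfalls {

    // Collects every token StringTokenizer produces when "," is the delimiter.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // [Header] line from the first post: 4 fields, one of which contains a comma.
        String header = "'devon','add','\"LR\",\"UOP\"','test'";
        System.out.println(tokenize(header).size() + " tokens: " + tokenize(header));

        // Hypothetical unquoted line with an empty middle field.
        String sparse = "a,,c";
        System.out.println(tokenize(sparse).size() + " tokens: " + tokenize(sparse));

        // split() keeps the empty middle field but drops trailing empty fields.
        System.out.println("a,b,,".split(",").length + " parts from split");
    }
}
```

The header line has four fields but comes back as five tokens, because the comma inside the quoted third field is treated as a delimiter. The line "a,,c" yields only two tokens, since StringTokenizer never returns empty tokens, and "a,b,,".split(",") yields two parts because trailing empty strings are discarded.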
 
D Rog
Ranch Hand
Posts: 472
I have code for a decent CSV tokenizer. If I find that more than one person is interested, I'll release it as open source and publish it.
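
D Rog's tokenizer is not shown in this thread. As a rough idea of what a quote-aware tokenizer involves, here is a minimal sketch, not his code. It assumes fields are wrapped in a single quote character, as in the sample file from the first post, that a doubled quote inside a quoted field stands for one literal quote, and that a record never spans more than one line. The class name QuotedCsvLine and the method split are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvLine {

    // Splits one line on commas that lie outside quoted sections.
    // Assumption: a doubled quote character inside a quoted field is one literal quote.
    static List<String> split(String line, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    current.append(quote); // escaped quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the value
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        // [Personal] line from the first post.
        String personal = "'Smith','West','Carol','Mrs','31/12/1977','F',"
                + "'156 Old Mill Lane, Worcester','WO34 5GG','ER234557H','','Favourite food?'";
        List<String> fields = split(personal, '\'');
        System.out.println(fields.size() + " fields");
        System.out.println(fields.get(6));
    }
}
```

On the [Personal] line this gives eleven fields, with the address "156 Old Mill Lane, Worcester" kept in one piece and the empty field preserved. A field containing a line break would still need extra handling, which is one reason a dedicated CSV library is usually the safer choice.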
 
Ahamed Ali
Greenhorn
Posts: 19
Thanks to everyone who offered suggestions on this headache.

Mr. D Rog, could you please publish your CSV code as open source? I am very interested in looking at the code and using it for help.

Thanks in advance; I hope to see the CSV code released as open source.


Ahamed
 
bparanj
Greenhorn
Posts: 27
Parsing CSV files in 5 minutes or less!!!

CsvJdbc - an Open Source JDBC driver for CSV files Rocks!

I was looking for a utility that can read and parse CSV files, and the open source utility CsvJdbc was perfect for my requirements. It parsed the given CSV file with no problem. Just include csvjdbc.jar in the classpath and start using the API. The command-line argument is the directory that holds the CSV files, for example C:\test (the doubled backslash in C:\\test is needed only when the path is written as a Java string literal, where \ is the escape character). The CSV file can have any name, such as anything.csv, as long as it is stored in that directory. The version I used is 0.10.
The documentation says:
The driver also now supports scrollable result sets.
The following example code shows how these are used.
...

Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + args[0], props);

Statement stmt = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, 0);

ResultSet results = stmt.executeQuery("SELECT ID,NAME FROM sample");

results.next();
...
results.first();
...
results.previous();
...
results.relative(2);
...
results.absolute(2);
...
But it looks like this is not implemented yet. I got the error message: Oops-> java.lang.UnsupportedOperationException: ResultSet.relative() unsupported

Here is the modified example program that I am using for my purposes:

/*
 * Created on Dec 16, 2004
 * @author BXParanj
 */
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RegistrationParser {
    private String userName;
    private String password;

    public RegistrationParser(String fileName) {

    }

    public static void main(String[] args) {

        String index = "fcgp001";
        try {
            // load the driver into memory
            Class.forName("org.relique.jdbc.csv.CsvDriver");

            // create a connection. The first command line parameter is assumed to
            // be the directory in which the .csv files are held
            Connection conn = DriverManager.getConnection("jdbc:relique:csv:" + args[0]);

            // create a Statement object to execute the query with
            Statement stmt = conn.createStatement();

            // Select the EBK_USERNAME and EBK_PASSWORD columns from the lead.csv file.
            // The value compared against is a string, so it is wrapped in single quotes.
            String sqlQuery = "SELECT EBK_USERNAME, EBK_PASSWORD FROM lead WHERE EBK_USERNAME = '" + index + "'";
            ResultSet results = stmt.executeQuery(sqlQuery);

            // dump out the results
            while (results.next()) {
                System.out.println("EBK_USERNAME= " + results.getString("EBK_USERNAME")
                        + " EBK_PASSWORD= " + results.getString("EBK_PASSWORD"));
            }

            // clean up
            results.close();
            stmt.close();
            conn.close();
        } catch (Exception e) {
            System.out.println("Oops-> " + e);
        }
    }
}
 
bparanj
Greenhorn
Posts: 27
Once you have parsed the CSV, you can convert the Java objects to XML using the open source Castor library or any other tool that binds Java objects to XML.
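
Castor is one route. For comparison, here is a minimal sketch that skips the binding step and writes already parsed rows to XML with only the JDK's DOM and Transformer classes, so it is not the Castor approach. The class name RowsToXml, the element names "leads" and "lead", and the sample values are made up for illustration; the column names are taken from the example program above.

```java
import java.io.StringWriter;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RowsToXml {

    // Builds one row element per row, with one child element per column,
    // and returns the serialized document as a string.
    static String toXml(String rootName, String rowName, String[] columns, List<String[]> rows)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement(rootName);
        doc.appendChild(root);
        for (String[] row : rows) {
            Element rowElement = doc.createElement(rowName);
            // Ignore surplus values or surplus columns rather than failing.
            for (int i = 0; i < columns.length && i < row.length; i++) {
                Element cell = doc.createElement(columns[i]);
                cell.setTextContent(row[i]); // special characters are escaped on output
                rowElement.appendChild(cell);
            }
            root.appendChild(rowElement);
        }
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String[] columns = {"EBK_USERNAME", "EBK_PASSWORD"};
        // Made-up sample row; real rows would come from the CSV parser.
        List<String[]> rows = Collections.singletonList(new String[] {"fcgp001", "secret"});
        System.out.println(toXml("leads", "lead", columns, rows));
    }
}
```

The column names become element names here, so they must be valid XML names (no spaces, no leading digits), the same restriction noted in the first post.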
 
Ahamed Ali
Greenhorn
Posts: 19
Hi,

Thanks for your message.

At the moment I want to parse a CSV file with SAX and then transform it to XML output: simply CSV to XML using SAX. I will probably store the CSV data in a database later.

I used StringTokenizer, but it does not give good results for complex CSV files. I have also gone through the open source CSV parser Ostill... for Java. It just reads the complete CSV file and prints it to the console, but I don't know how to feed it into SAX so that an event is fired for each line of the CSV file.

I hope SAX is the best choice for my requirements. Could someone please help me with this task? I need to complete it ASAP.

Thanks in advance.


Ahamed
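
One way to get SAX events out of a CSV file is to read it line by line and call the ContentHandler methods yourself. Here is a minimal sketch, assuming the first line holds the column names and using the JDK's TransformerHandler to turn the events into XML text. The class name CsvToSaxEvents, the element names, and the sample data are made up, and the field splitting is deliberately naive (a plain split on commas), so real data with quoted fields would need a quote-aware splitter in its place.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class CsvToSaxEvents {

    // Reads CSV text line by line and fires SAX events into a TransformerHandler,
    // which serializes them as XML. Nothing but the current line is held in memory.
    static String convert(BufferedReader csv, String rootName, String rowName) throws Exception {
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        AttributesImpl noAttributes = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "", rootName, noAttributes);

        String headerLine = csv.readLine();
        if (headerLine != null) {
            String[] columns = headerLine.split(",", -1); // naive split, keeps empty fields
            String line;
            while ((line = csv.readLine()) != null) {
                String[] values = line.split(",", -1);
                handler.startElement("", "", rowName, noAttributes);
                for (int i = 0; i < columns.length && i < values.length; i++) {
                    handler.startElement("", "", columns[i], noAttributes);
                    handler.characters(values[i].toCharArray(), 0, values[i].length());
                    handler.endElement("", "", columns[i]);
                }
                handler.endElement("", "", rowName);
            }
        }
        handler.endElement("", "", rootName);
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String csv = "name,town\nCarol,Worcester\n"; // made-up sample data
        System.out.println(convert(new BufferedReader(new StringReader(csv)), "people", "person"));
    }
}
```

To write to a file instead of a string, the StreamResult can wrap a FileWriter or a File. Any other ContentHandler could receive the same calls if the events should go somewhere other than an XML serializer.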
 
Ernest Friedman-Hill
author and iconoclast
Marshal
Posts: 24208
bparanj --

This is your last warning: fix your display name, or I'll lock your account.
 
D Rog
Ranch Hand
Posts: 472
Originally posted by bparanj:
Parsing CSV files in 5 minutes or less!!!

CsvJdbc - an Open Source JDBC driver for CSV files Rocks!

....
}

Right, it's a good solution. I used it a long time ago and was very satisfied. My problem was that the CSV stream sometimes wasn't in the file system, and I wanted to parse it on the fly without storing it in a temporary file. Do you know how it handles CR in fields, for example:
 