Parsing and Insert

Chris Cairns
Ranch Hand

Joined: Jan 31, 2003
Posts: 48
Problem: I need a more precise solution for parsing and inserting data into a database.
I'm parsing data in a data file that has been exported from a database. Each field is tab delimited.


At first, I thought I could replace the tab character with a comma, which would allow me to easily insert the record into the database. But I found that some of the field values actually contain a comma. This meant that in some cases the number of fields in a record would be larger than the number of columns. The long way around this problem was to replace the tabs with a colon; I won't get into the details of this. My co-worker tells me the shorter method is to replace the tabs with ',' (quote, comma, quote) so the record is ready to be inserted immediately. However, my parsing is done char by char, and that replacement is a sequence of characters, i.e. a string, not a single char.

Any suggestions or are you totally lost by now?
Jamie Robertson
Ranch Hand

Joined: Jul 09, 2001
Posts: 1879

This is what I created my DelimReader class for. It reads each line of a text file in as an array of Strings. Anyways, here is the source code, not sure if the comments are up to date.
[code]import java.io.*;

/**
 * A BufferedReader that can also read a delimited line in as a String[] of fields.
 * This is useful when reading a delimited file into a program as a set of fields.
 * <code>
 * <pre>
 * // reads in a file and prints the fields of each line separated by a space
 * DelimReader ft = new DelimReader( "C:\\temp\\file.txt" );
 * String[] fields = ft.readLine( "\t" );
 *
 * while( fields != null )
 * {
 *     System.out.print( "\n" );
 *     for( int i = 0; i < fields.length; i++ )
 *     {
 *         System.out.print( " " + fields[i] );
 *     }
 *     fields = ft.readLine( "\t" );
 * }
 * </pre>
 * </code>
 */
public class DelimReader extends BufferedReader
{

    /**
     * Create a reader to parse delimited values as fields from
     * a file.
     *
     * @param inputFileName fully qualified name of the file to parse.
     * @throws IOException if an error occurs while opening the input file
     */
    public DelimReader( String inputFileName ) throws IOException
    {
        super( new FileReader( inputFileName ) );
    }

    /**
     * Get all the delimited fields from the next line of the file.
     * <p>
     * If there are no characters between two delimiters, the value
     * of the field is "". The value acts as a placeholder for the column.
     *
     * @param delimiter regular expression that separates the fields, e.g. "\t"
     * @return all the delimited values from the line, or null if at end-of-file.
     * @throws IOException if an error occurs while reading
     */
    public String[] readLine( String delimiter ) throws IOException
    {
        String[] tokens = null;
        String line = this.readLine();
        if ( line != null ) // null means end-of-file
        {
            // No trim() here: it would strip leading/trailing tabs and lose empty edge fields.
            // The -1 limit makes split() keep trailing empty fields as "" placeholders.
            tokens = line.split( delimiter, -1 );
        }
        return tokens;
    }
}
[/code]
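For the insert half of the question, one common way to avoid the comma problem entirely is to skip the text replacement and bind each parsed field as a parameter. Below is a minimal sketch of that idea, assuming a JDBC Connection is already open and that every column can be set as a string. The class name FieldInserter, its two methods, and the table name passed in are hypothetical, not something from the original post.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper: inserts one parsed record using bound parameters,
// so commas or quotes inside a field value never touch the SQL text.
class FieldInserter {

    /** Builds "INSERT INTO table VALUES (?, ?, ...)" with one placeholder per column. */
    static String buildInsertSql(String table, int columnCount) {
        // Assumption: the table name comes from trusted code, not from the data file.
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES (");
        for (int i = 0; i < columnCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('?');
        }
        return sql.append(')').toString();
    }

    /** Binds every field as a string parameter and executes the insert. */
    static void insertRow(Connection conn, String table, String[] fields) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(buildInsertSql(table, fields.length));
        try {
            for (int i = 0; i < fields.length; i++) {
                ps.setString(i + 1, fields[i]); // JDBC parameter indexes are 1-based
            }
            ps.executeUpdate();
        } finally {
            ps.close();
        }
    }
}
```

Each String[] returned by a tab-splitting reader could be passed straight to insertRow. Whether the driver converts string parameters for numeric or date columns depends on the database, so that part may need adjusting.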
Jamie Robertson
Ranch Hand

Joined: Jul 09, 2001
Posts: 1879

Oh yeah, you need to be using jdk 1.4 for the regex functionality. If you are using < 1.4 then it wouldn't be hard to use StringTokenizer or such.
Jamie
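A rough sketch of the StringTokenizer route, in case it helps. StringTokenizer normally skips empty fields, so this version asks it to return the delimiters as tokens and counts two delimiters in a row as an empty field. The class name TokenizerSplit and the method split are made up for illustration, and the generics would have to be dropped on a pre-1.5 JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: splits on single-character delimiters without regex,
// keeping empty fields as "" so the column positions stay aligned.
class TokenizerSplit {

    static String[] split(String line, String delims) {
        List<String> fields = new ArrayList<String>();
        // returnDelims = true: delimiters come back as one-character tokens.
        StringTokenizer st = new StringTokenizer(line, delims, true);
        boolean lastWasDelim = true; // the start of the line acts like a preceding delimiter
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            boolean isDelim = tok.length() == 1 && delims.indexOf(tok.charAt(0)) >= 0;
            if (isDelim) {
                if (lastWasDelim) {
                    fields.add(""); // nothing between two delimiters
                }
                lastWasDelim = true;
            } else {
                fields.add(tok);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            fields.add(""); // line ended with a delimiter, or was empty
        }
        return fields.toArray(new String[0]);
    }
}
```

Unlike String.split, the delimiter argument here is a set of characters, not a regular expression.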
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
A tangential tip: When making delimited strings, put the delimiter as the first character. Then the generator and parser do not have to agree on a delimiter ahead of time. All kinds of cool things follow. Frinstance you can nest strings with different delimiters.
I didn't invent this, but I have used it many times. I have an arrayToString method that pre-scans the array to find a character that is not in any array element and uses it for the delimiter.
Sadly this only works when you can influence the generator and the parser. Won't help you build a file for import into Excel or something you don't control.
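Here is a minimal sketch of the delimiter-first idea, not my actual code. Only the name arrayToString comes from the description above; the class DelimFirst, the helpers pickDelimiter and stringToArray, and the list of candidate delimiter characters are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the first character of the encoded string IS the delimiter,
// so the parser never needs to be told which delimiter the generator picked.
class DelimFirst {

    /** Pre-scans the items and returns a candidate character that none of them contains. */
    static char pickDelimiter(String[] items) {
        String candidates = ",;|\t~^#"; // assumed candidate list, extend as needed
        for (int i = 0; i < candidates.length(); i++) {
            char c = candidates.charAt(i);
            boolean used = false;
            for (String item : items) {
                if (item.indexOf(c) >= 0) {
                    used = true;
                    break;
                }
            }
            if (!used) {
                return c;
            }
        }
        throw new IllegalArgumentException("no unused delimiter among candidates");
    }

    /** Writes the delimiter before every item, so it is always the first character. */
    static String arrayToString(String[] items) {
        char d = pickDelimiter(items);
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(d).append(item);
        }
        return sb.toString();
    }

    /** Reads the delimiter from position 0 and splits the rest on it, keeping empty items. */
    static String[] stringToArray(String s) {
        if (s.length() == 0) {
            return new String[0];
        }
        char d = s.charAt(0);
        List<String> out = new ArrayList<String>();
        int start = 1;
        while (true) {
            int next = s.indexOf(d, start);
            if (next < 0) {
                out.add(s.substring(start));
                break;
            }
            out.add(s.substring(start, next));
            start = next + 1;
        }
        return out.toArray(new String[0]);
    }
}
```

With the items "a,b" and "c", the comma is taken, so the sketch falls through to the semicolon and produces ";a,b;c". The parser recovers both items without knowing in advance that a semicolon was used.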


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Chris Cairns
Ranch Hand

Joined: Jan 31, 2003
Posts: 48
Thanks! I'll try it out.