
Parsing

 
Chris Cairns
Ranch Hand
Posts: 48
Any help would be appreciated.
I have several rows of data I'm trying to parse and then insert into a database.

My approach has been this. I read the data, char by char, from the data file into a StringBuffer and convert that to a String, so I have all the text data as one huge String object. First, I split the String into an array of rows. (Each row is tab delimited.) Second, I split each row into an array of fields. (All fields are tab delimited.) The index of each element in the fields array corresponds exactly to its column position, so I simply assign each position to a variable that I will later insert into the database.

Here's the thing. In the text data, not all field values exist. When I assign a field element to a variable in that case, nothing is assigned. I tried to print out the bytes to determine what character is there, but no bytes print out at all. I even checked for null. I assume this has something to do with the split. I can't use a condition to determine whether or not the field has a value, and I'm stuck because I won't be able to insert nulls. Any suggestions, or do I need to clarify more?
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Looks like split gives you an empty string when you have two delimiters together. Exactly like
String myString = "";
You should be able to insert empty strings into char fields in your database. Of course, you'll need special checks to convert them to numbers and other types.
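To make the empty-string point concrete, here is a minimal sketch. The class and helper names (SplitDemo, fields, emptyToNull, parseIntOrNull) are made up for illustration and the sample row is invented. One detail worth knowing: String.split(regex) silently drops trailing empty strings, so a row whose last columns are empty comes back shorter than expected unless you pass a negative limit.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits one tab-delimited row. The limit of -1 keeps trailing empty
    // fields, which the one-argument split(regex) would discard.
    static String[] fields(String row) {
        return row.split("\t", -1);
    }

    // Maps an empty field to null, for columns where a missing value
    // should be stored as NULL rather than as an empty string.
    static String emptyToNull(String field) {
        return field.isEmpty() ? null : field;
    }

    // Parses an optional integer field; empty or non-numeric text yields null.
    static Integer parseIntOrNull(String field) {
        String trimmed = field.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        try {
            return Integer.valueOf(trimmed);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String row = "Smith\t\t42\t"; // invented sample: 2nd and 4th fields are empty

        System.out.println(row.split("\t").length); // 3: trailing empty field dropped
        System.out.println(fields(row).length);     // 4: all columns present

        String[] f = fields(row);
        System.out.println(Arrays.toString(f));
        System.out.println(emptyToNull(f[1]));      // null
        System.out.println(parseIntOrNull(f[2]));   // 42
    }
}
```

Note that an empty field is a real String of length zero, which is why a null check never fires and no bytes print. Test it with isEmpty() or length() == 0 instead.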
A few thoughts on your algorithm. Can you switch to a buffered reader and read one line at a time? As it is, you read the whole file into a StringBuffer, then copy that to a String, then copy that to an array. Speed may degrade as the file grows, and a really big file could run you out of memory.
Besides, "readLine()" communicates your purpose to a human reader more easily than "read all the bytes into a buffer, convert to string, split into an array of lines, process one line at a time".
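A rough sketch of the line-at-a-time approach, assuming tab-delimited rows. LineReaderDemo and processRows are hypothetical names, and the StringReader in main stands in for a FileReader on your real data file. Each row is handed to a callback as soon as it is read, so only one line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.Consumer;

class LineReaderDemo {
    // Reads the source one line at a time, splits each non-blank line on
    // tabs (keeping trailing empty fields), and passes the fields to the
    // handler. Returns the number of rows handled.
    static int processRows(Reader source, Consumer<String[]> handler) {
        int count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines, e.g. one at the end of the file
                }
                handler.accept(line.split("\t", -1));
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Invented sample data; in real use pass new FileReader(yourFile).
        Reader data = new StringReader("Smith\t\t42\nJones\tNY\t\n");
        int rows = processRows(data, f -> System.out.println(Arrays.toString(f)));
        System.out.println(rows + " rows");
    }
}
```

The handler is where the database insert would go, so each row is dealt with inside the loop rather than after it.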
Hope that helps!
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Your code doesn't compile - that's probably a clue of some sort.
Are you interested in all the rows, or just the last one? The fact that you define all the variables outside any loop suggests that you plan to use these values later, outside the loop. At that point, they retain the last values they had inside the loop. It wouldn't surprise me if there's an extra \n at the end, and you're just looking at a final line of blanks.
If you are interested in values other than the last row, you'll probably want to put some code inside the loop, right? And why are all those variables declared outside the loop?
Insert a few checks and debug print statements as you go through the rows. E.g.

This sort of thing can help you figure out what's actually happening.
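As one possible illustration of such checks (a sketch only, not the snippet from the original post; DebugRows and describe are made-up names and the rows are invented), you could print the field count and every field in quotes for each row, so that empty fields and blank rows become visible:

```java
class DebugRows {
    // Builds a one-line summary of a row: its index, how many fields the
    // split produced, and each field wrapped in quotes so that empty
    // strings show up as ''.
    static String describe(int rowIndex, String row) {
        String[] fields = row.split("\t", -1);
        StringBuilder sb = new StringBuilder();
        sb.append("row ").append(rowIndex)
          .append(": ").append(fields.length).append(" fields");
        for (int i = 0; i < fields.length; i++) {
            sb.append(" [").append(i).append("]='").append(fields[i]).append("'");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Invented sample rows; the last one is blank on purpose.
        String[] rows = { "Smith\t\t42", "Jones\tNY\t", "" };
        for (int r = 0; r < rows.length; r++) {
            System.out.println(describe(r, rows[r]));
        }
    }
}
```

A blank final row shows up as a single empty field, which makes it easy to tell apart from a row whose values are simply missing.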
 