

Chris Cairns
Ranch Hand

Joined: Jan 31, 2003
Posts: 48
Any help would be appreciated.
I have several rows of data I'm trying to parse and then insert into a database.

My approach has been this. I read the data, char by char, from the data file into a StringBuffer and convert that to a String. Now I have all the text data as one huge String object. First, I split the String into an array of rows. (Each row is tab-delimited.) Second, I split each row into an array of fields. (All fields are tab-delimited.) The index of each element in the fields array corresponds exactly to its column position, so I simply assign that element to a variable which I will later insert into the database.

Here's the thing. In the text data, not all field values exist. When a value is missing and I assign that field element to a variable, nothing is assigned. I tried to print out the bytes to determine what character is there, but no bytes print out at all. I even checked for null. I assume this has something to do with the split. I can't use a condition to check whether or not the field has a value. I'm screwed because I won't be able to insert nulls. Any suggestions, or do I need to clarify more?
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Looks like split gives you an empty string when you have two delimiters together. Exactly like
String myString = "";
You should be able to insert empty strings into char fields in your database. Of course, you'll need special checks to convert them to numbers and other types.
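A minimal sketch of that behavior, not taken from the original posts. The class name `SplitFieldsSketch`, the helper `emptyToNull`, and the sample row are made up for illustration. It assumes the fields are split with `String.split`, and it also shows one detail worth knowing: the one-argument `split` drops trailing empty strings, while a negative limit keeps them, so the array index still lines up with the column position.

```java
import java.util.Arrays;

class SplitFieldsSketch {
    // Split one tab-delimited row. The -1 limit keeps trailing empty fields,
    // so fields[i] always corresponds to column i.
    static String[] splitFields(String row) {
        return row.split("\t", -1);
    }

    // Hypothetical helper: treat an empty field as a missing value (null),
    // for columns where the database should store NULL instead of "".
    static String emptyToNull(String field) {
        return field.isEmpty() ? null : field;
    }

    public static void main(String[] args) {
        // Sample row (an assumption): second and fourth columns are empty.
        String row = "Smith\t\t42\t";

        String[] fields = splitFields(row);
        System.out.println("with limit -1: " + fields.length + " fields " + Arrays.toString(fields));

        // The default split silently drops the trailing empty field.
        System.out.println("default split: " + row.split("\t").length + " fields");

        for (int i = 0; i < fields.length; i++) {
            System.out.println("column " + i + " -> " + emptyToNull(fields[i]));
        }
    }
}
```

A missing value comes back as a zero-length String, which is why there are no bytes to print and why a null check never fires. Testing `field.isEmpty()` (or `field.length() == 0` on older JDKs) is the condition to use.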
A few thoughts on your algorithm. Can you switch to a buffered reader and read one line at a time? As it is, you read the whole file into a StringBuffer, then copy that to a String, then copy that to an array. Speed may degrade as the file grows, and a really big file could run you out of memory.
Besides, "readLine()" communicates your purpose to a human reader more easily than "read all the bytes into a buffer, convert to string, split into an array of lines, process one line at a time".
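A rough sketch of the line-at-a-time approach, under a few assumptions: the class `LineByLineSketch`, the method `processRows`, and the handler callback are illustrative names, and the input is fed from a `StringReader` so the example runs without a data file. With a real file you would pass a `FileReader` instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.Consumer;

class LineByLineSketch {
    // Read one line at a time, split it into fields, and hand the fields to
    // the caller. Only the current line is held in memory.
    // Returns the number of non-blank rows processed.
    static int processRows(Reader source, Consumer<String[]> handler) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines, e.g. a stray newline at the end
                }
                handler.accept(line.split("\t", -1)); // -1 keeps trailing empty fields
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample data (an assumption): two rows with a blank line between them.
        Reader data = new StringReader("a\t\tc\n\nd\te\t\n");
        int rows = processRows(data, fields -> System.out.println(Arrays.toString(fields)));
        System.out.println(rows + " rows processed");
    }
}
```

The handler is where the database insert for each row would go, so nothing has to be kept around after the row is handled.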
Hope that helps!

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Jim Yingst

Joined: Jan 30, 2000
Posts: 18671
Your code doesn't compile - that's probably a clue of some sort.
Are you interested in all the rows, or just the last one? The fact that you define all the variables outside any loop suggests that you plan to use these values later, outside the loop. At that point, they retain the last values they had inside the loop. It wouldn't surprise me if there's an extra \n at the end, and you're just looking at a final line of blanks.
If you are interested in values other than the last row, you'll probably want to put some code inside the loop, right? And why are all those variables declared outside the loop?
Insert a few checks and debug print statements as you go through the rows. E.g.

This sort of thing can help you figure out what's actually happening.
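One possible shape for such checks, as a sketch only. The class `DebugRowsSketch`, the method `describe`, and the sample data are assumptions made for illustration and are not the code from the original reply. Each field is printed between quotes along with its index, so an empty string is visible instead of looking like nothing at all.

```java
class DebugRowsSketch {
    // Build a one-line summary of a row: the row number, the field count,
    // and every field quoted with its index.
    static String describe(int rowNumber, String[] fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("row ").append(rowNumber).append(": ")
          .append(fields.length).append(" fields");
        for (int i = 0; i < fields.length; i++) {
            sb.append(" [").append(i).append("]='").append(fields[i]).append("'");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample data (an assumption) with a newline at the very end.
        String data = "a\t\tc\nd\te\t\n";

        // The -1 limit keeps the empty "row" after the final newline,
        // which makes a trailing blank line show up in the output.
        String[] rows = data.split("\n", -1);
        for (int r = 0; r < rows.length; r++) {
            String[] fields = rows[r].split("\t", -1);
            System.out.println(describe(r, fields));
        }
    }
}
```

If the last line of output shows a row with a single empty field, the file ends with an extra newline, and variables assigned inside the loop will hold values from that blank row once the loop finishes.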

"I'm not back." - Bill Harding, Twister
subject: Parsing