| Author |
Parsing text file into 3 columns
|
Vatsa dude
Greenhorn
Joined: Apr 29, 2009
Posts: 22
|
|
Hi,
I need to parse a text file (sample pasted below) and extract its 3 columns. I am using the Scanner class, but I cannot get the "useDelimiter" method to handle the spaces between the columns.
Sample file
64.105.4.90 mail.virtuosoworks.com. ptr
64.105.4.178 mail.imaamd.org. ptr
64.105.4.186 smtp.vernonlaw.com. ptr
64.105.5.25 64-105-5-25.adsl.lbdsl.net. ptr
64.105.5.26 stitch.chipworks.net. ptr
64.105.5.27 studley.chipworks.net. ptr
64.105.5.28 heman.chipworks.net. ptr
64.105.5.29 xena.chipworks.net. ptr
64.105.9.133 MAIL.LINCOLNINDUSTRIAL.COM. ptr
64.105.9.137 DNS1.LINCOLNINDUSTRIAL.COM. ptr
My code extracts the 1st column (IP address) but does not find the 2nd and 3rd columns. Any help is appreciated. "ptr" is the 3rd column; it is sometimes one space after the 2nd column and sometimes several spaces after it. This file is auto-generated by a mysterious script.
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16815
|
|
The default delimiter is "one or more whitespace characters", so there is no need to set the delimiter; the default is already correct. In fact, the delimiter that you set is exactly one space, and since you mention that there may be more than one space between columns, it is actually incorrect.
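To illustrate the point about the default delimiter, here is a minimal sketch. It is not the original poster's code; the class name ScannerColumns and the inline sample string are made up for the example, and it assumes every record has exactly three tokens (next() throws NoSuchElementException otherwise).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class, named only for this example.
class ScannerColumns {

    // Reads records of three whitespace-separated tokens.
    // No useDelimiter call: the default delimiter already matches
    // runs of one or more whitespace characters, including newlines.
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                String ip = in.next();    // 1st column
                String host = in.next();  // 2nd column
                String type = in.next();  // 3rd column, e.g. "ptr"
                rows.add(new String[] { ip, host, type });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Second line deliberately has several spaces before "ptr".
        String sample = "64.105.4.90 mail.virtuosoworks.com. ptr\n"
                      + "64.105.5.25 64-105-5-25.adsl.lbdsl.net.     ptr\n";
        for (String[] row : parse(sample)) {
            System.out.println(row[0] + " | " + row[1] + " | " + row[2]);
        }
    }
}
```

For a real file, the same loop should work with new Scanner(new File(...)) in place of the string.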
Henry
|
Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
|
aLine.split("\\s+") is also an option.
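A minimal sketch of the split approach, assuming each line has already been read into a String called aLine. The class name SplitColumns is made up for the example, and the trim() call is an addition to avoid an empty first element when a line starts with whitespace.

```java
// Hypothetical helper class, named only for this example.
class SplitColumns {

    // Splits one line on runs of whitespace.
    // trim() prevents a leading empty string if the line is indented.
    static String[] parseLine(String aLine) {
        return aLine.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Several spaces before "ptr", as in the auto-generated file.
        String aLine = "64.105.9.133 MAIL.LINCOLNINDUSTRIAL.COM.    ptr";
        String[] columns = parseLine(aLine);
        System.out.println(columns[0] + " | " + columns[1] + " | " + columns[2]);
    }
}
```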
|
SCJP 1.4 - SCJP 6 - SCWCD 5
|
 |
Adeel Ansari
Ranch Hand
Joined: Aug 15, 2004
Posts: 2874
|
|
Rob Prime wrote: aLine.split("\\s+") is also an option.
And this one will perform better.
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
I wouldn't dare say that without some figures to back it up. Both Scanner and String.split use a java.util.regex.Pattern object, and both pieces of code create one for each line.
It is likely, though, that String.split is more efficient, since Scanner uses several Pattern objects.
You are right that neither is really efficient. With the String.split approach, though, the Pattern can be pulled out of the loop:
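A minimal sketch of that idea, assuming the lines have already been read from the file. The class name PrecompiledSplit and the WHITESPACE constant are made up for the example. Pattern.split does the same work as String.split, but the regex is compiled only once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
class PrecompiledSplit {

    // Compiled once, outside the loop, and reused for every line.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static List<String[]> parseLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String aLine : lines) {
            // trim() avoids an empty first element on indented lines.
            rows.add(WHITESPACE.split(aLine.trim()));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("64.105.5.26 stitch.chipworks.net. ptr");
        lines.add("64.105.5.27 studley.chipworks.net.      ptr");
        for (String[] row : parseLines(lines)) {
            System.out.println(row[0] + " | " + row[1] + " | " + row[2]);
        }
    }
}
```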
I've searched the Scanner API, but there is no way to reset a Scanner with new input, so a new Scanner is needed for every line. Therefore, the String.split approach will be the more efficient one.
|
 |
Adeel Ansari
Ranch Hand
Joined: Aug 15, 2004
Posts: 2874
|
|
Rob Prime wrote: I wouldn't dare say that without some figures to back it up...
Actually, I benchmarked both a few months ago using file I/O: reading lines and then splitting the strings on some token. I tried both Scanner and String's split. The latter seemed faster.
|
 |
Rob Spoor
Sheriff
Joined: Oct 27, 2005
Posts: 19232
|
|
Well, then you have some figures to back up your claim.
|
 |
 |