Parsing text file into 3 columns

Vatsa dude
Greenhorn

Joined: Apr 29, 2009
Posts: 22
Hi,
I need to parse a text file (sample pasted below) and extract its 3 columns. I am using the Scanner class, but I cannot seem to use the "useDelimiter" method to skip the spaces between the columns.

Sample file

64.105.4.90 mail.virtuosoworks.com. ptr
64.105.4.178 mail.imaamd.org. ptr
64.105.4.186 smtp.vernonlaw.com. ptr
64.105.5.25 64-105-5-25.adsl.lbdsl.net. ptr
64.105.5.26 stitch.chipworks.net. ptr
64.105.5.27 studley.chipworks.net. ptr
64.105.5.28 heman.chipworks.net. ptr
64.105.5.29 xena.chipworks.net. ptr
64.105.9.133 MAIL.LINCOLNINDUSTRIAL.COM. ptr
64.105.9.137 DNS1.LINCOLNINDUSTRIAL.COM. ptr

The code below extracts the 1st column (IP address), but does not find the 2nd and 3rd columns. Any help is appreciated. ptr is the 3rd column; it is sometimes 1 space after the 2nd column and sometimes multiple spaces after it. This file is auto-generated by a mysterious script.

Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 16815



The default delimiter is "one or more whitespace characters", so there is no need to set the delimiter; the default is already correct. In fact, the delimiter that you set is exactly one space, which is actually incorrect, since you mention that there may be more than one space.
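
A minimal sketch of reading the three columns with the default delimiter, assuming Java 7 or later and that every line holds exactly three tokens. The class name, method name, and inline sample are illustrative and not taken from the original code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerColumns {
    // Reads whitespace-separated triples (IP address, host name, record type).
    // Assumption: the token count is a multiple of 3; otherwise next() throws
    // NoSuchElementException.
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        // No useDelimiter() call: the default delimiter already matches
        // any run of whitespace, including line breaks.
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                String ip = in.next();
                String host = in.next();
                String type = in.next();
                rows.add(new String[] { ip, host, type });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String sample = "64.105.4.90 mail.virtuosoworks.com. ptr\n"
                      + "64.105.4.178 mail.imaamd.org.    ptr\n";
        for (String[] row : parse(sample)) {
            System.out.println(row[0] + " | " + row[1] + " | " + row[2]);
        }
    }
}
```

To read from a file instead, the same loop should work with new Scanner(new File(...)), which additionally requires handling FileNotFoundException.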

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19232

aLine.split("\\s+") is also an option.
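
A short sketch of that option, assuming aLine is one line already read from the file. The class and method names are illustrative; the trim() call is an added precaution, because split keeps an empty first element when the line starts with whitespace:

```java
class SplitColumns {
    // Splits one line on runs of whitespace ("\\s+" matches one or more
    // whitespace characters, so single and multiple spaces both work).
    static String[] columns(String aLine) {
        return aLine.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] parts = columns("64.105.9.133 MAIL.LINCOLNINDUSTRIAL.COM.    ptr");
        System.out.println(parts.length + " columns, last = " + parts[2]);
    }
}
```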


SCJP 1.4 - SCJP 6 - SCWCD 5
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Rob Prime wrote: aLine.split("\\s+") is also an option.


And this one will be more efficient in terms of performance.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19232

I wouldn't dare say that without some figures to back it up. Both Scanner and String.split use a java.util.regex.Pattern object, and both pieces of code create one for each line.
It is likely, though, that String.split is more efficient, since Scanner uses several Pattern objects.

You are right that neither is really efficient. With the String.split approach, though, the Pattern can be pulled out of the loop:
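
A minimal sketch of hoisting the Pattern, assuming the lines have already been read into a list. The class, method, and constant names are illustrative; Pattern.split(CharSequence) stands in for String.split here so the regex is compiled only once:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PrecompiledSplit {
    // Compiled once and reused for every line, instead of once per split call.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static List<String[]> parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            rows.add(WHITESPACE.split(trimmed));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "64.105.5.26 stitch.chipworks.net. ptr",
                "64.105.5.27 studley.chipworks.net.   ptr");
        for (String[] row : parse(lines)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```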
I've searched the Scanner API, but there is no way to reset a Scanner with new input. Therefore, the String.split approach will be the more efficient one.
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Rob Prime wrote: I wouldn't dare say that without some figures to back it up...


Actually, I benchmarked both a few months ago using file I/O: reading the file and then splitting strings on some token. I tried both Scanner and String's split. The latter seemed faster.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19232

Well, then you have some figures to back up your claim.
 