Scanning a text file

Michael Wassack
Greenhorn

Joined: Apr 28, 2011
Posts: 2
Hello,

I'm scanning a word file and creating Word objects out of each word in the file. I added a delimiter to my scanner because the 'rules' for what is considered to be a word are different. I got it to work and I have an arraylist of Word objects.

My problem has to do with what I need to do next. I need to rescan the file, and whenever I encounter one of the words from my list of words in the text file, I need to store the line number and paragraph number (the occurrence of the word) in the Word object. I have a method to do that.

To be more particular, I can't figure out how to set up my counts to count the line number and paragraph number. Both are meant to start at 1, but the line number needs to reset back to 1 each time I get to a new paragraph. Paragraphs are separated by one or more blank lines. Also, the text file is UNIX-format.

Here is my code which first scans in all the words and adds them to the list, then removes the duplicates:



Here is my flawed code (where I try to add the paragraph-line pairs to each word):


Any help with this would be greatly appreciated!

Thank you!
Luigi Plinge
Ranch Hand

Joined: Jan 06, 2011
Posts: 441

I think doing 2 separate scans and trying to match the words back to the line numbers on the second run through is the wrong approach, and is prone to error. You need to add the paragraph and line information in when you first scan it. Do this by using Scanner.nextLine() so that you know which line you're on, then running another scanner to match words on that line.

Also, an easier way of getting unique values is by sticking your values in a HashSet rather than an ArrayList. Just make sure you're overriding hashCode() and equals() in your Word class, so that the hashCode of 2 equal words is the same, so that HashSet can recognize duplicates.
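The HashSet suggestion above can be sketched roughly as follows. The poster's Word class is not shown in the thread, so the `Word` class here is a hypothetical stand-in with a single `text` field, and `UniqueWordsDemo` and `unique` are names invented for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the poster's Word class (not shown in the thread).
final class Word {
    private final String text;

    Word(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }

    // Two Word objects are equal when they wrap the same text.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Word)) {
            return false;
        }
        return text.equals(((Word) other).text);
    }

    // Equal words must return the same hash code, so derive it from the same field.
    @Override
    public int hashCode() {
        return text.hashCode();
    }
}

class UniqueWordsDemo {
    // Collects tokens into a set; add() silently ignores duplicates.
    static Set<Word> unique(List<String> tokens) {
        Set<Word> words = new HashSet<>();
        for (String token : tokens) {
            words.add(new Word(token));
        }
        return words;
    }

    public static void main(String[] args) {
        Set<Word> words = unique(Arrays.asList("moose", "fly", "moose"));
        System.out.println(words.size()); // prints 2
    }
}
```

A HashSet does not preserve insertion order; if the order in which words first appear matters, java.util.LinkedHashSet is a drop-in alternative.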
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 32708
Welcome to the Ranch

What about scanning line by line with the nextLine() method? Then you can count the lines. Split each line into individual words with the String#split method. Whenever you get a line whose length (after trimming) is 0 (or use the String#isEmpty method), you increment your paragraph count and reset your line count, probably to 0. Since paragraphs can be separated by more than one blank line, only increment on the first blank line of a run.
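A minimal sketch of that line-by-line approach, under some assumptions: words are split on whitespace (the poster's custom delimiter is not shown), the input is passed in as a String rather than read from a file, and `ParagraphLineCounter` and `locate` are hypothetical names. Each hit is recorded as "word paragraph:line" instead of being stored in a Word object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ParagraphLineCounter {

    // Returns one "word paragraph:line" entry per word, both counts starting at 1.
    static List<String> locate(String text) {
        List<String> hits = new ArrayList<>();
        int paragraph = 1;
        int line = 0;                // incremented before use, so the first line is 1
        boolean inParagraph = false; // true once the current paragraph has text

        try (Scanner in = new Scanner(text)) {
            while (in.hasNextLine()) {
                String current = in.nextLine().trim();
                if (current.isEmpty()) {
                    // Only the first blank line after some text ends a paragraph,
                    // so a run of blank lines counts as a single separator.
                    if (inParagraph) {
                        paragraph++;
                        line = 0;
                        inParagraph = false;
                    }
                    continue;
                }
                inParagraph = true;
                line++;
                // Assumption: whitespace-separated words.
                for (String word : current.split("\\s+")) {
                    hits.add(word + " " + paragraph + ":" + line);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String hit : locate("a b\nc\n\n\nd\n")) {
            System.out.println(hit);
        }
    }
}
```

For a real file, `new Scanner(new java.io.File(path))` could replace `new Scanner(text)`, and the string concatenation would become a call to whatever method the Word class uses to store a paragraph-line pair.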
Michael Wassack
Greenhorn

Joined: Apr 28, 2011
Posts: 2
Thank you both! I'll have at it.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12269
Note that the java.io.LineNumberReader could track line numbers for you while providing a readLine method.
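As a rough illustration of LineNumberReader, the sketch below reads from a StringReader so it is self-contained; `LineNumberReaderDemo` and `numbered` are hypothetical names. After each successful readLine() call, getLineNumber() returns the number of lines read so far, which is the 1-based number of the line just returned.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineNumberReaderDemo {

    // Prefixes every line of the text with its line number.
    static List<String> numbered(String text) {
        List<String> out = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // The counter starts at 0 and is incremented by each readLine().
                out.add(reader.getLineNumber() + ": " + line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String entry : numbered("first\n\nthird")) {
            System.out.println(entry);
        }
    }
}
```

The counter runs across the whole input, so for per-paragraph line numbers it would still need resetting at each paragraph break, for example with setLineNumber(0).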

Bill

Java Resources at www.wbrogden.com
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 32708
I never knew about a line number reader. Thank you.