I'm scanning a text file and creating a Word object for each word in it. I set a custom delimiter on my Scanner because my 'rules' for what counts as a word differ from the default. That part works, and I now have an ArrayList of Word objects.
My problem is with the next step. I need to rescan the file, and whenever I encounter one of the words from my list in the text file, I need to store the line number and paragraph number (the occurrence of the word) in the Word object. I have a method to do that.
More specifically, I can't figure out how to set up my counters for the line number and paragraph number. Both start at 1, but the line number needs to reset to 1 each time I reach a new paragraph. Paragraphs are separated by one or more blank lines. Also, the text file is UNIX-format.
Here is my code which first scans in all the words and adds them to the list, then removes the duplicates:
Here is my flawed code (where I try to add the paragraph-line pairs to each word):
I think doing two separate scans and trying to match the words back to line numbers on the second pass is the wrong approach and is prone to error. Record the paragraph and line information when you first scan the file. Read it with Scanner.nextLine() so that you always know which line you're on, then run a second Scanner over that line's text to pick out the words.
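A minimal sketch of that single-pass idea, under some assumptions: the class name SinglePassIndexer, the method index, the "paragraph:line" string format, and the delimiter "[^A-Za-z]+" are all made up for illustration, since the original Word class and delimiter rules aren't shown. Swap in your own delimiter and store the pair in your Word object instead of a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper, not the asker's code.
class SinglePassIndexer {

    // Maps each word to the "paragraph:line" positions where it occurs.
    static Map<String, List<String>> index(String text) {
        Map<String, List<String>> occurrences = new LinkedHashMap<>();
        int paragraph = 1;
        int line = 0;                // incremented before use, so the first line is 1
        boolean inParagraph = false; // true once the current paragraph has a non-blank line

        try (Scanner lines = new Scanner(text)) {
            while (lines.hasNextLine()) {
                String current = lines.nextLine();
                if (current.trim().isEmpty()) {
                    // A run of blank lines closes the paragraph only once.
                    if (inParagraph) {
                        paragraph++;
                        line = 0;
                        inParagraph = false;
                    }
                    continue;
                }
                inParagraph = true;
                line++;
                // Second scanner works on this line only.
                try (Scanner words = new Scanner(current)) {
                    words.useDelimiter("[^A-Za-z]+"); // assumed word rule
                    while (words.hasNext()) {
                        String word = words.next().toLowerCase();
                        occurrences
                            .computeIfAbsent(word, k -> new ArrayList<>())
                            .add(paragraph + ":" + line);
                    }
                }
            }
        }
        return occurrences;
    }
}
```

To read from a file rather than a String, construct the outer Scanner from a java.io.File; the rest stays the same.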
Also, an easier way of getting unique values is to stick them in a HashSet rather than an ArrayList. Just make sure you override both hashCode() and equals() in your Word class so that two equal words always produce the same hash code; otherwise HashSet can't recognize duplicates.
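Something like the following shows the equals()/hashCode() pairing. This Word class is only a stand-in, since the real one isn't shown; the field name text is an assumption. If your Word also holds a list of occurrences, keep that list out of both methods so that equality depends on the word's text alone.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for the asker's Word class (assumed shape).
class Word {
    private final String text;

    Word(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Word)) return false;
        return text.equals(((Word) other).text);
    }

    @Override
    public int hashCode() {
        // Must agree with equals(): equal text gives an equal hash code.
        return text.hashCode();
    }

    public static void main(String[] args) {
        Set<Word> unique = new HashSet<>();
        unique.add(new Word("cat"));
        unique.add(new Word("cat")); // duplicate, ignored by the set
        unique.add(new Word("dog"));
        System.out.println(unique.size()); // prints 2
    }
}
```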
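What about scanning line by line with the nextLine() method? Then you can count the lines. Split each line into individual words with the String#split method. Whenever you get a line whose length (after trimming) is 0 (or use the String#isEmpty method), you increment your paragraph count and reset your line count. Reset it to 0 if you increment it before processing each non-blank line, so the first line of the new paragraph comes out as 1. Since paragraphs can be separated by more than one blank line, only increment the paragraph count on the first blank line of a run, not on every blank line.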
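Roughly, the counting could look like this. It's a sketch only: LineCounter and label are invented names, the "paragraph:line word" output is just for demonstration, and splitting on whitespace ("\\s+") is an assumption that you'd replace with your own word rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class for the counter logic.
class LineCounter {

    // Returns one "paragraph:line word" entry per word in the text.
    static List<String> label(String text) {
        List<String> out = new ArrayList<>();
        int paragraph = 1;
        int line = 0;
        // Start as "blank" so leading blank lines don't open paragraph 2.
        boolean previousBlank = true;

        Scanner in = new Scanner(text);
        while (in.hasNextLine()) {
            String trimmed = in.nextLine().trim();
            if (trimmed.isEmpty()) {
                // Only the first blank line of a run starts a new paragraph.
                if (!previousBlank) {
                    paragraph++;
                    line = 0;
                }
                previousBlank = true;
                continue;
            }
            previousBlank = false;
            line++; // 0 becomes 1 on the first line of each paragraph
            for (String word : trimmed.split("\\s+")) {
                out.add(paragraph + ":" + line + " " + word);
            }
        }
        in.close();
        return out;
    }
}
```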