JavaRanch » Java Forums » Java » Beginning Java
Concordancer

Tom Warner
Greenhorn

Joined: Oct 19, 2003
Posts: 7
I am trying to make a simple concordancer to count words in a text file. It's supposed to keep track of how many times each word is used and output the results to a textArea. For example:
Word Count
---- -----
The 7
and 34
So far I've gotten it to break the text file apart into words, but I'm having difficulty putting the words into an array and checking whether a word has already been used. Any suggestions would be very helpful.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24166

One suggestion: write this as at least two classes. One to do the word counting, and one to present the results in a GUI. Smaller, more focused classes are easier to write, to understand, to test, and to debug.
An array isn't the best choice of data structure. A better one might be a Map, using the words for keys, which makes looking up the words and finding if they've been used quick and easy.
Breaking the string into words yourself, again, isn't the best choice, when you can use either java.util.StringTokenizer or java.io.StreamTokenizer to do it for you.
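As a rough illustration of the tokenizer suggestion, here is a minimal sketch using java.util.StringTokenizer. The class name WordSplitter, the method splitWords, the delimiter list, and the lower-casing step are all assumptions made for the example, not part of the assignment; the diamond syntax assumes Java 7 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, named only for this sketch.
class WordSplitter {
    // Assumed delimiters: whitespace plus some common punctuation.
    private static final String DELIMITERS = " \t\n\r\f.,;:!?\"()";

    // Splits text into lower-cased words so "The" and "the" count as one word.
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, DELIMITERS);
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken().toLowerCase());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [the, cat, and, the, hat]
        System.out.println(splitWords("The cat and the hat."));
    }
}
```

Whether to fold case or strip punctuation depends on what the assignment counts as a "word", so treat those choices as placeholders.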
If this is an assignment, where you are required to do this a certain way, then let us know; otherwise, I can give you more advice regarding the above.


Tom Warner
Greenhorn

Joined: Oct 19, 2003
Posts: 7
This is an assignment, and I think it can be done any way I want, but I've never used maps and my instructor suggested doing it the way I was. If maps would be easier than arrays, I'll try it that way. I'm also supposed to use a bubble sort afterwards to list the words starting with the most frequently occurring; would that be easy to do with a map? Thank you for your suggestions, Ernest.
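On the bubble sort question, one possible approach (a sketch, not necessarily what the instructor has in mind) is to copy the map's entries into a list and bubble sort that list by count, largest first. The class name CountSorter and the method sortByCountDescending are made up for this example, and the sample counts are the ones from the first post.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named only for this sketch.
class CountSorter {
    // Copies the entries into a list and bubble sorts them by count, descending.
    static List<Map.Entry<String, Integer>> sortByCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        boolean swapped = true;
        for (int pass = 0; pass < entries.size() - 1 && swapped; pass++) {
            swapped = false;
            // After each pass the smallest remaining count has sunk to the end.
            for (int i = 0; i < entries.size() - 1 - pass; i++) {
                if (entries.get(i).getValue() < entries.get(i + 1).getValue()) {
                    Map.Entry<String, Integer> tmp = entries.get(i);
                    entries.set(i, entries.get(i + 1));
                    entries.set(i + 1, tmp);
                    swapped = true;
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 7);
        counts.put("and", 34);
        // prints "and 34" then "the 7"
        for (Map.Entry<String, Integer> entry : sortByCountDescending(counts)) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

The map itself stays unsorted; only the copied list is reordered, which keeps the counting and the sorting independent of each other.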
Tom Warner
Greenhorn

Joined: Oct 19, 2003
Posts: 7
I tried using a map but I can't get one set up properly. I checked the API at Sun's site but I still get errors. Can someone point me in the right direction?
Tom Blough
Ranch Hand

Joined: Jul 31, 2003
Posts: 263
Tom,
Map is an interface, so you can't instantiate it directly. Instead, try using a Hashtable, which implements Map.
A Hashtable maps keys to values. The key would be the word, and the value would be the count of the occurrences of the word.
Use a StringTokenizer to break your text into words. Then, for each word, check whether it currently exists in the hashtable. If it does, get its current value, increment it, and save it back in the table.
If it does not exist, create a new hashtable entry with the value set to 1.
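The steps above might look roughly like this. It is only a sketch: the class name WordCounter, the method countWords, and the lower-casing of each word are assumptions for the example, and the generic types and diamond syntax assume Java 7 or later.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper class, named only for this sketch.
class WordCounter {
    // Counts how often each whitespace-separated word occurs in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new Hashtable<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken().toLowerCase();
            if (counts.containsKey(word)) {
                // Word seen before: read the count, add one, store it back.
                counts.put(word, counts.get(word) + 1);
            } else {
                // First occurrence: create the entry with a count of 1.
                counts.put(word, 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("the cat and the hat");
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

A Hashtable does not keep its entries in any particular order, so the printed order can vary; sorting for display would be a separate step.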

