aspose file tools*
The moose likes Java in General and the fly likes Ideal Collection implementation to be used for counting the occurance of a word in a file. Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Spring in Action this week in the Spring forum!
JavaRanch » Java Forums » Java » Java in General
Bookmark "Ideal Collection implementation to be used for counting the occurance of a word in a file." Watch "Ideal Collection implementation to be used for counting the occurance of a word in a file." New topic
Author

Ideal Collection implementation to be used for counting the occurance of a word in a file.

Rajkamal Pillai
Ranch Hand

Joined: Mar 02, 2005
Posts: 443
    
    1

Hello,

I was asked the following question during the phase of an interview.

There is a text file which stores different words. I need to find the first three words with the most number of occurrences. I first thought that I would use a map with each word as the key and the number of occurrences as the value. If any word was already present in the map I would increment the value (count). Then I realized I would require an iteration through the map keys to find the keys with the highest value associated with it. Further if I stored them in a SortedMap implementation (e.g. TreeMap) how would I be able to sort them on the values and not the keys?

If I were to have a Comparator implementation which sorts a Map based on the values instead of the keys and I pass this Comparator while creating the new instance of the TreeMap, would the Map be sorted on the values?

Cheers,
Raj.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19718
    
  20

No it wouldn't. In maps the keys cannot change (in regards to equals/hashCode or compareTo for TreeMap). In your example the same applies to the values. If you change a value, the result of compare will change and the order will effectively change. But because the internal structure is not updated objects can never be found anymore.

You have two distinct problems here:
1) collect the counts
2) collect the top three

You already have part 1 finished, so all you need is part two. You could use the iteration, or you could copy all Map.Entry objects into a new SortedSet. Then take the first three elements. Take care though - what will you do if two words have the same number of occurrences? Your comparison should probably also use the word itself, if the two words occur equally often.


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
How To Ask Questions How To Answer Questions
 
jQuery in Action, 2nd edition
 
subject: Ideal Collection implementation to be used for counting the occurance of a word in a file.