HashMap and Vector

Michael Duff
Greenhorn

Joined: Mar 13, 2010
Posts: 18
Hello

I am in the process of writing a program that analyzes text input. It should print each letter, the number of times it occurs, and its percentage of all letters in the text ("abba" would mean a and b have 50% each). I have decided to use a HashMap where the letter is the key and the value is a Vector holding the letter's count and its percentage (is that even possible in a single Vector?). But I'm kinda stuck in the implementation phase; I can't see what I'm doing wrong.

When I print out my HashMap it returns the letters (in scrambled order) and the Vector, which is the same for all letters (what I want is just the letter, its count and a percentage). Here is what I have so far:


[edit - i moved the comment on line 22 up one line to prevent one REALLY long line]
Rudradutt Joshi
Ranch Hand

Joined: Dec 06, 2008
Posts: 45

Hi,

I don't quite follow your point, but for correct results you should use a double to hold the length.

i.e.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38519
That doesn't look at all good. You really need to write down on a sheet of paper what you are going to do. It needs to be in words of one syllable. Then, from that you can change the algorithm into code.

You don't need to count entries if you insist on using a List, because a List has a very easy method of counting. You can find the details in the documentation.
Why are you using Vector in the first place? Use ArrayList.
Your generics look all confused to me; you are going on about <Character> and inserting Integers.
What does the line about setElementAt do? I find it incomprehensible.
Don't go mixing chars and number literals. Where do you get 26 from? Are you sure you have written it correctly? Why don't you write small a and small z? What about 97? Is that supposed to be small a? If so, write small a.
There are methods in the Character class which allow you to test whether something is a bona fide letter.
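
A minimal sketch of the kind of check the last point refers to, assuming the goal is to keep only letters and to treat upper and lower case alike. The class and method names (LetterFilter, lettersOnly) are made up for illustration; Character.isLetter and Character.toLowerCase are the standard library methods.

```java
class LetterFilter {
    // Keeps only the characters that Character.isLetter accepts,
    // lower-cased so that 'A' and 'a' count as the same letter.
    static String lettersOnly(String text) {
        StringBuilder sb = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (Character.isLetter(ch)) {
                sb.append(Character.toLowerCase(ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("Abba, 123!")); // prints "abba"
    }
}
```

If only the plain range a to z is wanted, a comparison such as ch >= 'a' && ch <= 'z' says the same thing as the magic numbers 97 and 26, but readably.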

I am afraid if it were me, I would go back to the sheet of paper and start from scratch.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38519
rudradutt joshi wrote: . . . for correct results you should use double as holder of length.

i.e.
Nonsense. It's an int. That is about the one bit that is correct. A double can introduce imprecision, and cause the loop to malfunction.

By the way: you can get an array from a String easily; it's all in the documentation.
Michael Duff
Greenhorn

Joined: Mar 13, 2010
Posts: 18
Okay, I have figured out a clearer version of my letter counter:


Now I would like to make the <Integer> in the HashMap into an ArrayList<Integer>, is that possible? If so, how would the incrementing of an existing character work?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38519
That looks a lot better. You can tell it's better just by gazing at it, even without reading the code, because the shape looks better.

You appear to have two copies of the char, one from the String directly and the other from the array. You don't need both.
I would also suggest this change

I am not sure I understand the bit about a List. How would a List help with counting?

There is another way to do it, with an int[]; you use the char to find the index.
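
A rough sketch of the int[] idea, assuming only the letters a to z matter and that case is ignored. The class and method names (ArrayLetterCounter, count) are hypothetical, not taken from the thread.

```java
class ArrayLetterCounter {
    // counts[0] holds the count for 'a', counts[25] the count for 'z'.
    static int[] count(String text) {
        int[] counts = new int['z' - 'a' + 1];
        for (char raw : text.toCharArray()) {
            char ch = Character.toLowerCase(raw);
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++; // the char itself selects the index
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = count("Abba");
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                System.out.println((char) ('a' + i) + ": " + counts[i]);
            }
        }
    }
}
```

Because the array is indexed in alphabetical order, the letters come out sorted, unlike the iteration order of a HashMap.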
Michael Duff
Greenhorn

Joined: Mar 13, 2010
Posts: 18
Campbell Ritchie wrote:
I am not sure I understand the bit about a List. How would a List help with counting?


The thing about a List is, in the end, I would like to have a structure where the HashMap has information about a certain character, how many times it has appeared in the input text and what percentage of the input text that letter represents. I'm just having a hard time figuring out how to do it.

Thanks for all the help so far.
fred rosenberger
lowercase baba
Bartender

Joined: Oct 02, 2003
Posts: 11257

You're going to have problems keeping the percentages up to date.

Let's say I start reading the string "ABCDE"

If I first insert the 'A', it will have a count of 1 and be 100%.

Then I insert the 'B'. It needs to have a count of 1. Its percentage will be 50% - but now I have to go back and update the percentage for 'A'.

Then I insert 'C'. It needs to have a count of 1. Its percentage should be 33%, and now I have to update A's and B's as well...

If you have a really long string, eventually you will have to update all 26 records each time you insert anything.


Generally speaking, I prefer to NOT store data that can be calculated off other data I'm already storing. Since I should know (or be able to keep track of) how many total characters I've entered, and I also can easily get the number of times 'Q' has appeared, it's trivial to calculate the percentage when I need it. It's also safer, because you don't have to worry about the data getting out of sync with itself.
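
One way this could look in code: store only the counts and a running total, and work out the percentage when somebody asks for it. This is a sketch under assumptions, not anyone's posted solution; the class name OnDemandPercent and its methods are invented here, and Map.merge and getOrDefault need Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

class OnDemandPercent {
    private final Map<Character, Integer> counts = new HashMap<>();
    private int total = 0;

    // Records one more occurrence of ch; no percentages are touched.
    void add(char ch) {
        counts.merge(ch, 1, Integer::sum);
        total++;
    }

    int count(char ch) {
        return counts.getOrDefault(ch, 0);
    }

    // Calculated from the stored counts each time, so it can never be stale.
    double percent(char ch) {
        if (total == 0) {
            return 0.0;
        }
        return 100.0 * count(ch) / total;
    }

    public static void main(String[] args) {
        OnDemandPercent stats = new OnDemandPercent();
        for (char ch : "ABCDE".toCharArray()) {
            stats.add(ch);
        }
        System.out.println(stats.percent('A')); // prints 20.0
    }
}
```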


There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors
kri shan
Ranch Hand

Joined: Apr 08, 2004
Posts: 1372
Create the HashMap with the letter as key and a counter as value.
Iterate over the character array and add the key and value to the HashMap if the character does not yet exist in it.
If the character already exists in the HashMap, just increment the counter value and put it back under the same key (character).

Finally, iterate over the HashMap's keys and values to find the percentages.
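
The steps above can be sketched as follows. The names (MapLetterCounter, count) are hypothetical, and using the text length as the denominator is an assumption that only holds if every character in the input is one you want to count.

```java
import java.util.HashMap;
import java.util.Map;

class MapLetterCounter {
    static Map<Character, Integer> count(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char ch : text.toCharArray()) {
            if (counts.containsKey(ch)) {
                counts.put(ch, counts.get(ch) + 1); // seen before: increment
            } else {
                counts.put(ch, 1);                  // first occurrence
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "abba";
        Map<Character, Integer> counts = count(text);
        // Percentages are worked out once, after all counting is finished.
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            double percent = 100.0 * e.getValue() / text.length();
            System.out.println(e.getKey() + ": " + e.getValue() + " (" + percent + "%)");
        }
    }
}
```

A HashMap makes no promise about iteration order; a TreeMap would give the letters in sorted order.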

Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38519
That's more or less what he has already done.

I think Fred's recommendations about counts are better. Just beware of the vagaries of integer division.
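
A small illustration of what can go wrong with integer division here. The class and method names are made up; the arithmetic is plain Java.

```java
class IntegerDivisionDemo {
    // count / total is done in int arithmetic first, so any count
    // smaller than total becomes 0 before the multiplication.
    static int truncated(int count, int total) {
        return count / total * 100;
    }

    // Multiplying first keeps the whole-number part of the percentage.
    static int multipliedFirst(int count, int total) {
        return count * 100 / total;
    }

    // The literal 100.0 makes the whole expression floating point.
    static double asDouble(int count, int total) {
        return 100.0 * count / total;
    }

    public static void main(String[] args) {
        System.out.println(truncated(2, 4));       // prints 0
        System.out.println(multipliedFirst(2, 4)); // prints 50
        System.out.println(asDouble(2, 4));        // prints 50.0
    }
}
```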
Michael Duff
Greenhorn

Joined: Mar 13, 2010
Posts: 18
I found out how to get the percentages of each letter:


Now, if I wanted to use those percentages for comparison, would it be better to store them somewhere? (NOTE: I only need to analyze a text once).
Rudradutt Joshi
Ranch Hand

Joined: Dec 06, 2008
Posts: 45

Campbell Ritchie wrote:
rudradutt joshi wrote: . . . for correct results you should use double as holder of length.

i.e.
Nonsense. It's an int. That is about the one bit that is correct. A double can introduce imprecision, and cause the loop to malfunction.

Thanks Sheriff for your lovely comments.
I had just provided a workaround so that the code below calculates the percentage correctly.


I do not understand the point about imprecision, since 1 = 1.000000000.
As I understand it, the only effect of converting
(((Integer) letterCount.elementAt(ch)) / textLength)
to double is to increase precision.

I think below snippet will be of some help to the owner of the thread



Regards,
Rudradutt
Michael Duff
Greenhorn

Joined: Mar 13, 2010
Posts: 18
Thank you for that code, I had something similar planned at one stage. But when I tried to implement that code I stumbled upon the problem that Fred Rosenberger mentioned earlier, i.e. the letters get updated all the time and the Map becomes a mess when you try to print it out.
Rudradutt Joshi
Ranch Hand

Joined: Dec 06, 2008
Posts: 45

I guess you will not encounter the problem Fred mentioned with the above approach, where the dataMap is iterated once more after string parsing completes, so it's a one-time operation at the end.

And you can do all the beautification for displaying data by overriding toString in the wrapper object, and go further by adding a method that prints the map the way you like.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38519
Rudradutt Joshi wrote: . . . I had just provided a work around for calculating correct percentage to enable below code in correct manner.
. . .
Please explain why that won't work. And it is unnecessarily complicated.

I preferred Michael Duff's solution. I have a solution of my own, where I multiplied the count by 100 and 1000 and divided twice to get % and 0.1% values. Do you really want the length of the string as the denominator of the division?
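
Campbell's own code is not shown in the thread, so the following is only one possible reading of "multiplied the count by 100 and 1000 and divided twice", kept entirely in integer arithmetic. The name TenthsOfPercent is invented, total is assumed to be the number of letters counted (not necessarily the string length) and must not be zero, and the result is truncated rather than rounded.

```java
class TenthsOfPercent {
    static String format(int count, int total) {
        // long arithmetic so that large counts cannot overflow
        long percent = 100L * count / total;       // whole percent
        long tenths = 1000L * count / total % 10;  // the 0.1% digit
        return percent + "." + tenths + "%";
    }

    public static void main(String[] args) {
        System.out.println(format(1, 3)); // prints 33.3%
        System.out.println(format(2, 4)); // prints 50.0%
    }
}
```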
Michael Duff
Greenhorn

Joined: Mar 13, 2010
Posts: 18
Rudradutt Joshi wrote: I guess you will not encounter the problem Fred mentioned with the above approach, where the dataMap is iterated once more after string parsing completes, so it's a one-time operation at the end.

And you can do all the beautification for displaying data by overriding toString in the wrapper object, and go further by adding a method that prints the map the way you like.

I'm not sure I quite understand what you said. I tested your code myself, and when I print the HashMap it says, for example, if I write "abba":

a = [Percent = 25.0, occurrence = 1]

b = [Percent = 25.0, occurrence = 1]

a = [Percent = 25.0, occurrence = 1]

b = [Percent = 50.0, occurrence = 2]

a = [Percent = 25.0, occurrence = 1]

b = [Percent = 50.0, occurrence = 2]

a = [Percent = 50.0, occurrence = 2]

Wouldn't that mean I store a lot of unnecessary information in the HashMap? It would be a nightmare if I needed that information for something. I would only like two things to be stored, namely a = [Percent = 50.0, occurrence = 2] and b = [Percent = 50.0, occurrence = 2]. Is it possible to do that with the code you provided?

EDIT: Okay I made a silly mistake, I printed the HashMap inside a for loop that's why I got the information I mentioned above! I assume after each iteration it wrote information about what's in the HashMap at that specific time.
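
The difference between printing inside the loop and printing once afterwards might look like this. It is not Rudradutt's code, which is not shown in the thread; the names PrintOnce, analyze and describe are invented, the output format just mimics the lines quoted above, and a TreeMap is assumed so the letters come out sorted.

```java
import java.util.Map;
import java.util.TreeMap;

class PrintOnce {
    static Map<Character, Integer> analyze(String text) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char ch : text.toCharArray()) {
            counts.merge(ch, 1, Integer::sum);
            // Printing the map here would show a snapshot after every
            // character, which is what produced the repeated lines.
        }
        return counts;
    }

    // Builds one line per letter from the finished counts.
    static String describe(Map<Character, Integer> counts, int total) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            double percent = 100.0 * e.getValue() / total;
            sb.append(e.getKey()).append(" = [Percent = ").append(percent)
              .append(", occurrence = ").append(e.getValue()).append("]\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "abba";
        // Printed once, after analyze has returned.
        System.out.print(describe(analyze(text), text.length()));
    }
}
```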
Rudradutt Joshi
Ranch Hand

Joined: Dec 06, 2008
Posts: 45

Hey Ritchie,

Campbell Ritchie wrote:
Rudradutt Joshi wrote: . . . I had just provided a work around for calculating correct percentage to enable below code in correct manner.
. . .
Please explain why that won't work. And it is unnecessarily complicated.

I preferred Michael Duff's solution. I have a solution of my own, where I multiplied the count by 100 and 1000 and divided twice to get % and 0.1% values. Do you really want the length of the string as the denominator of the division?



I came to that conclusion because
(((Integer) letterCount.elementAt(ch)) / textLength)
returns an integer, and will result in 0 in general cases.

And Duff, the map returned by the analyze method will be printed once the execution of the method completes.
So the calling method will print the correct map.

Tell me if I am missing something.

Regards,
Rudradutt
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38519
Rudradutt Joshi wrote: . . . returns integer. And will result in 0 in general cases. . . .
No, you aren't missing anything. But you need to think through the entire process. What number will be calculated and displayed?
 