JavaRanch » Java Forums » Java » Beginning Java
Counting chars during file read.

Gary Goldsmith
Ranch Hand

Joined: Mar 06, 2007
Posts: 30
I would like some help with two issues in my program. The first is counting the chars on each line; the second is recording how many times each letter appears in the file. I want to store the number of times each letter of the alphabet appears in the txt file (A=12, B=9, ... Z=1), but I'm not sure how to do that.



Above is my code, so my main problem is understanding how to store the number of times each letter appears. Any ideas would be much appreciated.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Well, is there a problem with the code you have now? Have you looked at the results? Is it not doing what you expect it to?

One thing to consider: how was the number 128 chosen? Is there any possibility that a char will have a value of 128 or greater? What would happen if it did?


"I'm not back." - Bill Harding, Twister
Gary Goldsmith
Ranch Hand

Joined: Mar 06, 2007
Posts: 30
The code works to an extent. It reads the file ok but I can't get it to count each letter and add it to some sort of array. The line int[] chars = new int[128]; was just really left over from testing ideas out.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
[Gary]: I can't get it to count each letter and add it to some sort of array.

How do you know that? It looks to me like that code would work, at least for common cases, or throw an error (with error message) otherwise. So, do you get an error message? Do you have some way of looking at the array to check the results?
Gary Goldsmith
Ranch Hand

Joined: Mar 06, 2007
Posts: 30
I don't have an array yet, as I'm unsure how to implement it. Can an array be structured as A=0, B=0, C=0, etc., with each letter's count incremented as the while loop reads it?

The code doesn't come up with any errors during runtime but I can't get it to print out each letter in turn (pointless but for the purpose of testing to see if it's working). So I'm not too sure it is reading each char.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
I'm going to take this a little bit out of order:

[Gary]: I don't have an array yet as I'm unsure how to implement it

But you do have an array, right there in the code. If you could see what's getting stored in it, you could find out how close your results are to what you expect.

[Gary]: The code doesn't come up with any errors during runtime but I can't get it to print out each letter in turn (pointless but for the purpose of testing to see if it's working). So I'm not too sure it is reading each char.

Good. Trying to print out intermediate results is an excellent way to find out what's going on. And, indeed, printing each letter as it's read is a good step here. How did you try to do that? Did you succeed in printing something? What did it look like?

Another good idea would be, after you've read all the chars, write some code to loop through the whole array and print out what value is stored at each position in the array. Then you could see how your current results compare to what you expect. Can you write a loop that will go through each element in the array? Can you put a print statement in the loop to see a result?
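As a rough illustration of that checking loop, here is a minimal sketch. The class name PrintCounts, the method countChars and the sample text are made up for this example (Gary's original code is not shown in the thread), and it assumes every character has a value below 128. To keep the output short it prints only the positions with a non-zero count.

```java
class PrintCounts {
    // Counts characters using the char value itself as the array index.
    // Assumption: all characters are below 128, otherwise this throws
    // ArrayIndexOutOfBoundsException.
    static int[] countChars(String text) {
        int[] chars = new int[128];
        for (int i = 0; i < text.length(); i++) {
            chars[text.charAt(i)]++;
        }
        return chars;
    }

    public static void main(String[] args) {
        int[] chars = countChars("Hello, world");
        // Walk the whole array and show what is stored at each used position.
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] > 0) {
                System.out.println(i + " ('" + (char) i + "') = " + chars[i]);
            }
        }
    }
}
```

Dropping the if test would print all 128 positions, which is noisier but shows the zeros as well.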

[Gary]: can an array be structured A=0, B=0 C=0 etc?

Yes - that's one of the two basic options you have available, and it's the one EFH was describing to you here. However it's not the only way. In Java, the character 'A' is not 0, but it does have a numeric value, which happens to be 65. (See this ASCII table for more info.) If you want A to be converted to 0, the easiest way is to simply subtract 65. Then 'B' has value 66, and if you subtract 65 you get 1. Likewise 'C' becomes 2, etc.

Now if you write

in your code, people may wonder what's special about 65. You could put in a comment if you like, explaining it. Or you could take advantage of the fact that 'A' has value 65:

This has the same effect, but now it's clearer where the 65 came from. You don't even have to remember that 'A' is 65. It's just some number. So if ch is 'C', then ch - 'A' will be 67 - 65, which is 2 - just what you want for 'C'.
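For comparison, the two ways of writing the subtraction might look like the sketch below. The class and method names are hypothetical and chosen only for this example; both methods return the same index.

```java
class LetterIndex {
    // Uses the bare number; a reader has to know that 65 is the value of 'A'.
    static int indexWithNumber(char ch) {
        return ch - 65;
    }

    // Uses the char literal; the compiler supplies the 65.
    static int indexWithLiteral(char ch) {
        return ch - 'A';
    }

    public static void main(String[] args) {
        for (char ch = 'A'; ch <= 'C'; ch++) {
            System.out.println(ch + " -> " + indexWithNumber(ch)
                    + " and " + indexWithLiteral(ch));
        }
        // prints:
        // A -> 0 and 0
        // B -> 1 and 1
        // C -> 2 and 2
    }
}
```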

I suggest revisiting EFH's previous post for more info here.

The other option is to do what you're currently doing - just take the value of charAt(i) and use it as an array index directly. Then the count of A will be stored at position 65, and B at 66, etc. Maybe you'll want to ignore the first 65 positions (indices 0 through 64), or maybe you'll find a use for them.
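A small sketch of this second option, reading the letter counts back out of positions 65 ('A') through 90 ('Z') and skipping everything else. The names DirectIndexCounts and countChars and the sample text are assumptions made for illustration, and the input is assumed to be plain ASCII.

```java
class DirectIndexCounts {
    // The char value is the index, so 'A' lands at position 65, 'B' at 66, etc.
    // Assumption: input is plain ASCII (all values below 128).
    static int[] countChars(String text) {
        int[] chars = new int[128];
        for (int i = 0; i < text.length(); i++) {
            chars[text.charAt(i)]++;
        }
        return chars;
    }

    public static void main(String[] args) {
        int[] chars = countChars("ABBA CAB");
        // Start at 'A' so the first 65 positions are ignored.
        for (char c = 'A'; c <= 'Z'; c++) {
            if (chars[c] > 0) {
                System.out.println(c + "=" + chars[c]);
            }
        }
        // The space is still counted, at position 32, if you want it.
        System.out.println("spaces=" + chars[' ']);
    }
}
```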

Your array size of 128 is still big enough to handle all the "normal" US-ASCII characters, but you may run into trouble with more "exotic" European characters like ä - and certainly Asian languages won't fit in here at all. If that's an issue, you'll have to use a much bigger array. Note that this is something to think about even with the earlier option of making A = 0 etc. But I suspect it's not something you need to worry about here - certainly not at first, anyway. Just get it working for simple ASCII characters first, and then modify the code later to handle other things - if you need to.
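If larger characters do turn out to matter, one possible "much bigger array" is one with a slot for every value a Java char can hold, which is Character.MAX_VALUE + 1 (65536) entries. The sketch below assumes that approach; the class name WideCharCounts and the sample text are invented for the example, and the ä is written as the escape \u00e4 so the source file encoding does not matter.

```java
class WideCharCounts {
    // One slot per possible char value, so no char can be out of range.
    static int[] countAllChars(String text) {
        int[] counts = new int[Character.MAX_VALUE + 1];
        for (int i = 0; i < text.length(); i++) {
            counts[text.charAt(i)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // "K\u00e4se \u00e4hnlich" contains the letter a-umlaut (value 228) twice.
        int[] counts = countAllChars("K\u00e4se \u00e4hnlich");
        System.out.println("count at 228 = " + counts['\u00e4']);
    }
}
```

This costs 65536 ints (about 256 KB), which is usually acceptable, but most of the array stays empty for ordinary English text.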
Gary Goldsmith
Ranch Hand

Joined: Mar 06, 2007
Posts: 30
Thank you Jim.

I have now added the results to an ArrayList.



So my last question is: how do I find the most frequently occurring number in the ArrayList, ignoring the negative numbers (e.g. -33, which is white space)?
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Um, OK. What does the ArrayList accomplish for you?

If you want to count each character, that int[] array that you took out seemed a good way to do it. The line

chars[result.charAt(x)]++;

takes the previous count for a particular character and increments it. The only problem, if it is a problem, is that character 'A' is represented by 65, and so on. So you may want to use

chars[result.charAt(x) - 'A']++;

instead. And you way want to put in additional logic to whck to see if the character is outside the normal range for letters (e.g. numbers or punctuation may give an ArrayIndexOutOfBoundsException if you don't check beforehand). Or you may want to use Character.toUpperCase() (or toLowerCase()) to ensure that 'A' and 'a' are counted as the same character.

But regardless of these additional refinements, the point is that the int[] array chars was a good way to count these characters, and by taking it out, you're moving away from a solution, not towards one.
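Putting those refinements together, a sketch along these lines might work. It is not Gary's program: the class name LetterFrequency, both method names and the sample text are assumptions. It uses a 26-element int[] with the - 'A' offset, folds case with Character.toUpperCase(), skips anything outside A-Z, and then scans the array for the largest count, which is one way to find the most frequent letter without an ArrayList.

```java
class LetterFrequency {
    // counts[0] is the count for A, counts[1] for B, ... counts[25] for Z.
    static int[] countLetters(String text) {
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            char ch = Character.toUpperCase(text.charAt(i));
            // Range check: spaces, digits and punctuation are skipped, so
            // the index can never be negative or past the end of the array.
            if (ch >= 'A' && ch <= 'Z') {
                counts[ch - 'A']++;
            }
        }
        return counts;
    }

    // Returns the letter with the highest count. On a tie the letter that
    // comes first in the alphabet wins; if nothing was counted, returns 'A'.
    static char mostFrequentLetter(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return (char) ('A' + best);
    }

    public static void main(String[] args) {
        int[] counts = countLetters("Banana band");
        char top = mostFrequentLetter(counts);
        System.out.println(top + "=" + counts[top - 'A']); // prints A=4
    }
}
```

Because white space never reaches the array, there is no -33 entry to filter out afterwards.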
 