Frequency of elements in Array

Conrad McLaughlin
Greenhorn

Joined: Jan 31, 2006
Posts: 27
For my project I need to do document indexing. I have already converted a text file to a string array. But how do I create a new array showing the frequency of duplicate elements?


HOW DO I GO FROM THIS:

Doc [0] = "yellow"
Doc [1] = "red"
Doc [2] = "yellow"
Doc [3] = "yellow"
Doc [4] = "blue"


TO SOMETHING LIKE THIS:

DocNew [0] = "yellow" . Freq [0] = 3
DocNew [1] = "red" . Freq [1] = 1
DocNew [2] = "blue" . Freq [2] = 1


I am guessing I need a new 'frequency' array. How do I do this? I am guessing a loop of some sort but I have no idea.

Thanks.

Gerardo Tasistro
Ranch Hand

Joined: Feb 08, 2005
Posts: 362
You might want to look into a tree structure, maybe a TreeMap. Have the key be the word and the value be the counter. TreeMap lookups take O(log n) time, so you can cheaply test for the presence of the key (word). If it exists, you increment the value by 1. If it doesn't, you insert the key with a value of 1 associated with it.
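A minimal sketch of the counting approach described above, assuming the words are already in a String array. The class name WordFrequency and the method name count are made up for illustration and do not come from the thread. Note that a TreeMap keeps its keys sorted, so the words come out in alphabetical order rather than in order of first appearance.

```java
import java.util.Map;
import java.util.TreeMap;

public class WordFrequency {

    // Count how often each word occurs. TreeMap keeps the keys sorted.
    public static Map<String, Integer> count(String[] words) {
        Map<String, Integer> freq = new TreeMap<String, Integer>();
        for (String word : words) {
            Integer current = freq.get(word);
            if (current == null) {
                freq.put(word, 1);            // first time we see this word
            } else {
                freq.put(word, current + 1);  // seen before: increment
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        String[] doc = {"yellow", "red", "yellow", "yellow", "blue"};
        Map<String, Integer> freq = count(doc);

        // Copy the map into two parallel arrays, as asked in the question.
        String[] docNew = new String[freq.size()];
        int[] counts = new int[freq.size()];
        int i = 0;
        for (Map.Entry<String, Integer> entry : freq.entrySet()) {
            docNew[i] = entry.getKey();
            counts[i] = entry.getValue();
            i++;
        }

        for (int j = 0; j < docNew.length; j++) {
            System.out.println(docNew[j] + " " + counts[j]);
        }
    }
}
```

With the sample array from the question this prints blue 1, red 1 and yellow 3, one pair per line. If first-appearance order matters, a LinkedHashMap could be swapped in for the TreeMap.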
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14111
    

This page on autoboxing has a useful example.


Conrad McLaughlin
Greenhorn

Joined: Jan 31, 2006
Posts: 27
Thanks. The aim of the Java program is to create a simple information retrieval program where the database consists of text files (say 30 small text files).

The user types in a word and I output the documents where the word is located (in order of the word's frequency, etc.)

That's why I needed this as part of the program.
Gerardo Tasistro
Ranch Hand

Joined: Feb 08, 2005
Posts: 362
Well, I've used MySQL's full-text search for that. Great feature. What database are you using? Maybe you can put all that logic at the database level.
Conrad McLaughlin
Greenhorn

Joined: Jan 31, 2006
Posts: 27
Sorry, I say 'database' but I don't mean an actual database.

The program simple overview
----------------------------
1) The 'database' is a collection of text documents.

2) User enters a search term.

3) All documents should go through the index process: index term, document, frequency (e.g. [hello 2 3] means: hello appears in document 2, three times.)

4) Document is now indexed into array(s).

5) Output of relevant documents depending on frequency of terms.


* QUESTION. How do I take one array of strings and create two arrays, one of strings and one of frequencies?

SO FOR EXAMPLE STRING ARRAY (d):

d[0] = "blue"
d[1] = "blue"
d[2] = "blue"
d[3] = "red"
d[4] = "red"

NOW WITH FREQUENCY ARRAY (f)

d[0] = "blue"
f[0] = 3
d[1] = "red"
f[1] = 2

It's been bugging me all day.


Gerardo Tasistro
Ranch Hand

Joined: Feb 08, 2005
Posts: 362
Well, you could use a TreeMap whose key is the word. The element being stored is not a counter but a data bean (a class with certain attributes). Those attributes would be the number of times the word appears and an array of document names, or another TreeMap whose keys are the number of times the word appears and whose values are the names of the documents in which the word appears that many times.
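A rough sketch of that idea, with one simplification: instead of a separate data bean, the value stored for each word is itself a map from document name to count. The class SimpleIndex, its methods addDocument and search, and the document names are all hypothetical and only meant to illustrate the structure.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SimpleIndex {

    // word -> (document name -> number of occurrences in that document)
    private final Map<String, Map<String, Integer>> index =
            new TreeMap<String, Map<String, Integer>>();

    // Record every word of one document in the index.
    public void addDocument(String docName, String[] words) {
        for (String word : words) {
            Map<String, Integer> perDoc = index.get(word);
            if (perDoc == null) {
                perDoc = new TreeMap<String, Integer>();
                index.put(word, perDoc);
            }
            Integer count = perDoc.get(docName);
            perDoc.put(docName, count == null ? 1 : count + 1);
        }
    }

    // Names of the documents containing the word, most frequent first.
    public List<String> search(String word) {
        final Map<String, Integer> perDoc = index.get(word);
        List<String> docs = new ArrayList<String>();
        if (perDoc == null) {
            return docs; // word not found in any document
        }
        docs.addAll(perDoc.keySet());
        Collections.sort(docs, new Comparator<String>() {
            public int compare(String a, String b) {
                // Descending by count
                return perDoc.get(b).compareTo(perDoc.get(a));
            }
        });
        return docs;
    }

    public static void main(String[] args) {
        SimpleIndex idx = new SimpleIndex();
        idx.addDocument("doc1", new String[] {"hello", "world"});
        idx.addDocument("doc2", new String[] {"hello", "hello", "hello"});
        System.out.println(idx.search("hello")); // prints [doc2, doc1]
    }
}
```

Keeping the per-document counts in a nested map means a search only touches the entry for the word being looked up, and the ranking by frequency is a sort over that one entry.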
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14111
    

The page on autoboxing to which I gave you a link above contains practically the whole solution... didn't it help you?

Maybe it helps if you ask more specific questions instead of just "how do I do this". Show us your own code, how far did you get yourself? Where exactly do you get stuck? What exactly don't you understand?
Conrad McLaughlin
Greenhorn

Joined: Jan 31, 2006
Posts: 27
OK, I got this code that works, but you have to enter the words on the command line. You enter: java Freq red red red blue orange (etc.), and it displays each word and its frequency.

-------------------------------------------------------------------------
//FREQ.JAVA
//---------
import java.util.*;

public class Freq {
    public static void main(String args[]) {
        Map<String, Integer> m = new HashMap<String, Integer>();

        // Initialize frequency table from command line
        for (String a : args) {
            Integer freq = m.get(a);
            m.put(a, (freq == null ? 1 : freq + 1));
        }
        System.out.println(m.size() + " distinct words:");
        System.out.println(m);
    }
}
-------------------------------------------------------------------------

How do I change the code so that instead of String args[] from the command line it gets its values from a string array within the code? I tried editing the code myself with a static string array called words, but this does not work. What am I doing wrong?

---------------------------------------------------------------------------
//FREQ.Java
//---------
import java.util.*;

public class Freq {
    public static void main(String [] words) {
        Map<String, Integer> m = new HashMap<String, Integer>();

        // Initialize frequency table from command line
        for (String a : words) {
            Integer freq = m.get(a);
            m.put(a, (freq == null ? 1 : freq + 1));
        }
        System.out.println(m.size() + " distinct words:");
        System.out.println(m);
    }

    public static String words [] = {"red", "orange", "red", "blue", "yellow", "green"};
}
----------------------------------------------------------------------------
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14111
    

Take a good look at your own code.
First, look at this line:

public static void main(String [] words) {

When you start a program from the command line, the Java runtime environment calls your main method. It passes the text you entered on the command line as parameters to the main method.

Next, look at this line:

public static String words [] = {"red", "orange", "red", "blue", "yellow", "green"};

Here, you are declaring a public static member variable of your class Freq. Note that this is an entirely different variable than the String [] words argument of the main method.

You expected that if you gave those variables the same name ("words"), Java would connect them somehow. That's not how it works. The "words" in the declaration of your main method is a different variable that hides (shadows) the public static member variable which is also called "words". So, it does more or less the opposite of what you expected...

Simply change your main() back to this:

public static void main(String [] args) {

If you do this, the "words" variable that you use inside the main method refers to the public static member variable instead of the argument variable of main (which is now called "args").
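For reference, the whole program with that one change applied might look like the sketch below. Everything except the parameter name and the updated comment is taken from the code posted above.

```java
import java.util.HashMap;
import java.util.Map;

public class Freq {
    // The array to count. No longer shadowed, because the parameter
    // of main is now called args.
    public static String[] words = {"red", "orange", "red", "blue", "yellow", "green"};

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<String, Integer>();

        // Initialize frequency table from the static words array
        for (String a : words) {
            Integer freq = m.get(a);
            m.put(a, (freq == null ? 1 : freq + 1));
        }
        System.out.println(m.size() + " distinct words:");
        System.out.println(m);
    }
}
```

Running java Freq now ignores anything typed on the command line and reports 5 distinct words, with red counted twice. The order in which a HashMap prints its entries is not guaranteed.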
[ February 02, 2006: Message edited by: Jesper de Jong ]
Conrad McLaughlin
Greenhorn

Joined: Jan 31, 2006
Posts: 27
Thanks for the help
 