Matrix form of files and words

K
Greenhorn

Joined: Aug 03, 2006
Posts: 2
Hi,

I want to build a matrix of files and the words in each file, with files as rows and words as columns:

matrix[file][word] = (frequency of word in the file)

Is there a standard way to do this?
I tried arrays and HashMaps...
Any help would be highly appreciated.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12759
    
To parse text files into words, see the java.io.StreamTokenizer class.
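
As a rough sketch of that suggestion (not code from the thread), the class below counts word tokens with StreamTokenizer. The class name WordCounter and the method countWords are made up for illustration. It assumes the tokenizer's default syntax table is acceptable apart from '.', which is reset to an ordinary character so a trailing period does not stick to the word; apostrophes and '/' still get the default quote and comment handling, which may not suit real text.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

public class WordCounter {

    /** Counts lower-cased word tokens read from the given reader (hypothetical helper). */
    public static Map<String, Integer> countWords(Reader reader) throws IOException {
        StreamTokenizer tokens = new StreamTokenizer(reader);
        tokens.lowerCaseMode(true);   // "The" and "the" count as the same word
        tokens.ordinaryChar('.');     // by default '.' is a number character and would join words
        Map<String, Integer> counts = new TreeMap<>();
        while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
            if (tokens.ttype == StreamTokenizer.TT_WORD) {
                counts.merge(tokens.sval, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // prints {cat=2, other=1, saw=1, the=2}
        System.out.println(countWords(new StringReader("The cat saw the other cat.")));
    }
}
```

For a real file, a BufferedReader wrapped around a FileReader could be passed in place of the StringReader.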

For the columns to make any sense, it seems to me that you need to start with a dictionary of words to be recognized, with all other words ignored.

A HashMap can be used to look up the column number corresponding to a word.
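
One way this could look, as a minimal sketch: a HashMap assigns each dictionary word a column number, and a plain int[][] holds matrix[file][word]. The names TermMatrix, columnIndex, and build are hypothetical, and the per-file word counts are assumed to be available already as maps from word to frequency.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TermMatrix {

    /** Assigns each distinct dictionary word the next free column number. */
    public static Map<String, Integer> columnIndex(List<String> dictionary) {
        Map<String, Integer> index = new HashMap<>();
        for (String word : dictionary) {
            index.putIfAbsent(word, index.size());
        }
        return index;
    }

    /** Builds matrix[file][word]; words that are not in the dictionary are ignored. */
    public static int[][] build(List<Map<String, Integer>> countsPerFile,
                                Map<String, Integer> columns) {
        int[][] matrix = new int[countsPerFile.size()][columns.size()];
        for (int row = 0; row < countsPerFile.size(); row++) {
            for (Map.Entry<String, Integer> entry : countsPerFile.get(row).entrySet()) {
                Integer col = columns.get(entry.getKey());
                if (col != null) {
                    matrix[row][col] = entry.getValue();
                }
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        Map<String, Integer> columns = columnIndex(Arrays.asList("cat", "dog", "fish"));

        Map<String, Integer> file0 = new HashMap<>();
        file0.put("cat", 2);
        file0.put("dog", 1);
        file0.put("the", 5);   // not in the dictionary, so it is dropped
        Map<String, Integer> file1 = new HashMap<>();
        file1.put("fish", 3);

        int[][] matrix = build(Arrays.asList(file0, file1), columns);
        // prints [[2, 1, 0], [0, 0, 3]]
        System.out.println(Arrays.deepToString(matrix));
    }
}
```

A dense array like this wastes space when most words are absent from most files; it is shown only because it matches the matrix[file][word] notation used in the question.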

Bill
K
Greenhorn

Joined: Aug 03, 2006
Posts: 2
Thanks for your reply Bill.

I already have the filtered word lists and their frequencies for each file.
But I'm confused about how to represent them in matrix form.

If you could give me some sample code, that would be great.
Do you know anything about the vector space model?

Thanks a lot.
Krish
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12759
    
When you say "matrix form", what do you mean? Do you have to use some specific matrix math package, or are you just looking for a convenient display?
Perhaps this tutorial on arrays of arrays will help.
Bill
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Use REXX! Associative arrays are pretty cool.

In Java I'd look into a map keyed by filename holding maps keyed by words. Or the other way around.
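
A minimal sketch of the map-of-maps idea, assuming the outer map is keyed by filename and the inner maps by word. The class NestedCounts and its methods add and frequency are invented for illustration and are not from any library.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

public class NestedCounts {

    // filename -> (word -> frequency)
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    /** Records one occurrence of word in file. */
    public void add(String file, String word) {
        counts.computeIfAbsent(file, f -> new TreeMap<>())
              .merge(word, 1, Integer::sum);
    }

    /** Returns how often word was seen in file, or 0 if never. */
    public int frequency(String file, String word) {
        return counts.getOrDefault(file, Collections.<String, Integer>emptyMap())
                     .getOrDefault(word, 0);
    }

    public static void main(String[] args) {
        NestedCounts nested = new NestedCounts();
        nested.add("a.txt", "cat");
        nested.add("a.txt", "cat");
        nested.add("b.txt", "dog");
        System.out.println(nested.frequency("a.txt", "cat"));  // prints 2
        System.out.println(nested.frequency("b.txt", "cat"));  // prints 0
    }
}
```

Unlike a fixed array, this stores only the (file, word) pairs that actually occur, and no dictionary has to be fixed in advance.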


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
 