| Author |
Matrix form of files and words
|
K
Greenhorn
Joined: Aug 03, 2006
Posts: 2
|
|
Hi, I want to build a matrix of files and the words in each file, with files as rows and words as columns: matrix[file][word] = (frequency of the word in the file). Is there a standard way to do this? I have tried arrays and HashMaps... Any help would be highly appreciated.
|
 |
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 12271
|
|
For parsing text files into words, see the java.io.StreamTokenizer class. For the columns to make any sense, it seems to me that you need to start with a dictionary of words to be recognized, with all other words ignored. A HashMap can be used to look up the column number corresponding to a word. Bill
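A minimal sketch of that idea, assuming a small fixed dictionary: a HashMap assigns each dictionary word a column number, and a StreamTokenizer fills one matrix row per text. The class and method names (`ColumnLookup`, `buildColumnIndex`, `countWords`) are made up for illustration. The tokenizer's syntax is reset so that only letters form words, because with the default settings a trailing period would be kept as part of the word.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, not part of any library.
class ColumnLookup {

    /** Assigns each dictionary word a column number in the order given. */
    static Map<String, Integer> buildColumnIndex(String[] dictionary) {
        Map<String, Integer> columns = new HashMap<>();
        for (String word : dictionary) {
            // The next free column is the current size; duplicates are skipped.
            columns.putIfAbsent(word.toLowerCase(), columns.size());
        }
        return columns;
    }

    /** Builds one matrix row: counts of dictionary words; other words are ignored. */
    static int[] countWords(Reader text, Map<String, Integer> columns) {
        int[] row = new int[columns.size()];
        StreamTokenizer tokens = new StreamTokenizer(text);
        // Treat only ASCII letters as word characters (an assumption of this sketch).
        tokens.resetSyntax();
        tokens.wordChars('a', 'z');
        tokens.wordChars('A', 'Z');
        tokens.whitespaceChars(0, ' ');
        tokens.lowerCaseMode(true);
        try {
            while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokens.ttype == StreamTokenizer.TT_WORD) {
                    Integer column = columns.get(tokens.sval);
                    if (column != null) {
                        row[column]++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Integer> columns = buildColumnIndex(new String[] {"cat", "dog", "fish"});
        int[] row = countWords(new StringReader("The cat saw a dog. The dog ran."), columns);
        System.out.println(Arrays.toString(row)); // prints [1, 2, 0]
    }
}
```

For real files, a `java.io.FileReader` (or a buffered reader) could be passed in place of the `StringReader`, calling `countWords` once per file to get each row.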
|
Java Resources at www.wbrogden.com
|
 |
K
Greenhorn
Joined: Aug 03, 2006
Posts: 2
|
|
Thanks for your reply, Bill. I already have the filtered lists of words and their frequencies for each file, but I'm confused about how to represent them as a matrix. If you can give me some sample code, that would be great. Do you have any idea about the vector space model? Thanks a lot. Krish
|
 |
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 12271
|
|
When you say "matrix form", what do you mean? Do you have to use some specific matrix math package, or are you just looking for a convenient display? Perhaps this tutorial on arrays of arrays will help. Bill
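If a plain array of arrays is enough, one possible sketch is below. It assumes the per-file word frequencies are already held in a map of maps (file name to word to count), which is only a guess at how the existing lists are stored. The names `TermMatrix`, `vocabulary` and `toMatrix` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper class, not part of any library.
class TermMatrix {

    /** Sorted list of every distinct word; a word's index is its column number. */
    static List<String> vocabulary(Map<String, Map<String, Integer>> countsByFile) {
        TreeSet<String> words = new TreeSet<>();
        for (Map<String, Integer> counts : countsByFile.values()) {
            words.addAll(counts.keySet());
        }
        return new ArrayList<>(words);
    }

    /** matrix[file][word] = frequency; rows follow the map's iteration order. */
    static int[][] toMatrix(Map<String, Map<String, Integer>> countsByFile, List<String> vocab) {
        int[][] matrix = new int[countsByFile.size()][vocab.size()];
        int row = 0;
        for (Map<String, Integer> counts : countsByFile.values()) {
            for (int col = 0; col < vocab.size(); col++) {
                // A word missing from this file counts as zero.
                matrix[row][col] = counts.getOrDefault(vocab.get(col), 0);
            }
            row++;
        }
        return matrix;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("cat", 2);
        a.put("dog", 1);
        Map<String, Integer> b = new HashMap<>();
        b.put("dog", 3);
        b.put("fish", 1);

        // LinkedHashMap keeps the files in insertion order, so row 0 is a.txt.
        Map<String, Map<String, Integer>> countsByFile = new LinkedHashMap<>();
        countsByFile.put("a.txt", a);
        countsByFile.put("b.txt", b);

        List<String> vocab = vocabulary(countsByFile);
        System.out.println(vocab); // prints [cat, dog, fish]
        for (int[] row : toMatrix(countsByFile, vocab)) {
            System.out.println(Arrays.toString(row)); // prints [2, 1, 0] then [0, 3, 1]
        }
    }
}
```

The dense `int[][]` is convenient for display, but most cells will be zero when the vocabulary is large.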
|
 |
Stan James
(instanceof Sidekick)
Ranch Hand
Joined: Jan 29, 2003
Posts: 8791
|
|
Use REXX! Associative arrays are pretty cool. In Java I'd look into a map keyed by filename holding maps keyed by word. Or the other way around.
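A rough sketch of the map-of-maps approach, with a hypothetical `SparseCounts` class: the outer map is keyed by filename, the inner map by word, and only words that actually occur take up space.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class illustrating a map keyed by filename holding maps keyed by word.
class SparseCounts {
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Records one occurrence of word in file. */
    void add(String file, String word) {
        counts.computeIfAbsent(file, f -> new HashMap<>())
              .merge(word, 1, Integer::sum);
    }

    /** Returns the frequency of word in file, or 0 if either is unknown. */
    int get(String file, String word) {
        Map<String, Integer> row = counts.get(file);
        return row == null ? 0 : row.getOrDefault(word, 0);
    }

    public static void main(String[] args) {
        SparseCounts matrix = new SparseCounts();
        matrix.add("a.txt", "cat");
        matrix.add("a.txt", "cat");
        matrix.add("b.txt", "dog");
        System.out.println(matrix.get("a.txt", "cat")); // prints 2
        System.out.println(matrix.get("b.txt", "cat")); // prints 0
    }
}
```

Swapping the key order (word first, then filename) gives "the other way around", which suits lookups such as "which files contain this word?".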
|
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
|
 |