JavaRanch » Java Forums » Java » Java in General
Inserting several text files into individual hash sets

Fazz
Greenhorn

Joined: Mar 16, 2006
Posts: 9
I've managed to read one text file (A) into my program and store it in a HashSet. However, I need to read in another two text files (B and C) and store them in HashSets as well. The purpose is to compare A and B to check whether they contain any of the same words, then compare B and C and again check whether the two files have any words in common. This is supposed to be an authorship attribution program, so C will then be attributed to the file it was most similar to.
Each file should be stored in its own HashSet.

How can I change the code so that it reads in the two other text files and stores them?


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;

public class FileIn {
    // reader used to pull lines out of the text file
    BufferedReader in;
    // holds each distinct line of the file
    HashSet<String> set;

    public FileIn() {
        try {
            // make sure there is a text file at this location
            in = new BufferedReader(new FileReader("C:\\Program Files\\Java\\example.txt"));
            set = new HashSet<String>();

            String line;
            // readLine() returns null once there is no more file to read
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                set.add(line);
            }
            in.close();
        } catch (IOException ioe) {
            // file missing or unreadable
            System.out.println("Could not read file: " + ioe.getMessage());
        }
    }

    public static void main(String args[]) {
        // this gets executed when the java file is run
        FileIn newFile = new FileIn();
    }
}

Jan Groth
Ranch Hand

Joined: Feb 03, 2004
Posts: 456
hi farah,

what you need to do is to start using parameters in your methods / class.

how does this main method (in pseudo code) look to you?



and something like this:



also notice that i'm using the interface (Set) instead of the implementation (HashSet) in all declarations. this gives more flexibility and interchangeability.
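
To illustrate the point about declaring against the interface, here is a minimal sketch. The class name SetInterfaceSketch, the helper addWords and the sample sentence are all made up for illustration and are not part of the code posted in this thread. The helper only knows about Set, so a HashSet or a TreeSet can be handed to it without changing anything else.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetInterfaceSketch {

    // Hypothetical helper: splits text on whitespace and adds each word,
    // lower-cased, to the set. It depends only on the Set interface.
    static void addWords(Set<String> target, String text) {
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                target.add(word.toLowerCase());
            }
        }
    }

    public static void main(String[] args) {
        // Same declaration type, two different implementations
        Set<String> unordered = new HashSet<String>();
        Set<String> sorted = new TreeSet<String>();

        addWords(unordered, "the moose likes the fly");
        addWords(sorted, "the moose likes the fly");

        System.out.println(unordered.size()); // duplicates are stored only once
        System.out.println(sorted);           // TreeSet iterates in sorted order
    }
}
```

Swapping the implementation is then a one-word change at the point where the set is created, which is the flexibility referred to above.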

please feel free to ask any further questions,

jan
Fazz
Greenhorn

Joined: Mar 16, 2006
Posts: 9
Jan,

Thank you for your advice. I'm pretty new to Java, so it's taking me a while to get to grips with things. The pseudo code does make sense to me, but the Java code not so much.


private void readFileIntoSet(String filePath, Set setToReadIn) {...}

Is this necessary? Could I just add to the code I already have:

in = new BufferedReader(new FileReader("C:\\Program Files\\Java\\example.txt")); // could I just add an extra two lines of this for the other text files (in2, in3 etc.)?


private double equalityBToC (Set firstSet, Set secondSet) {...}
// would you be able to elaborate on this a little?
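
One way such a comparison method could work, offered only as a sketch: the thread does not show what equalityBToC actually computes, so the version below swaps in Jaccard similarity (shared words divided by all distinct words) as an assumed measure. The class name, the method name similarity and the sample word sets are hypothetical, and comparing C against both A and B is one possible reading of the attribution step.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetSimilaritySketch {

    // Jaccard similarity: size of the intersection divided by size of the union.
    // Returns 0.0 when both sets are empty.
    static double similarity(Set<String> firstSet, Set<String> secondSet) {
        Set<String> shared = new HashSet<String>(firstSet);
        shared.retainAll(secondSet);   // keep only words present in both sets

        Set<String> all = new HashSet<String>(firstSet);
        all.addAll(secondSet);         // every distinct word from either set

        return all.isEmpty() ? 0.0 : (double) shared.size() / all.size();
    }

    public static void main(String[] args) {
        // Made-up word sets standing in for the contents of files A, B and C
        Set<String> a = new HashSet<String>(Arrays.asList("red", "green", "blue"));
        Set<String> b = new HashSet<String>(Arrays.asList("red", "black"));
        Set<String> c = new HashSet<String>(Arrays.asList("red", "green", "white"));

        double aToC = similarity(a, c);
        double bToC = similarity(b, c);
        System.out.println("A vs C: " + aToC);
        System.out.println("B vs C: " + bToC);
        System.out.println("C is closer to " + (aToC >= bToC ? "A" : "B"));
    }
}
```

The copies made with new HashSet<String>(...) matter here: retainAll and addAll modify the set they are called on, so working on copies leaves the original sets untouched for later comparisons.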


Thank you, farah
Jan Groth
Ranch Hand

Joined: Feb 03, 2004
Posts: 456
hi farah,

let's focus on the first issue.

what i am proposing to you is to generalize your code which reads a file into a hashset. what you have right now is a 15-liner which takes a file at a fixed location and fills one special set-object with it.

if you want to change your code so that it reads a second file, you would have to duplicate almost all of it; for a third file you would triple it, and so on.

but - and this is my point - if you put the whole logic into one method (and not only the single line you pointed out in your answer), you'll be able to reuse your code as many times as you want.

this is your code. notice the (hopefully bold) parts. they are constants, which is not good.



your first goal should be to change the method so that it takes the former constants as parameters.
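
A minimal sketch of what such a parameterised method might look like, built around the readFileIntoSet(String filePath, Set setToReadIn) signature quoted earlier in the thread. The class name FileSetsSketch, the use of Set<String>, the static modifier and taking the three paths from the command line are assumptions made for illustration, not the code originally posted.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class FileSetsSketch {

    // Reads every line of the file at filePath into setToReadIn.
    // Both former constants (the path and the target set) are now parameters.
    static void readFileIntoSet(String filePath, Set<String> setToReadIn) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader(filePath));
        try {
            String line;
            while ((line = in.readLine()) != null) {   // null marks end of file
                setToReadIn.add(line);
            }
        } finally {
            in.close();                                // always release the file
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: java FileSetsSketch fileA fileB fileC");
            return;
        }
        // One set per file, all declared against the Set interface
        Set<String> setA = new HashSet<String>();
        Set<String> setB = new HashSet<String>();
        Set<String> setC = new HashSet<String>();

        // The same method is reused for each file
        readFileIntoSet(args[0], setA);
        readFileIntoSet(args[1], setB);
        readFileIntoSet(args[2], setC);

        System.out.println(setA.size() + " / " + setB.size() + " / " + setC.size());
    }
}
```

Like the code in the first post, this stores whole lines. If the sets are meant to hold individual words, each line would need to be split before adding.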



makes sense? otherwise just continue posting your questions,

:-)

jan
 