Inserting several text files into individual hash sets

 
Fazz
Greenhorn
Posts: 9
I've managed to feed a text file (A) into my program and store it in a HashSet. However, I need to feed in another two text files (B and C) and store them in HashSets as well. The purpose is to compare A and B to check whether they contain any of the same words, then compare B and C and again check whether the two files share any words. This is supposed to be an authorship attribution program, so C will then be attributed to the file it is most similar to.
Each file should be stored in its own HashSet.

How can I change the code so that it reads in the two other text files and stores them?


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;

public class FileIn {
    // reader used to pull lines out of the text file
    BufferedReader in;
    HashSet<String> set;

    public FileIn() {
        try {
            // make sure the text file exists at this path
            in = new BufferedReader(new FileReader("C:\\Program Files\\Java\\example.txt"));
            set = new HashSet<String>();

            String line;
            // readLine() returns null once there is no more file to read
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                set.add(line);
            }
            in.close();
        } catch (IOException ioe) {
            // covers a missing file as well as read errors
            System.err.println("Could not read file: " + ioe.getMessage());
        }
    }

    public static void main(String args[]) {
        // this gets executed when the java file is run;
        // creating the object runs the constructor above
        FileIn newFile = new FileIn();
    }
}

 
Jan Groth
Ranch Hand
Posts: 456
hi farah,

what you need to do is to start using parameters in your methods / class.

how does this main method (in pseudo code) look to you?



and something like this:
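
A minimal sketch of what such a main method and its helpers might look like in Java. The name readFileIntoSet and its parameters follow the signature quoted later in this thread; equality and attribute are hypothetical names. Splitting each line into lowercase words is an assumption (the original code stores whole lines), and so is comparing C against A and against B to decide the attribution.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

class AuthorshipSketch {

    // Adds every whitespace-separated word of the file to the given set.
    // Files.readAllLines assumes the file is UTF-8 encoded.
    static void readFileIntoSet(String filePath, Set<String> setToReadIn) throws IOException {
        for (String line : Files.readAllLines(Paths.get(filePath))) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    setToReadIn.add(word.toLowerCase());
                }
            }
        }
    }

    // Fraction of the words in secondSet that also occur in firstSet (0.0 to 1.0).
    static double equality(Set<String> firstSet, Set<String> secondSet) {
        if (secondSet.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(secondSet); // copy, so the input stays untouched
        shared.retainAll(firstSet);                    // keep only the words found in both
        return (double) shared.size() / secondSet.size();
    }

    // Returns "A" or "B", whichever set shares more of C's words (ties go to A).
    static String attribute(Set<String> setA, Set<String> setB, Set<String> setC) {
        return equality(setA, setC) >= equality(setB, setC) ? "A" : "B";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java AuthorshipSketch <fileA> <fileB> <fileC>");
            return;
        }
        Set<String> setA = new HashSet<>();
        Set<String> setB = new HashSet<>();
        Set<String> setC = new HashSet<>();
        readFileIntoSet(args[0], setA);
        readFileIntoSet(args[1], setB);
        readFileIntoSet(args[2], setC);
        System.out.println("C is closest to " + attribute(setA, setB, setC));
    }
}
```

The same read method is called three times with different arguments, which is the point of using parameters: nothing in it is tied to one particular file or set.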



also notice that i'm using the interface (Set) instead of the implementation (HashSet) in all declarations. this gives more flexibility and interchangeability.
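
A small illustration of that point, as a sketch only: countShared is a hypothetical helper, and TreeSet is picked simply as a second Set implementation to show that the method does not care which one it receives.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetInterfaceSketch {

    // Depends only on the Set interface, so any implementation can be passed in.
    static int countShared(Set<String> first, Set<String> second) {
        int shared = 0;
        for (String word : first) {
            if (second.contains(word)) {
                shared++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Set<String> unordered = new HashSet<>(); // hash-based, no defined order
        Set<String> sorted = new TreeSet<>();    // keeps its elements sorted

        unordered.add("java");
        unordered.add("ranch");
        sorted.add("ranch");
        sorted.add("saloon");

        // Only "ranch" is in both sets.
        System.out.println(countShared(unordered, sorted)); // prints 1
    }
}
```

Had the parameters been declared as HashSet, the TreeSet could not be passed without changing the method.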

please feel free to ask any further questions,

jan
 
Fazz
Greenhorn
Posts: 9
Jan,

Thank you for your advice. I'm pretty new to Java, so it's taking me a while to get to grips with things. The pseudo code does make sense to me, but the Java code not so much.


private void readFileIntoSet(String filePath, Set setToReadIn) {...}

Is this necessary? Could I just add to the code I already have:

in = new BufferedReader(new FileReader("C:\\Program Files\\Java\\example.txt")); //could I just add an extra two lines of this for the other text files (in2, in3 etc)?


private double equalityBToC (Set firstSet, Set secondSet) {...}
//would you be able to elaborate on this a little?


Thank you, farah
 
Jan Groth
Ranch Hand
Posts: 456
hi farah,

let's focus on the first issue.

what i am proposing to you is to generalize your code which reads a file into a hashset. what you have right now is a 15-liner which takes a file at a fixed location and fills one special set-object with it.

if you wanted to change your code so that it reads a second file, you would have to duplicate almost all of it; for a third file you would triple it, and so on.

but - and this is my point - if you put the whole logic into one method (and not only the single line you pointed out in your reply), you'll be able to reuse your code as many times as you want.

this is your code. notice the (hopefully bold) parts. they are hard-coded constants, which is what keeps the code from being reused.



your first goal should be to change the method in a way that it takes the former constants as parameters.
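
One way the refactored method might look, as a sketch under a few assumptions: the class name FileInSketch is made up, the method is static so it can be called without creating an object, and try-with-resources (Java 7 or later) is used to close the reader. The file path and the target set, which were fixed in the original constructor, are now parameters.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class FileInSketch {

    // filePath replaces the hard-coded file name,
    // setToReadIn replaces the single 'set' field.
    static void readFileIntoSet(String filePath, Set<String> setToReadIn) throws IOException {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(filePath))) {
            String line;
            while ((line = in.readLine()) != null) { // null signals end of file
                setToReadIn.add(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // One set per file path given on the command line.
        for (String path : args) {
            Set<String> lines = new HashSet<>();
            readFileIntoSet(path, lines);
            System.out.println(path + ": " + lines.size() + " distinct lines");
        }
    }
}
```

Reading a second or third file is then one more call with a different path and a different set, rather than another copy of the loop.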



makes sense? otherwise just continue posting your questions,

:-)

jan
 