In an earlier post I got help with processing some logic and writing the output to text files. Thanks a lot for the help. My question now is that I need to iterate through these two text files and find what is common to both of them. Is there an API I can use, or do you suggest I call the Unix compare command from Java code?
I have come up with an algorithm too; please tell me whether it would work.
I am in the process of implementing it. I will keep the thread updated.
-Aditya
Hardik Trivedi
Hi,
Dear, there is a big misunderstanding here, either on your side or on the repliers' side.
I think you want to find the list of words that are common to both files, right?
Then let me tell you, there is no specific method or API for that.
Use your own algorithm.
Fetch each word from the first file and compare it with all the words in the second file. If it matches anywhere, put it in an array of strings, and finally return that array.
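A minimal sketch of the word-by-word comparison described above, assuming words are separated by whitespace and that case matters. The class and method names (CommonWords, commonWords) are made up for illustration, and the result is a List rather than an array of strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

final class CommonWords {
    // Returns each distinct word of textA that also occurs in textB,
    // in the order it first appears in textA. Comparison is case-sensitive.
    static List<String> commonWords(String textA, String textB) {
        String[] wordsA = textA.trim().split("\\s+");
        String[] wordsB = textB.trim().split("\\s+");
        List<String> common = new ArrayList<>();
        for (String a : wordsA) {
            // Skip the empty token produced by blank input, and words already reported.
            if (a.isEmpty() || common.contains(a)) {
                continue;
            }
            for (String b : wordsB) {
                if (a.equals(b)) {
                    common.add(a);
                    break; // one match is enough for this word
                }
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // "A" and "a" differ in case, so only "place" and "Java" match.
        System.out.println(commonWords("A friendly place for Java", "Java is a nice place"));
        // prints [place, Java]
    }
}
```

This compares every word of the first text against every word of the second, so it gets slow for large files.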
Taking a quick look at this, it seems rather unsophisticated. For example, it makes no allowance for missing or extra lines in one of the files. So if the first line of one file is missing, then *all* subsequent lines will be reported as different, even though they may be identical.
This is generally the realm of the "diff" command, which is available on all Unix/Linux boxes (as opposed to "compare", which is not).
I wanted to post an update with regard to my question. The earlier code, which goes as below, does not work correctly in all cases. What it does is display a string whenever the values on a line are the same in both files, which is not the output I am after.
For example, if the contents of the two text files are as below, the expected output should be
"
A
friendly
place
for
Java
greenhorns",
whereas the output I get is "place". What changes should I make to the existing code? Comments are appreciated.
Why should the output be what you show above? What are you trying to accomplish?
In any case: you're printing the line if the two lines, one from each file, are the same. I'm not even sure why you're getting the line that says "place", since text2.txt has a lot of leading spaces.
Sorry for the inconvenience. I want to consider the white space too and grep all the lines that are present in both files, irrespective of the line number at which they appear.
Aditya Sirohi wrote:i want to consider the white space too and grep all the lines that are present in both the file irrespective of what line number they appear.
This is a perfect example of why specs are so important. We have gone from
"get what is common to both files"
to an example (which by itself is fine, but is incomplete)
to "print the lines which are common in both the text files"
to " it should print the lines common in both the text files"
to " i want to consider the white space too and grep all the lines that are present in both the file irrespective of what line number they appear."
All these statements could mean slightly different things to different people. What does "consider the white space too" mean exactly? If file 'a' has "fred " and file 'b' has "fred", is that a match or not?
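To make the two readings of the "fred " question concrete, here is a small sketch. The names (WhitespaceMatch, exactMatch, trimmedMatch) are invented for this example; which rule is the right one depends on the spec, which is exactly the open question.

```java
final class WhitespaceMatch {
    // Reading 1: white space is significant, so "fred " and "fred" differ.
    static boolean exactMatch(String a, String b) {
        return a.equals(b);
    }

    // Reading 2: leading and trailing white space is ignored before comparing.
    static boolean trimmedMatch(String a, String b) {
        return a.trim().equals(b.trim());
    }

    public static void main(String[] args) {
        System.out.println(exactMatch("fred ", "fred"));   // prints false
        System.out.println(trimmedMatch("fred ", "fred")); // prints true
    }
}
```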
If I am interpreting what you want correctly, and I am not sure I am, I think what you need to do is read a single line from file 'a', and see if it's in file 'b', using whatever restriction you need regarding white space.
Then, read the next line of file 'a' and compare it against every line of file 'b' again.
You can possibly make your program smarter by checking to make sure that you read from the shorter file, that you don't re-test a line if you've already looked for it (unless you need to know for some reason), and perhaps by using the right data structures to store some info.
But the first thing I would do is nail down EXACTLY what you want in unambiguous terms.
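One way the "right data structures" remark might play out, as a sketch only: keep the lines of file 'b' in a HashSet so each lookup does not rescan the whole file, and use a second set so a line is not reported twice. It assumes a match means the whole line is identical, white space included. The class name, the method name and the command-line arguments are all placeholders.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CommonLinesWithSet {
    // Returns the lines of linesA that also occur in linesB, each reported once,
    // in the order they first appear in linesA. Lines must match exactly.
    static List<String> commonLines(List<String> linesA, List<String> linesB) {
        Set<String> inB = new HashSet<>(linesB);
        Set<String> alreadyReported = new HashSet<>();
        List<String> common = new ArrayList<>();
        for (String line : linesA) {
            // Set.add returns false if the line was reported before.
            if (inB.contains(line) && alreadyReported.add(line)) {
                common.add(line);
            }
        }
        return common;
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are the paths of the two files to compare (assumed).
        List<String> a = Files.readAllLines(Paths.get(args[0]));
        List<String> b = Files.readAllLines(Paths.get(args[1]));
        for (String line : commonLines(a, b)) {
            System.out.println(line);
        }
    }
}
```

This reads both files fully into memory, which is fine for small text files but may not suit very large ones.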
Never ascribe to malice that which can be adequately explained by stupidity.
I want to apologize for not being clear with my question.
All these statement could mean slightly different things to different people. What does "consider the white space too" mean exactly? if file 'a' has "fred " and file 'b' has "fred", is that a match or not?
Yes, if file 'a' has the word fred and file 'b' has the word fred, then it's a match.
I tried to write a piece of code, but it did not work. Comments are appreciated.
As both Fred and I hinted, if order is not important, then simply looping over the lines isn't going to work--you need to be able to check all previous lines of the first file for each line in the second file. Can you think of some ways you might approach that?
I know what the code should be like, but I am finding it hard to implement.
The pseudo code I have in mind is:
1. Read file 'a' line by line.
2. For each line in file 'a', check whether it is present in file 'b'; if it is, print the line.
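The two steps above map fairly directly onto java.util.Scanner and List.contains. The sketch below is one possible reading, not the only one: it assumes file 'b' is small enough to hold in memory, and the class and method names are invented. The method takes any Readable, so for real files you could pass something like new FileReader("text1.txt"), where the file name is only an example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class PseudoCodeSketch {
    // Collects the lines of 'a' that are also present in 'b' (exact line match).
    static List<String> linesOfAFoundInB(Readable a, Readable b) {
        // Load every line of file 'b' once so it can be searched repeatedly.
        List<String> linesB = new ArrayList<>();
        Scanner inB = new Scanner(b);
        while (inB.hasNextLine()) {
            linesB.add(inB.nextLine());
        }

        List<String> found = new ArrayList<>();
        Scanner inA = new Scanner(a);
        while (inA.hasNextLine()) {          // step 1: read file 'a' line by line
            String line = inA.nextLine();
            if (linesB.contains(line)) {     // step 2: is the line present in file 'b'?
                found.add(line);
            }
        }
        return found;
    }
}
```

A caller would then print each element of the returned list. A line that occurs twice in file 'a' is reported twice here, which may or may not be what is wanted.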
I think that should solve my main problem. If I could get to know which constructors and methods I can use, or get a skeleton solution to the problem, I can work from there.
In any case, I don't need to give you a skeleton--you just defined the skeleton by writing out the steps you need to take. So what's next? What's the easiest way you can think of to implement what you just described?
My approach would be to do a simple hashing of every line in A and B and store the hashes in an ArrayList as strings,
then use the contains method to check for existence.
However, the cost of a hit versus a miss then comes into the picture, and so optimizations (as usual) complicate a simple issue...
I have been working on this the whole day and I made some progress: I can now store all the lines of file 'a' in an array. Now I am trying to iterate over each element in the array and check whether it is present in file 'b'. I wanted to share the code I have so far. My code will look like a novice's; expert comments are appreciated.
Hello,
I have stored the contents of the two files in arrays, but when I try to compare them I get a null pointer exception on the line: if(arrayLines1[i].contains(arrayLines2[j]))
You have the *capability* of reading in a thousand lines, but the files don't necessarily *contain* a thousand lines. So you don't want to check the length of the array--you want to check against how many lines the file actually has.
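A sketch of that point, with made-up names (FilledCount, readInto, countCommon): fill the fixed-size array, remember how many slots were actually filled, and loop up to that count instead of the array length, so the unfilled null slots are never touched. Note that it compares whole lines with equals, whereas the posted line uses String.contains, which is a substring test.

```java
import java.util.Scanner;

final class FilledCount {
    // Copies lines from source into buffer and returns how many slots were filled.
    // Stops early if the buffer is full.
    static int readInto(String[] buffer, Readable source) {
        Scanner in = new Scanner(source);
        int count = 0;
        while (count < buffer.length && in.hasNextLine()) {
            buffer[count++] = in.nextLine();
        }
        return count;
    }

    // Counts the lines among the first countA entries of a that also occur
    // among the first countB entries of b. Slots past the counts are null
    // and are never dereferenced.
    static int countCommon(String[] a, int countA, String[] b, int countB) {
        int hits = 0;
        for (int i = 0; i < countA; i++) {
            for (int j = 0; j < countB; j++) {
                if (a[i].equals(b[j])) {
                    hits++;
                    break;
                }
            }
        }
        return hits;
    }
}
```

Looping to a.length instead of countA would reach a null slot as soon as the file has fewer lines than the array has room for, which is where the null pointer exception comes from.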
I cannot get the common strings from the two arrays I created in the above code. This is what I have tried so far, but I don't get any output; I get an IO exception. Am I doing anything wrong?
Thanks to everyone; Java Ranch is an awesome place to learn. I would say I am a novice in programming, but when I get some feedback I get motivated to solve the problem. So I am posting the code below, which gives all the lines common to the two files. I still get the exception for lines 13 and 63. Comments are appreciated.
I got it: I had to use for (int i = 0 ; i < arrayLines1.length ; i++) instead of for (int i = 0 ; i <= arrayLines1.length ; i++) in displayRecords().
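For anyone who finds this thread later, here is a tiny illustration of that loop bound. The class and method names are invented and this is not the displayRecords() from the thread, only the same loop shape.

```java
final class LoopBounds {
    // Joins every element of lines, one per output line.
    static String joinAll(String[] lines) {
        StringBuilder out = new StringBuilder();
        // Valid indexes run from 0 to lines.length - 1, so the test must be '<'.
        // With '<=' the last pass would read lines[lines.length] and throw
        // ArrayIndexOutOfBoundsException.
        for (int i = 0; i < lines.length; i++) {
            out.append(lines[i]).append('\n');
        }
        return out.toString();
    }
}
```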