File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes I/O and Streams and the fly likes Binary Search on a Text File Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » I/O and Streams
Bookmark "Binary Search on a Text File" Watch "Binary Search on a Text File" New topic
Author

Binary Search on a Text File

Elisha Cassidy
Greenhorn

Joined: Aug 23, 2006
Posts: 9
Hi,

Can someone please help me. I have a really large text file that contains a list of hash values. I wanted to do a binary search on this to check if a particular hash value is present in the file and to return true if it is and false if not. Does anybody know of any good sites/tutorials that can help me build one. i've never covered this area before and am totally lost. i have read the java.sun site but am still confused.

any help would be greatly appreciated,

thanks,

Elisha
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8895
    
    8

Welcome to the JavaRanch, Elisha.
On your way in you may have missed our naming policy. In short, we ask that everyone here use their first and last name. It keeps things friendly. You can change your name here.
As for your question, we get a lot of people asking us to do your homework for them. While we are willing to help, we'll still let you do the heavy lifting.
You need a special functionality to perform your task. A good place to start would be the IO chapter of the Java Tutorial to see what Java has to offer.


"blabbing like a narcissistic fool with a superiority complex" ~ N.A.
[How To Ask Questions On JavaRanch]
Elisha Cassidy
Greenhorn

Joined: Aug 23, 2006
Posts: 9
Hi Joe,

Thanks for your reply. Yeah i missed the whole naming policy.

I have tried some code but am still confused about how to initially jump to the middle of the file. Do i need to use an array. My file is extremely big in that it contains ten million entries, therefore to big to store in memory. At the moment i am using a test file that only contains 9 entries just to get it to work. Can you please help me?

Thanks in advanced,

ELisha,

Here's the code i tried:

Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41800
    
  62
At a quick glance I don't see where the array "a" is initialized.


Ping & DNS - my free Android networking tools app
Elisha Cassidy
Greenhorn

Joined: Aug 23, 2006
Posts: 9
Hi,

Thanks for the reply. i have initialised the array to equal the number of lines in the file. . this was done outside the if statement inside the do loop and this array is passed to the binarySearch method that takes in this array and a string. the problem is my method looks like it is tring to compare a long value which is the point its at in the file to a string. i dont know how to get the string at the mid point of the file and compare it to my test string.

thanks again for the help,

Elisha.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41800
    
  62
You need to keep track of the number of lines read. Divide that by 2, and you have the index of the middle element.

Of course, if you don't know how many lines you will read, you won't know how to big to make the array... But what is the point of coming up with a solution for the test problem if you already know it won't work with the real problem?
Elisha Cassidy
Greenhorn

Joined: Aug 23, 2006
Posts: 9
Hi,

I don't get what you are saying. I have initialised the size of the array to equal the number of lines in the file and that works fine. I can also get it to print out the number of the midle point in the file i.e. if last is 8 then the middle is the first + last divided by 2 which results in 4. The problem I am having is that I don't know how to extract the string that is located at the midpoint i.e. 4 and test it to see if it is the string I am looking for. Basically my file contains nine lines with one word on each line and i want to check if a word is in this file

Any help would be greatly appreciated.

Elisha
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41800
    
  62
You can access the 4th element of the array by "a[4-1]", i.e. "a[3]". "-1" because array indices start at 0, not 1. But the binarySearch method already does that, so maybe you're asking something else?

You could also have a look at the java.util.Arrays class, which has methods that implement binary searches on arrays.

What I meant by "it won't work on the real problem" is that you said before that the file is too big to hold in memory, so a search technique that relies on storing the complete file in an array won't help with that.
Elisha Cassidy
Greenhorn

Joined: Aug 23, 2006
Posts: 9
it ok, i actually got it to work on a text file that holds over well over 10 million enties,

Thanks for your replys

Elisha
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Binary Search on a Text File