Fast indexing / searching of a text file

jay vas
Ranch Hand

Joined: Aug 30, 2005
Posts: 407
Hi: I am writing programs that read data from massive text files.
What is the best way to do this in Java? Is indexing a possibility, and if so, how do I jump from one index to another? Thanks, Jay
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

Yeah, indexing is definitely a way to reduce the time it takes to fetch a record.
I am not sure what your requirements are, but Apache Lucene is a free text search engine that you may be interested in.
This article gives an insight into how to use a RandomAccessFile to build a small database. It may not fit your requirements perfectly, but it may give you a head start on indexes and on accessing records through them.
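
As a rough illustration of the RandomAccessFile idea (not the code from the article), here is a minimal sketch that scans a text file once, remembers the byte offset at which each line starts, and then seeks straight to a requested line. The class name LineOffsetIndex and its methods are made up for this example, and it assumes newline-terminated, UTF-8 lines that each fit in memory.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: an in-memory index of line start offsets.
final class LineOffsetIndex {

    /** Scans the file once and records the byte offset where each line starts. */
    public static List<Long> buildIndex(String path) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[8192];
            long pos = 0;               // file offset of buf[0]
            boolean atLineStart = true; // true if the next byte begins a new line
            int n;
            while ((n = raf.read(buf)) > 0) {
                for (int i = 0; i < n; i++) {
                    if (atLineStart) {
                        offsets.add(pos + i);
                        atLineStart = false;
                    }
                    if (buf[i] == '\n') {
                        atLineStart = true;
                    }
                }
                pos += n;
            }
        }
        return offsets;
    }

    /** Seeks directly to line lineNo (0-based) and returns it without its line terminator. */
    public static String readLine(String path, List<Long> offsets, int lineNo) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long start = offsets.get(lineNo);
            long end = lineNo + 1 < offsets.size() ? offsets.get(lineNo + 1) : raf.length();
            byte[] bytes = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(bytes);
            // Drop trailing \n or \r\n before decoding.
            int len = bytes.length;
            while (len > 0 && (bytes[len - 1] == '\n' || bytes[len - 1] == '\r')) {
                len--;
            }
            return new String(bytes, 0, len, StandardCharsets.UTF_8);
        }
    }
}
```

The index costs one long per line, so for a really massive file you may want to store every Nth offset only, or write the offsets to a side file instead of keeping them all in memory.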


Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41150
    
Reading data from a file is different from searching an index of the file, because the index typically does not contain the full text of the indexed documents. So whether an index would help depends on what exactly you need to do with the text.

I don't understand what you mean by "jump from one index to another" - random access of the file contents?


Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

Ulf: the index typically does not contain the full text of the indexed documents.

True, but the index will typically give me the record pointer, won't it?
So, if I have indexed a text file to give me record pointers and record lengths for the records containing a particular value of the indexed field, I can quickly retrieve those records from the file, can't I?
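
To make the pointer-plus-length idea concrete, here is a small sketch. It assumes a made-up record layout (one record per line, comma-separated, with the indexed field first), since we don't yet know what the original poster's files look like. FieldIndex, Pointer, build and fetch are hypothetical names for this example only.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index: field value -> (byte offset, byte length) of each matching record.
final class FieldIndex {

    /** Byte position and byte length of one record in the data file. */
    static final class Pointer {
        final long offset;
        final int length;
        Pointer(long offset, int length) { this.offset = offset; this.length = length; }
    }

    /** One sequential pass over the file; the first comma-separated field is the key (assumed layout). */
    public static Map<String, List<Pointer>> build(String path) throws IOException {
        Map<String, List<Pointer>> index = new HashMap<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            ByteArrayOutputStream record = new ByteArrayOutputStream();
            long recordStart = 0; // byte offset where the current record begins
            long pos = 0;         // byte offset of the next byte to be read
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == '\n') {
                    add(index, record, recordStart);
                    record.reset();
                    recordStart = pos;
                } else {
                    record.write(b);
                }
            }
            add(index, record, recordStart); // last record without a trailing newline
        }
        return index;
    }

    private static void add(Map<String, List<Pointer>> index, ByteArrayOutputStream record, long start) {
        String line = new String(record.toByteArray(), StandardCharsets.UTF_8);
        if (line.trim().isEmpty()) {
            return; // skip blank lines
        }
        String key = line.split(",", 2)[0].trim();
        index.computeIfAbsent(key, k -> new ArrayList<>()).add(new Pointer(start, record.size()));
    }

    /** Random access: seek to each pointer and read exactly that record. */
    public static List<String> fetch(String path, List<Pointer> pointers) throws IOException {
        List<String> records = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            for (Pointer p : pointers) {
                byte[] bytes = new byte[p.length];
                raf.seek(p.offset);
                raf.readFully(bytes);
                records.add(new String(bytes, StandardCharsets.UTF_8).trim());
            }
        }
        return records;
    }
}
```

Building the index still reads the whole file once, but after that a lookup such as fetch(path, index.get("42")) touches only the bytes of the matching records.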

Ulf: I don't understand what you mean by "jump from one index to another" - random access of the file contents?

I assumed this! It's worthwhile getting this confirmed.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41150
    
the index will typically give me the record pointer, won't it? So, if I have indexed a text file to give me record pointers and record lengths for the records containing a particular value of the indexed field, I can quickly retrieve those records from the file, can't I?


It is possible to create an index like that. But that may or may not address the underlying problem. In particular, we don't know if there's a notion of structure or records within the files. That's why I asked the original poster to clarify what he's trying to do.
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

Ulf: That's why I asked the original poster for clarification what he's trying to do.

Oh yeah, absolutely, your question was perfectly valid. I was just confirming my understanding.
 