Fast indexing / searching of a text file

jay vas
Ranch Hand

Joined: Aug 30, 2005
Posts: 407
Hi: I am writing programs that read data from massive text files.
What is the best way to do this in Java? Is indexing a possibility, and if so, how do I jump from one index to another? Thanks, Jay
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

Yeah, indexing is definitely a way to reduce the time needed to fetch a record.
I am not sure what your requirements are, but Apache Lucene is a free text search engine that you may be interested in.
This article gives an insight into how to use a RandomAccessFile to build a small database. Although it may not fit your requirements perfectly, it may give you a head start on indexes and on accessing records through them.
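
To make the RandomAccessFile idea concrete, here is a minimal sketch (not taken from the linked article) that scans a text file once, remembers the byte offset at which each line starts, and then seeks straight to any line. The class name `LineOffsetIndex` and its methods are made up for illustration, and the sketch assumes UTF-8 text with `\n` line endings and an offset list small enough to hold in memory.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: remembers the byte offset at which each line starts. */
final class LineOffsetIndex {
    private final List<Long> offsets = new ArrayList<>();

    /** Scans the file once and records where every line begins. */
    static LineOffsetIndex build(String path) throws IOException {
        LineOffsetIndex index = new LineOffsetIndex();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            long pos = 0;
            boolean atLineStart = true;
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    index.offsets.add(pos);
                    atLineStart = false;
                }
                if (b == '\n') {
                    atLineStart = true;
                }
                pos++;
            }
        }
        return index;
    }

    int lineCount() {
        return offsets.size();
    }

    /** Seeks directly to line n (0-based) and reads only that line. */
    String readLine(String path, int n) throws IOException {
        long start = offsets.get(n);
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long end = (n + 1 < offsets.size()) ? offsets.get(n + 1) : raf.length();
            byte[] buf = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(buf);
            // Drop the trailing line terminator, if any.
            return new String(buf, StandardCharsets.UTF_8).replaceAll("[\\r\\n]+$", "");
        }
    }
}
```

Building the index still costs one full pass over the file; the gain is that every later lookup reads only the bytes of the requested line.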


Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41634
Reading data from a file is different from searching an index of the file, because the index typically does not contain the full text of the indexed documents. So whether an index would help depends on what exactly you need to do with the text.

I don't understand what you mean by "jump from one index to another" - random access of the file contents?


Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

Ulf: the index typically does not contain the full text of the indexed documents.

True, but the index will typically give me the record pointer, won't it?
So, if I have indexed a text file to give me record pointers and record lengths for the records containing a particular value in the indexed field, I can quickly retrieve those records from the file, can't I?

Ulf: I don't understand what you mean by "jump from one index to another" - random access of the file contents?

I assumed this! It's worthwhile getting it confirmed.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41634
Nitesh: the index will typically give me the record pointer, won't it? So, if I have indexed a text file to give me record pointers and record lengths for the records containing a particular value in the indexed field, I can quickly retrieve those records from the file, can't I?


It is possible to create an index like that. But that may or may not address the underlying problem. In particular, we don't know if there's a notion of structure or records within the files. That's why I asked the original poster to clarify what he's trying to do.
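
For illustration only, an index of the kind Nitesh describes might look like the sketch below. It rests on assumptions the original poster has not confirmed: each line is one record, the indexed field is the text before the first comma, keys are unique (a repeated key keeps the last record), and lines end in `\n`. The class name `KeyedRecordIndex` is hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical index: maps a record's key to its byte offset and byte length. */
final class KeyedRecordIndex {
    // key -> { offset of the record in the file, length of the record in bytes }
    private final Map<String, long[]> entries = new HashMap<>();
    private final String path;

    /** Scans the file once and indexes every non-empty line by its first field. */
    KeyedRecordIndex(String path) throws IOException {
        this.path = path;
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            long pos = 0;
            long start = 0;
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == '\n') {
                    add(line, start);
                    line.reset();
                    start = pos;
                } else {
                    line.write(b);
                }
            }
            add(line, start); // the last record may lack a trailing newline
        }
    }

    private void add(ByteArrayOutputStream line, long start) {
        if (line.size() == 0) {
            return; // skip empty lines
        }
        String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
        String key = text.split(",", 2)[0];
        entries.put(key, new long[] { start, line.size() });
    }

    /** Returns the full record for the key, or null if the key is not indexed. */
    String find(String key) throws IOException {
        long[] entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[(int) entry[1]];
            raf.seek(entry[0]);
            raf.readFully(buf);
            return new String(buf, StandardCharsets.UTF_8);
        }
    }
}
```

If the files turn out to have no record structure, or the searches are free-text rather than by key, a sketch like this does not apply and a full-text engine such as Lucene is the more likely fit.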
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

Ulf: That's why I asked the original poster to clarify what he's trying to do.

Oh yeah, absolutely, your question was perfectly valid. I was just confirming my understanding.
 