Lucene Capacity and speed

William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 11862
Years ago I wrote full-text search engines, so I am interested in catching up with the state of the art. My typical large applications were tens of megabytes of legal text, so I am wondering how large the largest Lucene database you have ever worked with is, how long it took to index, and how long searches take.

Bill


Java Resources at www.wbrogden.com
Erik Hatcher
Author
Ranch Hand

Joined: Jun 11, 2002
Posts: 111
Originally posted by William Brogden:
Years ago I wrote full-text search engines, so I am interested in catching up with the state of the art. My typical large applications were tens of megabytes of legal text, so I am wondering how large the largest Lucene database you have ever worked with is, how long it took to index, and how long searches take.


Tens of megabytes of text in Lucene is quite commonplace. My current project, building a search application around the Rossetti Archive (not online yet), is indexing a couple hundred megabytes of XML. The resulting indexes (I'm building four indexes from it, as I slice the data in numerous ways) are around the same size as the source XML, as I end up storing information in the index rather than bouncing back to the XML files at display time.

There are far larger Lucene deployments out there than I have experience with. For example, Nutch is designed as a full web crawler and has already scaled to hundreds of millions of web pages in production.

Search speed depends on the complexity of the query, but I see speeds on the order of milliseconds.


Co-author of Lucene in Action
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 11862
Wow - Nutch (www.nutch.org) sounds like a very ambitious project.
 
 