| Author |
Lucene Capacity and speed
|
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 11862
|
|
Years ago I wrote full-text search engines, so I am interested in catching up with the state of the art. My typical large applications were tens of megabytes of legal text, so I am wondering how large the largest Lucene database you have ever worked with is, how long it took to index, and how long searches take. Bill
|
Java Resources at www.wbrogden.com
|
 |
Erik Hatcher
Author
Ranch Hand
Joined: Jun 11, 2002
Posts: 111
|
|
Originally posted by William Brogden: Years ago I wrote full-text search engines, so I am interested in catching up with the state of the art. My typical large applications were tens of megabytes of legal text, so I am wondering how large the largest Lucene database you have ever worked with is, how long it took to index, and how long searches take.
Tens of megabytes of text in Lucene is quite commonplace. My current project, building a search application around the Rossetti Archive (not online yet), indexes a couple hundred megabytes of XML. The resulting indexes (I'm building four of them, as I slice the data in numerous ways) come to around the same size, because I store information in the index rather than bouncing back to the XML files at display time. There are far more massive uses of Lucene out there than I have experience with; for example, Nutch is designed as a full web crawler and has already scaled to hundreds of millions of web pages in production. Search speed depends on the complexity of the query, but I see search times on the order of milliseconds.
|
Co-author of Lucene in Action
|
 |
William Brogden
Author and all-around good cowpoke
Rancher
Joined: Mar 22, 2000
Posts: 11862
|
|
Wow - Nutch (www.nutch.org) sounds like a very ambitious project.
|
 |