
Lucene in Action

Erik Hatcher
Author
Ranch Hand

Joined: Jun 11, 2002
Posts: 111
Today is the start of our promotion in this forum for "Lucene in Action". I'm eagerly awaiting the flood of great questions from the Ranch members.

Just to kick things off, I've spent the last few weeks building a "search inside" the book and companion blog at http://www.lucenebook.com which uses lots of web tier and Lucene trickery to combine a blojsom-based blog and a Tapestry-based search page with two Lucene indexes (one for blog content, one for book content). The site is evolving, so any feedback/suggestions you have on it are most welcome.


Co-author of Lucene in Action
Mary Wallace
Ranch Hand

Joined: Aug 25, 2003
Posts: 138
Hello Authors,

Can you please tell us how much work it takes, from a developer's point of view, to install Lucene and make it work?
Otis Gospodnetic
Author
Greenhorn

Joined: Dec 30, 2004
Posts: 23
Hello Mary,

Lucene comes as a single JAR file, so all you have to do is put it on your CLASSPATH and start using its simple API. Chapter 1, free and available from https://secure.manning.com/catalog/view.php?book=hatcher2&item=chapters , includes the most basic indexing and searching code examples.

Otis


Lucene in Action: http://www.manning.com/lucene
Alexandru Popescu
Ranch Hand

Joined: Jul 12, 2004
Posts: 995
Hi Erik!

The site looks amazing. What would be even better is access to the source code, to get a true picture of how you built it.

10x
--
./pope


blog - InfoQ.com
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
I'm tickled with Lucene on my Wiki, but I never quite figgered out how to re-index one file when it changes. Lucene has a method to remove a file by index. Do I have to search through Lucene's catalog of files to find mine to get the index? Any simple examples?


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Otis Gospodnetic
Author
Greenhorn

Joined: Dec 30, 2004
Posts: 23
Hello,

Lucene is really a library for full-text indexing and searching, and not an application that knows how to index Wikis, files, databases, or the Web. So you are probably either using a Lucene demo to index some local files, or you wrote your own indexing application on top of Lucene.
What you should probably do is make sure that your Lucene Documents contain a Field with the file path in it. This Field should be indexed and not tokenized (Field.Keyword). Then, when you detect that a file has changed you need to remove it from the Lucene index, and re-add it. To remove it from the index you'll do something like this:

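// Lucene 1.4 API: delete every Document whose path field matches this exact term
Term term = new Term("pathFieldNameHere", "yourFilesPathHere");
IndexReader reader = IndexReader.open("/path/to/your/index");
int deletedCount = reader.delete(term);
reader.close(); // closing the reader commits the deletion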

This should always delete only 1 Document - the one that matches your file - because each file has a unique path.

After this you'll have to re-add your Document the usual way - via IndexWriter's addDocument(Document) method.
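The delete-then-re-add cycle above can be tried out without Lucene on the classpath. The following is only a sketch under stated assumptions: ToyPathIndex and all of its methods are hypothetical stand-ins for the index, not Lucene's API. It shows why an exact-match path key makes the delete count come out as 1 and why the old copy has to be removed before the new one is added.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Lucene index; NOT Lucene's API.
// Each "document" is a (path, contents) pair, with the path acting as the unique key.
class ToyPathIndex {
    private final List<String[]> docs = new ArrayList<>();

    // Like adding a Document: always appends, never overwrites an existing entry.
    void addDocument(String path, String contents) {
        docs.add(new String[] { path, contents });
    }

    // Like deleting by an exact, untokenized term: returns how many documents were removed.
    int deleteByPath(String path) {
        int deleted = 0;
        for (Iterator<String[]> it = docs.iterator(); it.hasNext();) {
            if (it.next()[0].equals(path)) {
                it.remove();
                deleted++;
            }
        }
        return deleted;
    }

    // Re-index a changed file: delete the stale copy first, then add the fresh one.
    int reindex(String path, String newContents) {
        int deleted = deleteByPath(path);
        addDocument(path, newContents);
        return deleted;
    }

    int size() {
        return docs.size();
    }

    // Returns the contents stored for a path, or null if the path is not indexed.
    String contentsOf(String path) {
        for (String[] doc : docs) {
            if (doc[0].equals(path)) {
                return doc[1];
            }
        }
        return null;
    }
}
```

Calling reindex("/wiki/Home.txt", "new text") on an index that already holds that path removes one entry and adds one, so the total stays the same. Skipping the delete step would leave two entries for the same file, which is the duplicate-hit problem the real delete call avoids.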

If you are doing all this by just using the Lucene demo, you are not really using Lucene the right way, and you are veeeery far from using it fully. To get going I suggest you look at the 2 sample chapters available at Lucene in Action's site - http://www.lucenebook.com/ . You can also download the free sample code from Manning's site or just get the p/ebook (print version gets you the ebook version for free).

Otis
Karthik prabakar
Greenhorn

Joined: Mar 24, 2004
Posts: 1
Hi,

Authors, I need to know what Lucene is and what it can be used for. Can you give me a link where I can learn about it? All I know is that it is used as a search engine, right?
Otis Gospodnetic
Author
Greenhorn

Joined: Dec 30, 2004
Posts: 23
Hello,

You will find a lot of Lucene resources (articles, tutorials, etc.) at
http://wiki.apache.org/jakarta-lucene/IntroductionToLucene and at http://www.java201.com/resources/browse/38-all.html . You could also grab the free chapter from Lucene in Action, chapter 1. It will explain what Lucene is and how it is used. Chapter 1 can be downloaded from http://www.manning-source.com/books/hatcher2/hatcher2_chp1.pdf

Otis
Karthik Guru
Ranch Hand

Joined: Mar 06, 2001
Posts: 1209
Hello Otis,

Again, a couple of basic questions. When developing a typical web application that consists of basic CRUD on entities, workflows, and a few search facilities over the existing entities, would I be using Lucene?
If I were to search for something stored in a database, I would rely on the database's own indexes to fetch the data efficiently.

But say, for example, I am developing software for a recruiter and I want to pull out the resumes that mention the skill 'Lucene'. Should I use Lucene for that?

Please give us instances where you have used Lucene in projects. That will give us some idea of how to put Lucene to work.

thanks.
Otis Gospodnetic
Author
Greenhorn

Joined: Dec 30, 2004
Posts: 23
Hello Karthik,

Yes, if you want to search for a word inside a large chunk of text (e.g., "lucene" in a collection of resumes), you will want to use Lucene instead of LIKE '%lucene%' . This is just the most basic example, really. Lucene's site contains a page describing the currently available search syntax.
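To make the contrast with LIKE '%lucene%' concrete: a leading-wildcard LIKE typically has to scan the text of every row, while a full-text index splits the text into terms once, at indexing time, and answers a query with a lookup. Below is a minimal sketch of that idea. ToyInvertedIndex and its methods are hypothetical names, and the code illustrates the general inverted-index technique rather than Lucene's actual implementation or API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy inverted index; it illustrates the idea, not Lucene's API.
class ToyInvertedIndex {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Tokenize once at indexing time: lowercase, then split on anything that is not a letter or digit.
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty first token
            }
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // A search is a single map lookup, not a scan over every document's text.
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null
                ? Collections.<Integer>emptySet()
                : Collections.unmodifiableSet(hits);
    }
}
```

With three resumes indexed under ids 1, 2, and 3, where only 1 and 3 mention Lucene, search("lucene") returns the ids 1 and 3 regardless of the capitalization used in the resumes. The cost of the query depends on the lookup, not on how much resume text is stored.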

You can look at http://www.simpy.com/ for an example of a Lucene-powered site for social bookmarking. If you are building something that includes resumes, then you may be interested in http://www.indeed.com/, a job site for which I helped build the Lucene prototype - the site uses Lucene for resume searches.
Finally, the Lucene in Action site, http://www.lucenebook.com/, makes very nice use of Lucene. It's meant to be used in combination with either ebook or pbook, but you can try it out even if you don't have the book, just to get a feel for what Lucene can do.

Otis
Roger Thornhill
Author
Greenhorn

Joined: May 15, 2002
Posts: 25
Hi -

I was wondering if the authors had any information on the extent to which Lucene has been successfully deployed in large non-commercial or commercial applications. Pointers to specific sites would be very much appreciated, but even anecdotal references / testimonials would be helpful.

Thanks very much!
[ January 05, 2005: Message edited by: Greg Barish ]
Mehdi Chaouachi
Ranch Hand

Joined: Jul 02, 2003
Posts: 87
What is Lucene anyway?


Mehdi Chaouachi
Sun Certified Java Programmer (1.4)
Sun Certified Web Component Developer (1.4)
Sun Certified Mobile Application Developer
Otis Gospodnetic
Author
Greenhorn

Joined: Dec 30, 2004
Posts: 23
Greg,

Here are some mentions of big Lucene deployments: http://www.lucenebook.com/search?query=mayo (look at the context around the highlighted keyword). You can get a longer list at http://wiki.apache.org/jakarta-lucene/PoweredBy . My last big use of Lucene is in Simpy - http://www.simpy.com/ , where I have thousands of small Lucene indices.

Otis
Erik Hatcher
Author
Ranch Hand

Joined: Jun 11, 2002
Posts: 111
Originally posted by Ali Pope:
The site looks amazing. What would be even better is access to the source code, to get a true picture of how you built it.


Thanks! I plan on writing up case study material, complete with relevant code, on how the site was built and how it will continue to evolve.
Alexandru Popescu
Ranch Hand

Joined: Jul 12, 2004
Posts: 995
I can't wait to see it.

--
./pope
 