Today is the start of our promotion in this forum for "Lucene in Action". I'm eagerly awaiting the flood of great questions from the Ranch members.
Just to kick things off, I've spent the last few weeks building a "search inside" the book and companion blog at http://www.lucenebook.com which uses lots of web tier and Lucene trickery to combine a blojsom-based blog and a Tapestry-based search page with two Lucene indexes (one for blog content, one for book content). The site is evolving, so any feedback/suggestions you have on it are most welcome.
Can you please tell me how much work it is, from a developer's standpoint, to install Lucene and make it work?
Otis Gospodnetic
Author
Greenhorn
Joined: Dec 30, 2004
Posts: 23
posted
0
Hello Mary,
Lucene comes as a single JAR file, so all you have to do is put it in your CLASSPATH and start using its simple API. Chapter 1, free and available from https://secure.manning.com/catalog/view.php?book=hatcher2&item=chapters , includes the most basic indexing and searching code examples.
I'm tickled with Lucene on my Wiki, but I never quite figgered out how to re-index one file when it changes. Lucene has a method to remove a file by index. Do I have to search through Lucene's catalog of files to find mine to get the index? Any simple examples?
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Otis Gospodnetic
Author
Greenhorn
Joined: Dec 30, 2004
Posts: 23
posted
0
Hello,
Lucene is really a library for full-text indexing and searching, and not an application that knows how to index Wikis, files, databases, or the Web. So you are probably either using a Lucene demo to index some local files, or you wrote your own indexing application on top of Lucene. What you should probably do is make sure that your Lucene Documents contain a Field with the file path in it. This Field should be indexed and not tokenized (Field.Keyword). Then, when you detect that a file has changed you need to remove it from the Lucene index, and re-add it. To remove it from the index you'll do something like this:
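// The Term names the path Field and the exact path value stored in it
Term term = new Term("pathFieldNameHere", "yourFilesPathHere");
IndexReader reader = IndexReader.open("/path/to/your/index");
int deletedCount = reader.delete(term);
// Closing the reader writes the deletion to the index
reader.close();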
This should always delete only 1 Document - the one that matches your file - because each file has a unique path.
After this you'll have to re-add your Document the usual way - via IndexWriter's addDocument(Document) method.
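The delete-then-re-add pattern described above can be sketched without Lucene on the classpath. What follows is a toy in-memory stand-in, not Lucene's API: the class ToyPathIndex and its methods (add, delete, reindex, search, size) are hypothetical names made up for illustration. The point it tries to show is that adding a document never replaces an older one, so the old copy has to be deleted by its unique path first.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy stand-in for an index; hypothetical class, NOT Lucene's API. */
class ToyPathIndex {
    /** One indexed document: an untokenized path plus the set of words in its body. */
    private static final class Doc {
        final String path;
        final Set<String> words;
        Doc(String path, Set<String> words) { this.path = path; this.words = words; }
    }

    // A plain list, so adding the same path twice creates a duplicate (as an index would)
    private final List<Doc> docs = new ArrayList<>();

    /** Removes every document whose path matches exactly; returns how many were removed. */
    int delete(String path) {
        int before = docs.size();
        docs.removeIf(d -> d.path.equals(path));
        return before - docs.size();
    }

    /** Adds a document; the body is lowercased and split into words. */
    void add(String path, String body) {
        Set<String> words = new HashSet<>();
        for (String w : body.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!w.isEmpty()) words.add(w);
        }
        docs.add(new Doc(path, words));
    }

    /** Re-indexes a changed file: delete by path, then add the new content. */
    int reindex(String path, String newBody) {
        int deleted = delete(path);
        add(path, newBody);
        return deleted;
    }

    /** Returns the paths of all documents containing the given word. */
    List<String> search(String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.words.contains(needle)) hits.add(d.path);
        }
        return hits;
    }

    int size() { return docs.size(); }

    public static void main(String[] args) {
        ToyPathIndex index = new ToyPathIndex();
        index.add("wiki/Home.txt", "Welcome to the old wiki");
        int deleted = index.reindex("wiki/Home.txt", "Welcome to the new wiki");
        System.out.println(deleted + " deleted, hits for 'new': " + index.search("new"));
    }
}
```

If reindex skipped the delete step, a search for "old" would still return the stale copy of the page, which is the same problem you would see in a real index.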
If you are doing all this by just using the Lucene demo, you are not really using Lucene the right way, and you are veeeery far from using it fully. To get going I suggest you look at the 2 sample chapters available at Lucene in Action's site - http://www.lucenebook.com/ . You can also download the free sample code from Manning's site or just get the p/ebook (print version gets you the ebook version for free).
Otis
Karthik prabakar
Greenhorn
Joined: Mar 24, 2004
Posts: 1
posted
0
HI
Authors, I need to know what Lucene is and what it can be used for. Can you give me a link where I can learn about it? All I know is that it is used as a search engine, right?
Again, a couple of basic questions. When developing a typical web application that consists of basic CRUD on entities, workflows, and a few search facilities over the existing entities, would I use Lucene? If I were to search for something stored in a database, I would rely on the database's indexes to fetch the data efficiently.
But say, for example, I am developing software for a recruiter and I want to pull out resumes that mention the skill 'Lucene'. Should I probably use Lucene then?
Please give us instances where you have used Lucene in projects. That will give us some idea of how to put Lucene to work.
Thanks.
Otis Gospodnetic
Author
Greenhorn
Joined: Dec 30, 2004
Posts: 23
posted
0
Hello Karthik,
Yes, if you want to search for a word inside a large chunk of text (e.g., "lucene" in a collection of resumes), you will want to use Lucene instead of LIKE '%lucene%' . This is just the most basic example, really. Lucene's site contains a page with currently available search syntax.
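To give a rough feel for the difference, here is a small sketch that contrasts a LIKE-style substring scan with a term lookup in an inverted index, which is the general kind of structure a full-text library builds. It is only a simplified illustration and not how Lucene is implemented or called: the class ToyInvertedIndex and its methods (add, lookup, scan) are hypothetical names, and the sample resume lines are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy inverted index; hypothetical class for illustration, NOT Lucene's API. */
class ToyInvertedIndex {
    // term -> ids of the texts that contain it
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    // original texts, kept only so the scan below has something to walk over
    private final List<String> texts = new ArrayList<>();

    /** Stores the text, records each of its lowercase words, and returns the new id. */
    int add(String text) {
        int id = texts.size();
        texts.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    /** Term query: a single map lookup, independent of how many texts are stored. */
    SortedSet<Integer> lookup(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return new TreeSet<>();
        }
        return new TreeSet<>(ids); // defensive copy
    }

    /** Rough equivalent of LIKE '%needle%': every stored text is inspected. */
    SortedSet<Integer> scan(String needle) {
        String lowered = needle.toLowerCase(Locale.ROOT);
        SortedSet<Integer> ids = new TreeSet<>();
        for (int id = 0; id < texts.size(); id++) {
            if (texts.get(id).toLowerCase(Locale.ROOT).contains(lowered)) {
                ids.add(id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        ToyInvertedIndex resumes = new ToyInvertedIndex();
        resumes.add("Five years of Java and Lucene experience");
        resumes.add("Oracle DBA, PL/SQL");
        resumes.add("Built search with LUCENE and Tapestry");
        System.out.println("lookup: " + resumes.lookup("lucene"));
        System.out.println("scan:   " + resumes.scan("lucene"));
    }
}
```

One thing the sketch glosses over: the scan matches any substring, so it would also hit a word like "lucenebook", while the lookup only matches whole terms.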
You can look at http://www.simpy.com/ for an example of a Lucene-powered site for social bookmarking. If you are building something that includes resumes, then you may be interested in http://www.indeed.com/, a job site for which I helped build the Lucene prototype - the site uses Lucene for resume searches. Finally, the Lucene in Action site, http://www.lucenebook.com/, makes very nice use of Lucene. It's meant to be used in combination with either ebook or pbook, but you can try it out even if you don't have the book, just to get a feel for what Lucene can do.
Otis
Roger Thornhill
Author
Greenhorn
Joined: May 15, 2002
Posts: 25
posted
0
Hi -
I was wondering if the authors had any information on the extent to which Lucene has been successfully deployed in large non-commercial or commercial applications. Pointers to specific sites would be very much appreciated, but even anecdotal references / testimonials would be helpful.
Thanks very much! [ January 05, 2005: Message edited by: Greg Barish ]
Mehdi Chaouachi
Ranch Hand
Joined: Jul 02, 2003
Posts: 87
posted
0
What is Lucene anyway?
Mehdi Chaouachi
Sun Certified Java Programmer (1.4)
Sun Certified Web Component Developer (1.4).
Sun Certified Mobile Application Developer.
Originally posted by Ali Pope: The site looks amazing. What would be amazing would be to have access to the source code, just to get a true picture of how you built it.
Thanks! I plan on writing up case study material, complete with relevant code, on how the site was built and how it will continue to evolve.