Lucene - word list usage during index building

Bill Bug
Greenhorn

Joined: Feb 06, 2003
Posts: 5
Hi Lucene Book Authors,

How easy is it to get Lucene to use existing "stop word" and keyword lists when it is building its indexes?

Many thanks for putting the time into making this Open Source text retrieval system more accessible to us all.

Cheers,
Bill Bug
Otis Gospodnetic
Author
Greenhorn

Joined: Dec 30, 2004
Posts: 23
Hello Bill,

You can use a custom stop-word list together with StopFilter. StopFilter is a TokenFilter that you can include in your own custom Analyzer. Lucene also comes with the StopAnalyzer class, which already includes StopFilter, so you just have to pass your stop-word String array to its constructor.
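To make the idea concrete, here is a minimal plain-Java sketch of what stop-word filtering does conceptually. It does not use Lucene at all: the class StopWordSketch and its methods toStopSet and filter are hypothetical names made up for illustration, and the tokenization (lowercase, split on anything that is not a letter or digit) is an assumption, not Lucene's actual behavior. With Lucene itself you would hand the same String array to StopAnalyzer or StopFilter instead of writing this yourself.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: a hypothetical stand-in for the stop-word step, not Lucene code.
class StopWordSketch {

    // Turns an existing word list (a String array) into a lowercased lookup set.
    static Set<String> toStopSet(String[] words) {
        Set<String> set = new HashSet<>();
        for (String w : words) {
            set.add(w.toLowerCase(Locale.ROOT));
        }
        return set;
    }

    // Lowercases the text, splits it on runs of non-letter/non-digit characters
    // (an assumed, simplistic tokenizer), and drops every token in the stop set.
    static List<String> filter(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Example word list; in practice this would come from your existing list.
        String[] myStopWords = {"the", "a", "of", "and"};
        System.out.println(filter("The history of the Lucene index", toStopSet(myStopWords)));
        // prints [history, lucene, index]
    }
}
```

Only the tokens that survive the filter would end up as terms in the index, which is why the stop-word list has to be applied at index-building time.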

These two searches should add more context to my answer:
http://www.lucenebook.com/search?query=stopfilter
http://www.lucenebook.com/search?query=stopanalyzer

Otis


Lucene in Action: http://www.manning.com/lucene
 
 