HTML Search of web page

Jehan Jaleel
Ranch Hand

Joined: Apr 30, 2002
Posts: 196
Hi all,
I would like to add search functionality to my site. Users would enter a phrase, and the entire web site's HTML would be searched to see whether the phrase exists.

Does anyone know of any existing code or module that does this?

Thanks in advance for any help,
Bear Bibeault
Author and ninkuma

Joined: Jan 10, 2002
Posts: 63866

Have you checked out Apache Lucene?

Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42965
When you say "HTML", do you actually mean static HTML files? If so, Apache Nutch might do the trick. If it's actually dynamic content (maybe from a DB or a CMS), then Lucene might be a good choice, or a native search (a SELECT in the case of a DB, or the CMS's built-in search API).
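
For the static-files case, it may help to see how small the naive version of the problem is before reaching for Nutch or Lucene. The following is only a rough sketch using the JDK alone, not the Lucene or Nutch API: it walks a directory, crudely strips tags with regular expressions, and does a case-insensitive substring match. The class name `NaiveHtmlSearch`, the method names, and the assumption that pages are UTF-8 `.html`/`.htm` files under one root directory are all made up for illustration. It rescans every file on every query and has no ranking or stemming, which is exactly what an index-based tool like Lucene is meant to solve.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene or Nutch.
public class NaiveHtmlSearch {

    // Crude text extraction: drop <script>/<style> blocks, replace every
    // remaining tag with a space, then collapse runs of whitespace.
    // A real HTML parser would be more robust than these regexes.
    static String visibleText(String html) {
        return html
                .replaceAll("(?is)<(script|style)\\b.*?</\\1>", " ")
                .replaceAll("(?s)<[^>]*>", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }

    // Returns the .html/.htm files under siteRoot whose visible text
    // contains the phrase (case-insensitive), sorted by path.
    static List<Path> search(Path siteRoot, String phrase) throws IOException {
        String needle = phrase.toLowerCase(Locale.ROOT);
        List<Path> pages;
        try (Stream<Path> paths = Files.walk(siteRoot)) {
            pages = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String n = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return n.endsWith(".html") || n.endsWith(".htm");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path page : pages) {
            // Assumes UTF-8; malformed bytes become replacement characters.
            String html = new String(Files.readAllBytes(page), StandardCharsets.UTF_8);
            if (visibleText(html).toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(page);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java NaiveHtmlSearch <siteRoot> <phrase>");
            return;
        }
        for (Path hit : search(Paths.get(args[0]), args[1])) {
            System.out.println(hit);
        }
    }
}
```

Because each tag is replaced by a space, a phrase that spans inline markup (for example `Big <b>Moose</b> Saloon`) still matches, but a single word split by a tag would not. Once the site grows or the content becomes dynamic, building an index up front is likely the better route.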