Searching other websites

Brian Mulvany

Joined: Oct 19, 2004
Posts: 28
I am doing a project where users will be able to type in the name of a CD, and prices and details of the CD from various shops will be returned. First of all, I would like a user to be able to type in what they want and have the search results from just two websites displayed in two frames at the bottom of the page: one frame for website one and a second frame for website two.
I would like to know how to go about doing this.
I would appreciate as much help as possible as I'm kinda lost.
Bye now
danny liu
Ranch Hand

Joined: Jan 22, 2004
Posts: 185

You may need an HTTP connection (for example java.net.HttpURLConnection) and an HTML parser to accomplish that task.

a. Use the HTTP connection to connect to the background web sites, post the search terms and get back an HTML-format result.

b. Parse the answer out of that result using a parser.

Hope it helps.
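
The two steps above might look roughly like this using only the JDK's java.net.HttpURLConnection. This is a minimal sketch, not a finished client: the class name ShopSearch, the shop address shop.example.com and the "q" parameter name are made up for illustration, and a real shop will have its own search URL (and may expect a POST). The regular expression in extractTitle is only a stand-in for step b; a proper HTML parser would be needed for real pages.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShopSearch {

    // Builds the search URL. The "q" parameter name is an assumption;
    // check the shop's own search form for the real one.
    public static String buildSearchUrl(String baseUrl, String term) {
        try {
            return baseUrl + "?q=" + URLEncoder.encode(term, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Step a: send a GET request and return the raw HTML of the result page.
    public static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return html.toString();
    }

    // Step b (crude stand-in for a parser): pull the <title> text out of the page.
    public static String extractTitle(String html) {
        Matcher m = Pattern.compile("<title>(.*?)</title>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL).matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // Only builds and prints the URL; call fetch(url) to actually hit a site.
        String url = buildSearchUrl("http://shop.example.com/search", "U2 Boy");
        System.out.println(url);
    }
}
```

Keeping the URL building, the fetching and the parsing in separate methods means the parsing can be tried out on a saved copy of a result page before any live site is contacted.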

William Brogden
Author and all-around good cowpoke

Joined: Mar 22, 2000
Posts: 13037
The HttpClient toolkit (from the Apache Software Foundation's Commons project) is convenient for simulating a browser connection to a website.
Brian Mulvany

Joined: Oct 19, 2004
Posts: 28
Thanks for that
But I went to that site and to be honest I found it all a bit daunting. I was wondering if there was anywhere I could go to learn the basics of HttpClient. At least I know what I have to do in principle, but I don't know how to put it into practice. As I said previously, I want a user to be able to type in a CD they want and have the results returned unformatted from a website (for example). So on my page I want the user to see the exact same information they would see if they went to cdwow and searched for it on their site. I will worry about extracting the exact information I want when I get this first bit working, and that is where the parser comes in as far as I am aware.
Brian Mulvany

Joined: Oct 19, 2004
Posts: 28
I have managed to create several Java classes that search for an album and return the results as XML, but only within the Java program. I want the user to be able to use a JSP web page with a form to search for whatever CD they want. When they type in U2, for example, I want U2 to be passed into the Java program. I would also like the XML search results to be written to a brand new XML or HTML file.
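
One way this could be wired up is sketched below, assuming the existing search classes can be reached through a single method that takes the search term and returns XML. The names SearchToFile, searchAsXml and saveResults, and the form field name "cd", are hypothetical; searchAsXml here is only a placeholder for the real search classes. In a JSP or servlet the term would typically be read with request.getParameter("cd") and handed to the same method.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SearchToFile {

    // Placeholder for the existing search classes: takes the term typed
    // into the form and returns the results as an XML string.
    public static String searchAsXml(String term) {
        return "<results><query>" + escape(term) + "</query></results>";
    }

    // Escapes the characters that would otherwise break the XML.
    public static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Runs the search and writes the XML into the given file.
    public static Path saveResults(String term, Path target) throws IOException {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + searchAsXml(term);
        return Files.write(target, xml.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // From a JSP this value would come from request.getParameter("cd").
        String term = args.length > 0 ? args[0] : "U2";
        Path out = saveResults(term, Files.createTempFile("results", ".xml"));
        System.out.println("Wrote " + out);
    }
}
```

Because the search term arrives as an ordinary method argument, the same class can be called from a command line while testing and from the JSP page later.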
Brian Mulvany
Dharmanand Singh

Joined: Oct 27, 2004
Posts: 13
All I could make out from your question is that you want to search a site, and that you probably have all the HTML and JSP files that live there. There are two approaches you can take to search the documents of a site. The first is to index all the display data of the HTML and JSP files on the site; the second is to index the data after crawling the whole site.

I can give you some idea of the first method. You will need to parse the HTML and JSP files and extract the display information from them with some logic. Once you have the display data, you need to analyse it, index the content and store the index. You can then search efficiently on these stored indices, updating them whenever the site changes.

There are many libraries that will help you analyse and index the content and search it. I have used one of them: Lucene. You can download a Lucene search web-application example from: Download. You can read about the details of this example here.

Other libraries also include a crawler that indexes the whole site by following its links (the second approach above). I haven't used any of them personally, but I know of one, Regain, whose documentation is not in English. I am sure we can find more and try them out, e.g. htdig, which probably runs on UNIX flavors.
[ December 02, 2004: Message edited by: Dharmanand Singh ]
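
To make the extract, index and search steps concrete, here is a toy inverted index in plain Java. It is not Lucene and does not use Lucene's API; the class name TinyIndex and the file names are invented for the example, and the tag stripping is deliberately crude. A library such as Lucene adds proper analysis, ranking and on-disk storage on top of the same basic idea.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TinyIndex {

    // word -> names of the documents that contain it
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Extracts the display text by dropping anything that looks like a tag.
    public static String displayText(String html) {
        return html.replaceAll("(?s)<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    // Analyses a document (lower-cases and splits into words) and indexes it.
    public void add(String docName, String html) {
        for (String word : displayText(html).toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Looks a single word up in the index.
    public Set<String> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("u2.html", "<h1>U2</h1><p>The Joshua Tree</p>");
        index.add("rem.html", "<h1>R.E.M.</h1><p>Out of Time</p>");
        System.out.println(index.search("joshua")); // prints [u2.html]
    }
}
```

Re-running add for a page whenever it changes corresponds to the "keep updating the indices" step, although this sketch never removes stale words for a page that has been edited.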