
need the HTML source code..

harish raghavan

Joined: Jan 18, 2006
Posts: 8
Hello people... I am developing a page re-ranking algorithm that sorts Google's results for a user query. For that I need to grab the HTML source code of the Google search result page.
Can you please send me a program that grabs the HTML source of a Google search result page and saves it to a file? Please help me.

Code in Binaries...
Bimal Patel
Ranch Hand

Joined: Aug 29, 2003
Posts: 130

You mean you want to grab the source of Google's search engine? Please specify!

Work Hard, Expect The Worst...
Bimal R. Patel
(SCJP 1.2, SCWCD 1.4)
Svend Rost
Ranch Hand

Joined: Oct 23, 2002
Posts: 904

If you're looking for a programmer, you can find one here:

If, however, you have a specific problem, we'll gladly help. Please specify
your problem (e.g. "I'm not sure how to write data to a file, this code
doesn't work").

/Svend Rost
harish raghavan

Joined: Jan 18, 2006
Posts: 8
Yeah... say, for example:


This is the Google result for "computer network". I need the HTML source of this page stored in a file... please help me.
Christophe Verré

Joined: Nov 24, 2005
Posts: 14688

I'd like to try that:

[My Blog]
All roads lead to JavaRanch
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
Just write a very basic web browser that fetches, but doesn't display the page. Then save it.

Or you can just right click on the page and hit view source.
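
For what it's worth, the "fetch but don't display, then save" idea could look roughly like the sketch below. It is only a minimal illustration using the standard library: the class name PageSaver, the save method and the 10-second timeouts are made up for this example, and the URL and output file are whatever you pass on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not taken from any post in this thread.
public class PageSaver {

    /** Reads whatever the URL returns and writes the raw bytes to target. */
    public static long save(URL url, Path target) throws IOException {
        URLConnection conn = url.openConnection();
        // Assumed timeouts (milliseconds) so a dead server does not hang the program.
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            // Overwrites the target file if it already exists; returns bytes written.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PageSaver <url> <output-file>");
            return;
        }
        long bytes = save(URI.create(args[0]).toURL(), Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

The bytes are copied untouched, so the saved file has the same encoding the server sent. Some servers answer differently (or refuse) depending on request headers, which this sketch does not set.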

"Computer science is no more about computers than astronomy is about telescopes" - Edsger Dijkstra
Tom Sullivan
Ranch Hand

Joined: Dec 20, 2005
Posts: 72
You might take a look at HTMLParser on SourceForge.

Ilja Preuss

Joined: Jul 11, 2001
Posts: 14112

From http://www.google.com/terms_of_service.html

You may not send automated queries of any sort to Google's system without express permission in advance from Google.

The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
harish raghavan

Joined: Jan 18, 2006
Posts: 8
oh god..thanks for helping me..really thanks..
Tom Sullivan
Ranch Hand

Joined: Dec 20, 2005
Posts: 72
I know how you must feel Harish...

The last time I had to parse HTML pages I did it with HTMLParser. My program was pretty simple in that I only needed to grab values between <div> and <p> tags. HTMLParser allowed me to treat the HTML pages like an XML DOM. While what I was doing was pretty simple, the API has a number of methods that I think will solve your problem (at least as far as getting the HTML info goes, which you can then write to a file with standard I/O).
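
To give a rough idea of what "grabbing values between tags" means, here is a small standard-library sketch. Note that this is not HTMLParser's API: it swaps in a plain regular expression, and the class name TagText and method textBetween are invented for the example. It breaks on nested tags of the same name (a <div> inside a <div>), which is exactly where a real parser earns its keep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a regex stand-in, not the HTMLParser library.
public class TagText {

    /** Returns the trimmed text between <tag ...> and </tag>, in document order. */
    public static List<String> textBetween(String html, String tag) {
        // \b keeps "p" from matching "<pre>"; DOTALL lets content span lines.
        Pattern p = Pattern.compile(
                "<" + Pattern.quote(tag) + "\\b[^>]*>(.*?)</" + Pattern.quote(tag) + "\\s*>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(html);
        while (m.find()) {
            // Strip any markup nested inside so only the text content remains.
            out.add(m.group(1).replaceAll("<[^>]*>", "").trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<div class=\"hit\"><p>First <b>result</b></p><p>Second result</p></div>";
        System.out.println(textBetween(html, "p")); // prints [First result, Second result]
    }
}
```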