
need the HTML source code..

harish raghavan
Greenhorn

Joined: Jan 18, 2006
Posts: 8
Hello people. I am currently developing a page re-ranking algorithm that re-sorts Google's results for a user query. For that I need to grab the HTML source of a Google search result page.
Could you please send me a program that grabs the HTML source of a Google search result page and saves it to a file? Please help me.


Code in Binaries...
Bimal Patel
Ranch Hand

Joined: Aug 29, 2003
Posts: 130
Hi,

Do you mean you want to grab the source of Google's search engine? Please specify!


Work Hard, Expect The Worst...
Bimal R. Patel
(SCJP 1.2, SCWCD 1.4)
Svend Rost
Ranch Hand

Joined: Oct 23, 2002
Posts: 904
Hi..

If you're looking for a programmer, you can find one here:
www.rentacoder.com

If, however, you have a specific problem, we'll gladly help. Please describe
your problem (e.g. "I'm not sure how to write data to a file, this code
doesn't work").

/Svend Rost
harish raghavan
Greenhorn

Joined: Jan 18, 2006
Posts: 8
Yeah, say for example:

http://www.google.co.in/search?hl=en&q=computer+networks&meta=

This is the Google result page for "computer networks". I need the HTML source of this page stored in a file. Please help me.
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14687
    

I'd try this:
http://www.google.com/apis/


[My Blog]
All roads lead to JavaRanch
Rusty Shackleford
Ranch Hand

Joined: Jan 03, 2006
Posts: 490
Just write a very basic web browser that fetches, but doesn't display the page. Then save it.

Or you can just right click on the page and hit view source.
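
A minimal sketch of the "fetch but don't display, then save" idea, using only the JDK. The class name PageSaver, the method savePage, and the default URL and file name are placeholders made up for this example; point it at a page you are allowed to fetch automatically.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {

    /**
     * Copies the raw bytes served at pageUrl into target,
     * replacing the file if it already exists.
     * Returns the number of bytes written.
     */
    public static long savePage(String pageUrl, Path target) throws IOException {
        // openStream() issues a plain GET and hands back the response body.
        try (InputStream in = URI.create(pageUrl).toURL().openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical defaults, used only when no arguments are given.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        Path out = Paths.get(args.length > 1 ? args[1] : "page.html");

        long bytes = savePage(url, out);
        System.out.println("Saved " + bytes + " bytes to " + out);
    }
}
```

The bytes are copied as-is, so the saved file keeps whatever character encoding the server sent. Some sites reject requests that lack browser-like headers; in that case a URLConnection with a User-Agent set would be the next thing to try.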


"Computer science is no more about computers than astronomy is about telescopes" - Edsger Dijkstra
Tom Sullivan
Ranch Hand

Joined: Dec 20, 2005
Posts: 72
You might take a look at Sourceforge HTMLParser.

HTMLParser
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Important!

From http://www.google.com/terms_of_service.html


You may not send automated queries of any sort to Google's system without express permission in advance from Google.


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
harish raghavan
Greenhorn

Joined: Jan 18, 2006
Posts: 8
oh god..thanks for helping me..really thanks..
Tom Sullivan
Ranch Hand

Joined: Dec 20, 2005
Posts: 72
I know how you must feel Harish...

The last time I had to parse HTML pages I did it with HTMLParser. My program was pretty simple in that I only needed to grab values between <div> and <p> tags. HTMLParser allowed me to treat the HTML pages like an XML DOM. While what I was doing was pretty simple, the API has a number of methods that I think will solve your problem (at least as far as getting the HTML info goes, which you can then write to a file with standard IO).
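
To give a rough idea of that kind of job, here is a sketch that pulls the text between <div> and <p> tags and writes it out with standard IO. Note that this is NOT HTMLParser: it swaps in a JDK regular expression so it runs without extra jars, and it will not cope with nested tags or malformed markup the way a real parser does. The class name TagTextExtractor and the file names are made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagTextExtractor {

    // Matches <div ...>...</div> or <p ...>...</p>, case-insensitively,
    // across line breaks. A regex stand-in, not a real HTML parser.
    private static final Pattern BLOCK = Pattern.compile(
            "<(div|p)\\b[^>]*>(.*?)</\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed content of each div/p element, in document order. */
    public static List<String> extract(String html) {
        List<String> values = new ArrayList<>();
        Matcher m = BLOCK.matcher(html);
        while (m.find()) {
            values.add(m.group(2).trim());
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical defaults: read a saved page, write one value per line.
        Path in = Paths.get(args.length > 0 ? args[0] : "page.html");
        Path out = Paths.get(args.length > 1 ? args[1] : "values.txt");

        String html = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
        Files.write(out, extract(html), StandardCharsets.UTF_8);
    }
}
```

For anything beyond a quick one-off, a proper parser such as HTMLParser is the safer choice, since real pages rarely have markup this tidy.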
 
 