| Author |
screen scrape
|
dale con
Ranch Hand
Joined: Apr 15, 2005
Posts: 93
|
|
Hi all, can anyone give me an example of screen scraping a website and returning the result (e.g. the HTML as a String), or point me to some tutorials or examples? Many thanks.
|
 |
marc weber
Sheriff
Joined: Aug 31, 2004
Posts: 11343
|
|
|
Like TESS?
|
"We're kind of on the level of crossword puzzle writers... And no one ever goes to them and gives them an award." ~Joe Strummer
sscce.org
|
 |
Tom Blough
Ranch Hand
Joined: Jul 31, 2003
Posts: 263
|
|
Dale, an old one I did a while ago that screen-scraped the USPS ZIP+4 info is located at http://www.mycgiserver.com/~tblough/screenscrape.htm. It looks like it doesn't work any more because USPS changed its website from CGI to JSP-based pages, but the idea still applies. The source is available from a link on the page. Cheers,
|
Tom Blough
Cum catapultae proscriptae erunt tum soli proscripti catapultas habebunt.
|
 |
Jesper de Jong
Java Cowboy
Bartender
Joined: Aug 16, 2005
Posts: 12911
|
|
Reading the content of a web page is simple enough with the class java.net.URL. After you've done that, you'll have to locate the data you want in the HTML. You could do it the simple way, with String.indexOf() for example, but that may not be flexible enough. You could try regular expressions, or you could use an HTML parser to walk through the structure of the HTML and find the text you're looking for. Something like http://htmlparser.sourceforge.net/ might be useful for that.
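To make that concrete, here is a minimal sketch of the two steps: reading a page into a String through java.net.URL, then pulling text out of it with String.indexOf() and with a regular expression. The class name ScreenScrape and the helpers fetch, between and title are made up for illustration, and the code assumes the page is UTF-8, which a real scraper should not take for granted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; none of these names come from a library.
class ScreenScrape {

    // Reads everything the address points at and returns it as one String.
    // Assumption: the content is UTF-8. Line endings are normalized to '\n'.
    static String fetch(String address) throws IOException {
        URL url = URI.create(address).toURL();
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // The simple way: returns the text between the first occurrence of
    // 'start' and the next occurrence of 'end', or null if either is missing.
    static String between(String html, String start, String end) {
        int from = html.indexOf(start);
        if (from < 0) {
            return null;
        }
        from += start.length();
        int to = html.indexOf(end, from);
        return to < 0 ? null : html.substring(from, to);
    }

    // The regular-expression way, shown here for the page title only.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed contents of the first <title> element, or null.
    static String title(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java ScreenScrape <url>");
            return;
        }
        String html = fetch(args[0]);
        System.out.println("Title: " + title(html));
        System.out.println("First <b> text: " + between(html, "<b>", "</b>"));
    }
}
```

Both extraction helpers break as soon as the markup changes shape, which is the flexibility problem mentioned above. If the page is at all complicated, an HTML parser is probably the more robust choice.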
|
|
 |
dale con
Ranch Hand
Joined: Apr 15, 2005
Posts: 93
|
|
Cheers, guys, for all your help; much appreciated. I know this is a relatively old technique, but finding material on it is quite difficult.
|
 |