
Reading web page from servlet

William EGreen
Greenhorn

Joined: Mar 18, 2002
Posts: 2
How do I get the HTML text of a given web page from a servlet? (I need to do some data mining.) Also note that the web page in question could require a cookie. I have access to the cookie and can send it to the servlet.
Thanks,
Bill Green
Jessica Sant
Sheriff

Joined: Oct 17, 2001
Posts: 4313

You could use a Java program that accesses the website, makes a request, and writes the response to a file (thus saving the resulting HTML).
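A minimal sketch of that idea, using only the standard library. The class name `PageSaver` and the method `save` are made up for illustration, and the sketch assumes the page can be fetched without a cookie or login:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of HttpUnit or any other library.
class PageSaver {

    // Downloads whatever pageUrl returns and writes the raw bytes to target,
    // replacing any existing file. Returns the number of bytes written.
    static long save(String pageUrl, Path target) throws IOException {
        try (InputStream in = new URL(pageUrl).openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Calling something like `PageSaver.save(someUrl, Paths.get("page.html"))` would leave the HTML on disk for later parsing. Because the bytes are copied as-is, no assumption is made about the page's character encoding.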
You might be able to adapt the code from HttpUnit to do just that. It's meant to be a unit-testing suite for web sites, but you could use it to store the data in the page rather than validating it.
It's an open source project available here:
http://httpunit.sourceforge.net/
Hope that helps.


- Jess
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61768
    

Check out URLConnection.
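For instance, a rough sketch of reading a page into a String with `java.net.URLConnection` and passing the cookie along as a `Cookie` request header. The class name `PageFetcher`, the method `fetch`, and the choice of UTF-8 are assumptions for illustration; a real page may declare a different encoding:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; call it from doGet/doPost in the servlet.
class PageFetcher {

    // Returns the body at pageUrl as text. cookieHeader is the raw
    // "name=value" pair (or several, separated by "; ") that the servlet
    // was given; pass null to send no cookie.
    static String fetch(String pageUrl, String cookieHeader) throws IOException {
        URLConnection conn = new URL(pageUrl).openConnection();
        if (cookieHeader != null) {
            // Must be set before the connection is opened.
            conn.setRequestProperty("Cookie", cookieHeader);
        }
        StringBuilder html = new StringBuilder();
        // Assumption: the page is UTF-8 encoded.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }
}
```

The servlet could then do its data mining on the returned String, e.g. `String html = PageFetcher.fetch(targetUrl, "JSESSIONID=" + value);`, where the cookie name here is only an example.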
hth,
bear


Kripal Singh
Ranch Hand

Joined: Jul 26, 2001
Posts: 254
Try using the following code:


 