Reading web page from servlet

William EGreen
Greenhorn

Joined: Mar 18, 2002
Posts: 2
How do I get the HTML text of a given web page from a servlet? I need to do some data mining. Note that the page in question may require a cookie; I have access to the cookie and can send it to the servlet.
Thanks,
Bill Green
Jessica Sant
Sheriff

Joined: Oct 17, 2001
Posts: 4313

You could use a Java program that accesses the website, makes a request, and writes the response to a file (thus saving the resulting HTML code).
You might be able to adapt the code from HttpUnit to do just that. It's meant to be a website unit-testing suite, but you could use it to store the data in the page rather than validating it.
It's an open source project available here:
http://httpunit.sourceforge.net/
Hope that helps.


- Jess
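
As a rough illustration of the "make a request and write the response to a file" idea, here is a minimal sketch that uses only the standard library rather than HttpUnit. The class name PageSaver and the method savePage are hypothetical names chosen for this example, and the sketch assumes the page can be fetched without any cookie or login.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {

    /**
     * Fetches the resource at pageUrl and writes its raw bytes to target,
     * overwriting any existing file. Returns the number of bytes written.
     * (Hypothetical helper; assumes no cookie or authentication is needed.)
     */
    public static long savePage(String pageUrl, Path target) throws IOException {
        try (InputStream in = URI.create(pageUrl).toURL().openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects two arguments: the page URL and the output file name.
        long bytes = savePage(args[0], Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

The bytes are copied as-is, so the saved file keeps whatever character encoding the server sent.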
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60739
    

Check out URLConnection.
hth,
bear
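
For anyone wondering what the URLConnection route might look like with the cookie from the original question, here is a minimal sketch. PageFetcher and fetchHtml are hypothetical names, the cookie is assumed to be a ready-made "name=value" string, and the page is assumed to be UTF-8; a real page may declare a different charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    /**
     * Returns the text of the page at pageUrl. If cookie is non-empty it is
     * sent as the Cookie request header (assumed format: "name=value").
     * Hypothetical helper; assumes the response body is UTF-8 text.
     */
    public static String fetchHtml(String pageUrl, String cookie) throws IOException {
        URLConnection conn = URI.create(pageUrl).toURL().openConnection();
        if (cookie != null && !cookie.isEmpty()) {
            // Request headers must be set before the connection is opened.
            conn.setRequestProperty("Cookie", cookie);
        }
        conn.setConnectTimeout(10_000); // milliseconds
        conn.setReadTimeout(10_000);

        StringBuilder html = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                html.append(line).append('\n'); // line endings normalized to \n
            }
        }
        return html.toString();
    }
}
```

Inside a servlet, the cookie string could be taken from the incoming request, for example with request.getHeader("Cookie"), and passed to fetchHtml; whether the target site accepts a forwarded cookie depends on that site.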


Kripal Singh
Ranch Hand

Joined: Jul 26, 2001
Posts: 254
Try using the following code

