
View complete page source

omkar wadkar
Greenhorn

Joined: Jun 13, 2010
Posts: 6
Folks, I want to write an application that gives me the complete page source for a given URL,
including URLs that carry search parameters (e.g. http://www.nextag.com/camera/search-html).
I tried using the URL class, but it still does not give me the complete page source.

I actually want to do page scraping, but I am not able to get the complete page source.
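
The code from the original attempt was not posted, so the following is only an assumed sketch of what a fetch with the URL class usually looks like. The class and method names (PageSource, readAll, fetch) are made up for illustration, and UTF-8 is an assumption; the real charset would come from the Content-Type header. One common reason for a truncated result is calling read() once instead of looping until end-of-file, which is what readAll does here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not taken from the original post.
class PageSource {

    // Reads the stream until end-of-file. A single read() call may return
    // only part of the response, so the loop is required.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    // Downloads the raw HTML the server sends for one request.
    // Assumption: the page is encoded as UTF-8.
    static String fetch(String address) throws IOException {
        URL url = URI.create(address).toURL();
        try (InputStream in = url.openStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java PageSource <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Note that this returns only what the server sends for that single request. Anything a browser adds afterwards, for example through scripts, will not be in the returned string.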
Peter Taucher
Ranch Hand

Joined: Nov 18, 2006
Posts: 174
omkar wadkar wrote:I actually want to do page scraping, but I am not able to get the complete page source.

First I'd like to say ItDoesntWorkIsUseless and TellTheDetails.

A page rendered by a browser may consist of more than one source. Also, a web server may detect real browser requests and deny all others (to discourage page scraping). So in my opinion it is often quite complex. What do you need this for? Which pages are concerned, and who is the copyright holder?
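
To make both points concrete, here is a rough sketch under several assumptions. The class BrowserLikeFetch, its method names, and the User-Agent string are all hypothetical. The regular expression is deliberately crude: it only looks at quoted src attributes of frame, iframe, and script tags, and it is no substitute for a real HTML parser. The fetch method requires Java 9 or later because of readAllBytes, and assumes the page is UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class BrowserLikeFetch {

    // Rough pattern: quoted src attribute inside a frame, iframe, or script tag.
    private static final Pattern SRC = Pattern.compile(
        "<(?:frame|iframe|script)\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
        Pattern.CASE_INSENSITIVE);

    // Lists the additional sources a browser would request for this HTML.
    static List<String> extraSources(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    // Fetches one http/https address while sending a User-Agent header.
    // The cast fails for non-HTTP URLs. Assumption: the page is UTF-8.
    static String fetch(String address, String userAgent) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java BrowserLikeFetch <url>");
            return;
        }
        // Made-up User-Agent value; a server may still refuse the request.
        String html = fetch(args[0], "Mozilla/5.0 (compatible; ExampleFetcher/1.0)");
        for (String src : extraSources(html)) {
            System.out.println(src);
        }
    }
}
```

Each address printed is a separate request the browser would make, which is why the output of a single fetch can look incomplete compared with the page in the browser. Whether fetching them is permitted still depends on the site and its copyright holder.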


Censorship is the younger of two shameful sisters, the older one bears the name inquisition.
-- Johann Nepomuk Nestroy
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42950
    
I recommend using a library like jWebUnit for programmatic web access. It can download the page and provides various APIs to get at its content.
 
With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
 