GeeCON Prague 2014*



View complete page source

omkar wadkar
Greenhorn

Joined: Jun 13, 2010
Posts: 6
Folks, I want to write an application that gives me the complete page source for a given URL,
including URLs that carry a search parameter (e.g. http://www.nextag.com/camera/search-html).
I tried using the URL class, but it still does not give me the complete page source.

I actually want to perform page scraping, but I am not able to get the complete page source.
Peter Taucher
Ranch Hand

Joined: Nov 18, 2006
Posts: 174
omkar wadkar wrote:I actually want to perform page scraping, but I am not able to get the complete page source.

First I'd like to say ItDoesntWorkIsUseless and TellTheDetails.

A page rendered by a browser may be assembled from more than one source. A web server may also detect requests from real browsers and deny all others (to discourage page scraping). So in my opinion this is often very complex. What do you need this for? Which pages are concerned, and who is the copyright holder?
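
To illustrate both points, here is a minimal sketch using only the JDK. The class name PageSourceFetcher, the User-Agent string, the timeouts and the UTF-8 charset are all assumptions made for the example, not something taken from the original code (which was not posted). It returns only the HTML the server sends for that one request. Anything a browser adds later through JavaScript, or loads from other sources, will not be in the result, and a server is still free to reject the request.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageSourceFetcher {

    // Reads the whole stream into a String using the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Fetches the raw response body for one URL, sending the given User-Agent.
    // Assumption: the page is UTF-8; a real client would check Content-Type.
    static String fetch(String address, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(10_000); // milliseconds, arbitrary choice
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // The default address is the example URL from the question.
        String address = args.length > 0 ? args[0] : "http://www.nextag.com/camera/search-html";
        // Hypothetical User-Agent value, chosen only for illustration.
        System.out.println(fetch(address, "Mozilla/5.0 (compatible; ExampleFetcher/1.0)"));
    }
}
```

If the output differs from what the browser's "view source" shows, comparing the request headers the browser sends is usually the first thing to check.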


Censorship is the younger of two shameful sisters, the older one bears the name inquisition.
-- Johann Nepomuk Nestroy
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42039
    
I recommend using a library like jWebUnit for programmatic web access. It can download the page and provides various APIs for getting at its content.


Ping & DNS - my free Android networking tools app
 