View complete page source

 
omkar wadkar
Greenhorn
Posts: 6
Folks, I want to write an application that returns the complete page source for a given URL,
including URLs that carry search parameters (e.g. http://www.nextag.com/camera/search-html).
I tried using the URL class, but it still doesn't give me the complete page source.

I actually want to do page scraping, but I can't get the complete page source.
 
Peter Taucher
Ranch Hand
Posts: 174
omkar wadkar wrote: I actually want to perform page scraping but am not able to get the complete page source.

First I'd like to say ItDoesntWorkIsUseless and TellTheDetails.

A page rendered by a browser may be assembled from more than one source (for example, separate requests for frames, scripts, and content that JavaScript loads later). Also, a web server may detect requests that don't come from a real browser and deny them (to discourage page scraping). So in my opinion it's often quite complex. What do you need this for? Which pages are concerned, and who's the copyright holder?
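
To make the point concrete, here is a minimal sketch of a plain fetch with the JDK's HttpURLConnection that sends a browser-like User-Agent header. The class name PageFetcher, the User-Agent string, the 10-second timeouts, and the assumption that the page is UTF-8 are all made up for illustration and are not from the original question. It only returns the HTML the server sends for that one request, so anything a browser adds later via JavaScript or extra requests will still be missing, and a server may still refuse the request.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    // Reads the whole stream into a String using the given charset.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    // Fetches the raw HTML the server returns for a single URL.
    public static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        // Hypothetical User-Agent value; some servers reject the JDK default.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (compatible; ExampleFetcher/1.0)");
        conn.setConnectTimeout(10_000); // illustrative timeouts, in milliseconds
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            // Assumption: the page is UTF-8; a real scraper should honour the
            // charset declared in the Content-Type response header.
            return readAll(in, StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageFetcher <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

If the output of something like this differs from what the browser's "view source" shows, comparing the two is a good way to find out which parts of the page come from additional requests.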
 
Ulf Dittmer
Rancher
Posts: 42967
I recommend using a library like jWebUnit for programmatic web access. It can download the page and provides various APIs for getting at its content.
 