
emulating a synchronous html request

 
Fred Scott
Greenhorn
Posts: 2
Hello, I found this forum entry:
http://www.coderanch.com/t/517587/oa/Web-Crawler-Java

and looked over this link:
http://java-source.net/open-source/html-parsers

I am looking for something unusual. I want to automatically parse the HTML (and possibly take a screenshot) of some websites that seem to thwart asynchronous HTTP requests.

Some sites are built so that if you use one of the many asynchronous methods shown in programming examples, you always get back a page that doesn't match what a browser currently shows. It is as if they are "masking" their page from automated parsing.

(If you call GetURL, LoadStrings, or any of the other methods that I have seen, you make an asynchronous request. The sites in question thwart it by returning content you don't want.)

Is it possible to emulate a synchronous HTTP request and still get the benefits of HTML parsing and/or screenshots? Thanks.
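
To make the question concrete, here is a minimal sketch of a blocking (synchronous) request, assuming Java 11 or later and its built-in java.net.http.HttpClient. The class name BlockingFetch, the method names, and the User-Agent value are hypothetical and chosen only for illustration. Sending a browser-like User-Agent is an assumption about why a site might return different content; it is not confirmed for the sites in question, and it will not help if the page is assembled by JavaScript after loading.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class BlockingFetch {
    // Hypothetical browser-like User-Agent (an assumption for illustration only).
    static final String USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleFetcher/1.0";

    // Builds a GET request that carries the browser-like header and a timeout.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .timeout(Duration.ofSeconds(20))
                .GET()
                .build();
    }

    // Sends the request and blocks the calling thread until the full body has arrived.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // With a URL argument: perform the blocking fetch and print the HTML.
            System.out.println(fetch(args[0]));
        } else {
            // Without arguments: only show the request that would be sent (no network access).
            HttpRequest request = buildRequest("http://example.com/");
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

The string returned by fetch could then be handed to any of the HTML parsers from the list linked above. A screenshot would need something that actually renders the page, which this sketch does not do.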

 
Fred Scott
Greenhorn
Posts: 2
Hi,

I hope I'm not bumping this thread too soon, but does anyone have any thoughts about my question?

Thanks,
 