emulating a synchronous html request

Fred Scott
Greenhorn

Joined: Aug 20, 2011
Posts: 2
Hello, I found this forum entry:
http://www.coderanch.com/t/517587/oa/Web-Crawler-Java

and looked over this link:
http://java-source.net/open-source/html-parsers

I am looking for something unusual. I want to automatically parse the HTML (and possibly take a screenshot) of some websites that seem to be thwarting asynchronous HTTP requests.

Some sites are built so that if you use one of the asynchronous methods shown in many programming examples, you always get back a page that doesn't match what is currently displayed in a browser. It is as if they are "masking" their page from automated parsing.

(If you call GetURL, LoadStrings, or any of the other methods that I have seen, you make an asynchronous request. The sites in question thwart it by returning content you don't want.)

Is it possible to emulate a synchronous HTTP request and still get the benefits of HTML parsing and/or screenshots? Thanks.
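
For context, here is a minimal sketch of what a blocking (synchronous) request could look like in plain Java, using only the standard library's HttpURLConnection. The class name SyncFetch, the helper names, the User-Agent value, and the example.com address are all assumptions made for illustration; nothing here is confirmed to change what the sites in question return, and the sketch assumes the page is UTF-8 encoded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class SyncFetch {

    // Hypothetical browser-like header value (an assumption, not from this thread).
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0";

    // Reads the whole stream into a String, assuming UTF-8 content.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // Blocks the calling thread until the full response body has been read.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setConnectTimeout(10000); // milliseconds
        conn.setReadTimeout(10000);    // milliseconds
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass a real URL as the first argument.
        String address = args.length > 0 ? args[0] : "http://example.com/";
        System.out.println(fetch(address));
    }
}
```

The returned string could then be handed to any of the HTML parsers from the list linked above. This sketch only retrieves the raw HTML; it does not run JavaScript or produce screenshots.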

Fred Scott
Hi,

I hope I'm not bumping this thread too soon, but does anyone have any thoughts about my question?

Thanks,
 