Combining HtmlUnit and HttpClient

Matthew Busse
Ranch Hand

Joined: Sep 29, 2010
Posts: 52
Hello Java Geniuses,

I have a question about parsing HTML pages with Java. I want to use HtmlUnit to parse tables from a webpage that comes back as the response to a POST request. But I want to make that POST request using HttpClient, because making a POST request using HtmlUnit is a real pain.

Is it possible to somehow convert the HttpEntity that comes back from the HttpClient post request into an HtmlPage, maybe by going through an input stream?

Or does this question not even make sense?

I want to make a post request like this:


Then do something like this (I know this doesn't work, but is there some way to convert the HttpEntity into an HtmlPage?)


I need to automate a bunch of post requests, but I don't see an easy way to do that using the available HtmlUnit tools.
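For readers following along: the code from the original post did not survive, so here is a rough sketch of what automating a form-style POST can look like. This is not the poster's code, and it swaps in the JDK's built-in java.net.http client (Java 11+) instead of Apache HttpClient so that it needs no extra jars. The class name FormPost, the field names, and the example URL are all made up for illustration.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not from the thread: builds a form-encoded POST request.
class FormPost {

    /** Encodes name/value pairs as an application/x-www-form-urlencoded body. */
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds (but does not send) a POST request carrying the encoded form. */
    static HttpRequest buildPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
    }
}
```

Sending it would be something like HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), which hands back the response HTML as a String. Note that the field names still have to match what the site's form expects, so this does not remove the need to look at the form once.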

Is this possible, or do I just need to manually parse the html that comes back in the HttpEntity?

Thanks!
Tim Moores
Rancher

Joined: Sep 21, 2011
Posts: 2408
What kind of access do you find easier to do with HttpClient than with HttpUnit? I think it would be even less than 6 lines of code with HttpUnit.
Matthew Busse
Ranch Hand

Joined: Sep 29, 2010
Posts: 52
Well, maybe I just don't know how to use HtmlUnit. Is there a simple way to make a post request? From the examples I found online, it seems like I have to get the first page, then look through the html to find the name of the form I want to submit, as well as the names of the fields within that form, then set the fields to the values I want and then submit it back to the website. Something like this:



That seems much more awkward than making a POST request with HttpClient. On top of that, trying to run the above code gives me a NoClassDefFoundError pointing at WebClient, even after checking to make sure I have all the required dependencies...
Tim Moores
Rancher

Joined: Sep 21, 2011
Posts: 2408
Not quite sure I understand (if you use HttpClient, you also need to know the name of the form, and fill in the form parameters), but you could look into WebClient.getWebConnection() - that will generally return an HttpWebConnection, which is the glue between HttpClient and HtmlUnit. You may have to subclass HttpWebConnection in order to get at the HttpClient object, though.
Matthew Busse
Ranch Hand

Joined: Sep 29, 2010
Posts: 52
Thanks for your help. I wound up just digging in and parsing the html response by hand. It was a good learning experience.
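The thread does not show how the hand parsing was done, so the following is only a guess at the general shape: a minimal sketch that pulls table rows and cells out of an HTML string with regular expressions from the JDK. The class name TableScraper and the sample markup are invented for illustration. It assumes simple, well-formed tables with no nesting, and it does not decode entities such as &amp;amp;; for anything messier a real HTML parser is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread: naive regex-based table extraction.
class TableScraper {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    // \b keeps <tr from matching <track> and <th from matching <thead>.
    private static final Pattern ROW = Pattern.compile("<tr\\b[^>]*>(.*?)</tr>", FLAGS);
    private static final Pattern CELL = Pattern.compile("<t[dh]\\b[^>]*>(.*?)</t[dh]>", FLAGS);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Returns one list of trimmed cell texts per <tr> found in the HTML. */
    static List<List<String>> rows(String html) {
        List<List<String>> rows = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                // Strip any inline markup (e.g. <b>, <a>) left inside the cell.
                cells.add(TAG.matcher(cell.group(1)).replaceAll("").trim());
            }
            rows.add(cells);
        }
        return rows;
    }
}
```

Calling TableScraper.rows(html) on the response body gives a list of rows, each a list of cell strings, which is usually enough for pulling values out of a results table.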

Happy New Year!
 