I am thinking about making a Java application that will need to interface with a web application and submit some information to a website, let the website work on it, and then retrieve the result. I have done some AutoIt3 programming in the past, but it was very basic (pretty much just point and click, checking that windows exist, some if statements, etc.). I just recently started learning Java. I don't have access to an API for the website and honestly, I don't think they even have one.
My goal is basically to make an application that will connect to a website, submit some information regarding health insurance (age, gender, zip code, (non)smoker), and then obtain the rates for the different rate plans once the site finishes quoting. Afterwards, I plan on using the jXLS API to fill the information into the appropriate cells in an XLS file so it is ready to print and present to the client.
Is there an elegant way to make this work?
My only idea so far is to use Java to create a GUI where a user can enter all the relevant information (age, gender, zip code, (non)smoker), and write that information to a file. Then have AutoIt3 open a browser, navigate to the website, log in, read the information from the file Java created, point and click until the quote is done, and finally write the resulting rates to another file. Afterwards, Java can read the rates from the file AutoIt3 created and use that information to write into the XLS file. Like I said... Is there a more elegant way of doing this?
Also, please do note that I am new to Java, so I might have follow-up questions on possible solutions.
When you enter the data in the browser and press submit, all the browser does is take the relevant form fields and send them to the form's action URL, either through GET or through POST (the form decides which). You can copy this behaviour by simply sending the same HTTP request yourself. Because of things like cookies, I would try HttpClient for this. You will need to figure out which fields are present and what their values should be. After that you can send the HTTP request, get back the result, and extract the data from it. Since the result is most likely an HTML page, you will need to parse it to get the exact data you want.
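To make the mechanics concrete, here is a minimal sketch of that idea using only the JDK's HttpURLConnection (HttpClient would handle cookies, redirects, etc. for you, but the request itself looks the same). The URL and field names below are made up for illustration; you would replace them with whatever the real quote form actually uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

class QuoteFormPoster {

    // Build an application/x-www-form-urlencoded body from the form fields,
    // exactly like the browser does when you press submit.
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // POST the encoded body to the form's action URL and return the response HTML.
    static String post(String actionUrl, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(actionUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

The HTML string that comes back would then need to be parsed to pull out the rate table.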
Just one warning: check whether the website explicitly disallows this. Some sites, like IMDB.com, do, and they may block you from accessing the site if you continue anyway.
Edit: HtmlUnit can probably be used as well instead of HttpClient.
The other thing I would suggest is to take a step back and look at your solution a little more abstractly. While a file-based "interface" between the UI and the web site, and an "interface" which performs screen scraping on the web site, may be very workable, you may run into issues in the future and be too tightly coupled for changes.
Think in terms of what data you need to capture as input to the site, and define a method in an interface which accepts those parameters for "storage". The implementation of that interface may write to a file, a database, a JMS queue, etc., but you'll be free to swap implementations. I may be taking a large leap on you as someone new to Java, so I will also suggest you look at something like Spring, which among other things provides a framework for defining which implementation your application uses, so it may be swapped at runtime via a config file update (not the only way to accomplish this, just a quick example).
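As a rough sketch of that idea (all names here are mine, purely illustrative): the GUI depends only on a small interface, and the file/database/JMS details live in an implementation class that can be swapped later without touching the GUI.

```java
import java.util.ArrayList;
import java.util.List;

// The data the UI captures; a record keeps this to one line (Java 16+).
record QuoteRequest(int age, String gender, String zip, boolean smoker) {}

// The "storage" contract the UI codes against.
interface QuoteRequestStore {
    void save(QuoteRequest request);
}

// One possible implementation; a file-, database-, or JMS-backed version
// could replace it without the UI changing at all.
class InMemoryQuoteRequestStore implements QuoteRequestStore {
    private final List<QuoteRequest> saved = new ArrayList<>();

    @Override
    public void save(QuoteRequest request) {
        saved.add(request);
    }

    List<QuoteRequest> saved() {
        return saved;
    }
}
```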
On the other side, you'll want an interface definition for getting data from the website. To your point, they may or may not have an API right now, but if they add one in the future, the approach to getting the data can be changed by creating a new implementation class. Even if you stick with a screen-scraping approach, there are different tools you may use for that, and you may want the flexibility to swap those tools in and out if you find shortcomings.
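The retrieval side can be sketched the same way (again, every name here is illustrative): the rest of the application asks a fetcher for rates and never knows whether screen scraping, HtmlUnit, or a future API sits behind it. A canned implementation is also handy for testing the spreadsheet code without touching the website.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Contract for getting quote data out of the website.
interface RateFetcher {
    // Returns plan name -> monthly premium.
    Map<String, Double> fetchRates(int age, String gender, String zip, boolean smoker);
}

// A canned implementation, standing in for the screen-scraping
// (or eventual API-based) one.
class CannedRateFetcher implements RateFetcher {
    @Override
    public Map<String, Double> fetchRates(int age, String gender, String zip, boolean smoker) {
        Map<String, Double> rates = new LinkedHashMap<>();
        rates.put("Basic Plan", 180.0);
        rates.put("Plus Plan", 240.0);
        return rates;
    }
}
```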
Brian Burress wrote:... I may be taking a large leap on you as someone new to java...
Yep... You lost me after the first paragraph. But that's OK! You took the time to make a suggestion, and even though I don't understand much of it, it is giving me an idea of what sorts of things are out there, and that's awesome! I have no illusions that programming is going to be easy, so I greatly appreciate your input!
I haven't gotten to reading/learning about implementation, etc. All I know is basic OOP concepts, ArrayList, the do, if, for, while, and switch statements, and a tiny bit about exception handling.
Tim Moores wrote:I'd use the http://htmlunit.sourceforge.net/ library for this. Check the links in the "How do I..." section on the left to get some idea of how it works, and what it can do.
Thanks! I will definitely check that out!
Rob Spoor wrote:You can copy this behavior by simply sending HTTP requests yourself.
Thank you! I was thinking of something similar too, but I didn't know how to deal with the authentication required to access the website. I will read up on HttpClient and see if I can figure out how to make it work. That is part of the Java API, correct?
Thanks again to all for their contribution. You are awesome!
HttpClient is a library from Apache. Check http://hc.apache.org/ for the latest version; I just noticed it's now part of HttpComponents, but it should essentially do the same thing.
Java SE itself comes with HttpURLConnection and CookieHandler / CookieManager, but you have to do a bit more work to get it all working together.
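For example, installing a CookieManager as the JVM-wide default handler is the piece that makes HttpURLConnection remember a login session between requests, which is roughly the extra wiring meant here:

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;

class CookieSetup {

    // After this call, every HttpURLConnection in the JVM automatically
    // stores cookies from responses and sends them back on later requests,
    // keeping the login session alive across the login and quote pages.
    static CookieManager install() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }
}
```

A library like HttpClient bundles this kind of session handling for you, which is why it was suggested first.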