Java program that interacts with the web

 
Greenhorn
Posts: 3
Hi,

I am trying to create a Java application that will interact with websites. For example, my application may have to navigate to a certain website, extract the text on the page, compute results, fill in a form, and submit it. Can anyone tell me the best way to go about building such a system? Would I have to create the components that speak HTTP or HTTPS myself, or do APIs exist for this?

I came across the HtmlUnit API, which is primarily used for testing, and Java browsers like Lobo and JRex that seem to have an API too. How do these compare?

Thanks!
Eric
 
Ranch Hand
Posts: 74
Apache HTTP Client is the first thing to check: http://hc.apache.org/httpcomponents-client/index.html
Also, for HTML processing you have http://htmlparser.sourceforge.net/
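
For a rough idea of what the fetch / extract / submit cycle looks like at the HTTP level, here is a minimal sketch. To stay dependency-free it uses the JDK's built-in java.net.http.HttpClient (Java 11+) rather than Apache HTTP Client, and a crude regex instead of a real HTML parser. The class name, the URLs passed on the command line, and the form field names "name" and "answer" are all hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class WebFormSketch {

    // Encode form fields as an application/x-www-form-urlencoded body.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Very crude text extraction: replace tags with spaces and collapse whitespace.
    // A real HTML parser handles scripts, entities and malformed markup far better.
    static String stripTags(String html) {
        return html.replaceAll("(?s)<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: WebFormSketch <page-url> <form-action-url>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();

        // 1. Navigate to the page and extract its text.
        HttpRequest get = HttpRequest.newBuilder(URI.create(args[0])).GET().build();
        String page = client.send(get, HttpResponse.BodyHandlers.ofString()).body();
        String text = stripTags(page);

        // 2. Compute a result (here simply the length of the text) and fill in the form.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Eric");                          // hypothetical field
        fields.put("answer", String.valueOf(text.length())); // hypothetical field

        // 3. Submit the form as a POST request.
        HttpRequest post = HttpRequest.newBuilder(URI.create(args[1]))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
        HttpResponse<String> response = client.send(post, HttpResponse.BodyHandlers.ofString());
        System.out.println("Status: " + response.statusCode());
    }
}
```

This ignores cookies, redirects, hidden form fields and JavaScript, which is where the higher-level libraries start to earn their keep.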
 
Rancher
Posts: 43081
The premier library for this is jWebUnit, IMO. No need to deal with HTTP or HTML on a low level; that's all been done before. Don't be put off that it's billed as a "unit testing tool": it works just fine as a general-purpose web access library.
 
Marshal
Posts: 79178
And welcome to JavaRanch
 
Eric Klytzmany
Greenhorn
Posts: 3
Thanks for the replies. I will check them out and get back.
One more thing: is anyone aware of a similar API that might support interactions with applets as well? I ask because a large number of the sites I will need to perform these functions on might have their content in applets. I know extracting any text from an applet is going to be tough, but is it even possible? What about interactions with the applet, like button clicks?

 
Sheriff
Posts: 22783
You may also check out this thread as it is about roughly the same subject.
 
Ulf Dittmer
Rancher
Posts: 43081
That's tough. From within the same JVM, the java.awt.Robot class could be used to control a GUI to a certain extent, but from a different JVM that would be much harder. Going out on a limb, I'd say it's impossible to do in the general case where you don't know the applet beforehand. And even if the applet GUI is known, extracting text that was painted on the screen amounts to OCR; I foresee numerous hard problems that way.
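
To show what driving a GUI with java.awt.Robot amounts to, here is a minimal sketch of a single mouse click. It assumes you already know the screen rectangle of the button; the class name and the coordinates passed on the command line are hypothetical. Robot works in screen coordinates and needs a real display, so it does nothing useful in a headless environment.

```java
import java.awt.AWTException;
import java.awt.GraphicsEnvironment;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.event.InputEvent;

class RobotClickSketch {

    // Centre of a component's on-screen bounds, in screen coordinates.
    static Point centerOf(Rectangle bounds) {
        return new Point(bounds.x + bounds.width / 2, bounds.y + bounds.height / 2);
    }

    // Move the pointer to p and press and release the left mouse button.
    static void click(Robot robot, Point p) {
        robot.mouseMove(p.x, p.y);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }

    public static void main(String[] args) throws AWTException {
        if (args.length < 4 || GraphicsEnvironment.isHeadless()) {
            System.out.println("usage (needs a display): RobotClickSketch <x> <y> <width> <height>");
            return;
        }
        // Hypothetical button bounds, supplied by the caller.
        Rectangle button = new Rectangle(
                Integer.parseInt(args[0]), Integer.parseInt(args[1]),
                Integer.parseInt(args[2]), Integer.parseInt(args[3]));
        click(new Robot(), centerOf(button));
    }
}
```

Note that nothing here knows what is under the pointer. Finding the button and reading back what the applet painted are the hard parts.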
 
Eric Klytzmany
Greenhorn
Posts: 3
Hmm, OK, here is another idea. Ideally, all the data displayed by the applet is coming in through a socket connection made by the browser, right? So if I wrote the browser (or used an API that acts as a mock browser), I would have access to the data flowing in and out of the applet. If that is the case, this data would follow a definite pattern and could be extracted, unless it is encrypted.

Is this even possible, and has anyone attempted it?
 