Logging into website with Java

 
Ranch Hand
Posts: 38
I have built a scraper, and I now want it to gather information that is only visible to logged in users.
I have a user name and password, full access to the information while logged in, but I'm having trouble logging in via my application.
So far I have the following code:

I got this from a tutorial about logging into websites with Java, and replaced the URL and login info with my own.
When I run this code, all I get is a printout of the front page; it doesn't log in.

Could anyone tell me what is wrong here?

PS: I have also tried Apache Commons and HtmlUnit, without getting even a single tutorial to work properly.
 
Rancher
Posts: 43081
Hm, too bad about your unsuccessful trials with HtmlUnit, because that is what I would have advocated. In my experience, that's hands down the easiest approach to programmatic web access in Java. But given that you did not get URL/URLConnection to work either, maybe you want to give HtmlUnit another shot? If you post your code, I can take a look at it.
 
Ranch Hand
Posts: 413
I have actually used HttpClient in a lot of projects with no problems.

Did you try this:
http://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java
 
Ranch Hand
Posts: 139
I've found that using a monitor such as Firebug while logging in and browsing can help identify information needed for program login.
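One thing such a monitor typically reveals is that a login form submits more than the username and password: there are often hidden input fields (session or anti-forgery tokens, for example) that the server expects to get back. As a rough sketch, the hypothetical helper below pulls hidden fields out of a login page's HTML with regular expressions. The class and method names are made up for illustration, it only handles double-quoted attributes, and a real HTML parser would be more robust.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class HiddenFields {
    // Matches a whole <input ...> tag.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches name="value" attribute pairs (double quotes only).
    private static final Pattern ATTR =
            Pattern.compile("([a-zA-Z-]+)\\s*=\\s*\"([^\"]*)\"");

    // Returns name -> value for every <input type="hidden"> found in the HTML.
    static Map<String, String> extract(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            Map<String, String> attrs = new LinkedHashMap<>();
            Matcher attr = ATTR.matcher(tag.group());
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(), attr.group(2));
            }
            if ("hidden".equalsIgnoreCase(attrs.get("type")) && attrs.containsKey("name")) {
                fields.put(attrs.get("name"), attrs.getOrDefault("value", ""));
            }
        }
        return fields;
    }
}
```

Whatever this returns for the real login page would then be sent along with the username and password fields that the monitor shows in the login request.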
 
Kari Nordmann
Ranch Hand
Posts: 38

Yuriy Fuksenko wrote: I have actually used HttpClient in a lot of projects with no problems.

Did you try this:
http://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java


I just tried that code. Here's what I get after replacing the URL, username and password:

I have no idea whether I am supposed to replace anything in this line: new AuthScope("localhost", 443)
I also have no idea what the output here means. I can see that it didn't return any error, but nothing tells me whether it actually managed to log in either.
 
Marshal
Posts: 28133
You wrote your initial code as if the website used basic HTTP authentication. It's pretty uncommon for websites to use that form of authentication these days -- are you sure that was the right thing to do?
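To make the distinction concrete: with basic HTTP authentication the credentials travel in an Authorization request header, as the word "Basic" followed by the Base64 encoding of "username:password". The sketch below only builds that header value; the class name and the sample credentials are placeholders, not anything from the site in question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: shows what a Basic authentication header contains.
class BasicAuthHeader {
    // Returns "Basic " + base64("user:password").
    static String build(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Prints: Authorization: Basic dXNlcjpwYXNz
        System.out.println("Authorization: " + build("user", "pass"));
    }
}
```

A server that uses this scheme normally answers an unauthenticated request with status 401 and a WWW-Authenticate header. If the site instead returns an ordinary HTML page containing a login form, it is most likely using application-managed (form-based) login, and this header will simply be ignored.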
 
Kari Nordmann
Ranch Hand
Posts: 38

Paul Clapham wrote: You wrote your initial code as if the website used basic HTTP authentication. It's pretty uncommon for websites to use that form of authentication these days -- are you sure that was the right thing to do?


Nope, I have no knowledge of the different types of HTTP authentication :p
I've seen various code that is supposed to do the same thing, without telling me there are any crucial differences between them.
It wouldn't surprise me if there's some kind of vital bit of knowledge that I lack in order to do these things.
 
Paul Clapham
Marshal
Posts: 28133

Kari Nordmann wrote: It wouldn't surprise me if there's some kind of vital bit of knowledge that I lack in order to do these things.



Well, yeah. There are several ways to authenticate yourself to a web site. Some of those ways are managed by the web server; others are managed by the application. Typically these days it's the application which manages the authentication, and if that's the case, you just have to mimic the requests your browser sends to the application and handle the responses in the same way. HtmlUnit is a good way to do that. However, you might want to spend a while reviewing the possibilities before you make another guess at how it actually works.
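As a minimal sketch of "mimicking the browser" for an application-managed login, here is one way to do it with the JDK's own java.net.http.HttpClient (Java 11 or later) rather than HtmlUnit. Everything site-specific is an assumption: the login URL, the protected URL and the form field names are placeholders that would have to be taken from the real login request, and some sites also require hidden fields or JavaScript that this does not handle.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; URLs and field names must come from the actual site.
class FormLoginSketch {
    // Encodes form fields as application/x-www-form-urlencoded, like a browser form post.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Posts the login form, then requests a members-only page with the same client,
    // so that any session cookie set by the login response is sent back.
    static String loginAndFetch(String loginUrl, String protectedUrl,
                                Map<String, String> fields) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())           // keeps session cookies
                .followRedirects(HttpClient.Redirect.NORMAL)  // logins often redirect
                .build();

        HttpRequest login = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
        client.send(login, HttpResponse.BodyHandlers.ofString());

        HttpRequest page = HttpRequest.newBuilder(URI.create(protectedUrl)).GET().build();
        return client.send(page, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The important design point is that both requests go through one client with a cookie handler. Whether the login worked can then be judged by checking the returned page for text that only logged-in users see.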
 