
Logging into website with Java

 
Kari Nordmann
Ranch Hand
Posts: 38
I have built a scraper, and I now want it to gather information that is only visible to logged in users.
I have a user name and password, full access to the information while logged in, but I'm having trouble logging in via my application.
So far I have the following code:

I got this from a tutorial about logging into websites with Java, and replaced the URL and login info with my own.
When I run this code, all I get is a printout of the front page - it doesn't log in.

Could anyone tell me what is wrong here?

PS: I have also tried Apache Commons and HtmlUnit, without getting even a single tutorial to work properly.
 
Ulf Dittmer
Rancher
Posts: 42967
Hm, too bad about your unsuccessful trials with HtmlUnit, because that is what I would have advocated. In my experience, that's hands down the easiest approach to programmatic web access in Java. But given that you did not get URL/URLConnection to work either, maybe you want to give HtmlUnit another shot? If you post your code, I can take a look at it.
 
Yuriy Fuksenko
Ranch Hand
Posts: 413
I actually used HttpClient in a lot of projects with no problem.

Did you try this:
http://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java
 
John Vorwald
Ranch Hand
Posts: 139
I've found that using a monitor such as Firebug while logging in and browsing can help identify information needed for program login.
 
Kari Nordmann
Ranch Hand
Posts: 38
Yuriy Fuksenko wrote:I actually used HttpClient in a lot of projects with no problem.

Did you try this:
http://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java

I just tried that code. Here's what I get, after replacing the URL, username and pw:

I have no idea whether I am supposed to replace anything in this line: " new AuthScope("localhost", 443)"
I also have no idea what the output here means. I see that it didn't return any error, but nothing tells me it managed to really log in either.
 
Paul Clapham
Sheriff
Posts: 20739
You wrote your initial code as if the website used basic HTTP authentication. It's pretty uncommon for websites to use that form of authentication these days -- are you sure that was the right thing to do?
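To make that concrete, here is a rough sketch of what basic HTTP authentication amounts to on the wire: the client sends an `Authorization` header containing the base64-encoded `user:password` pair with every request. The class and method names below are made up for illustration, and the URL is whatever you pass in; this is not the code from the original post.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class BasicAuthSketch {

    // Builds the Authorization header value for basic HTTP authentication:
    // the word "Basic", a space, then base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Sends a GET request carrying that header and returns the HTTP status code.
    static int fetchStatus(String url, String user, String password) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Authorization", basicAuthHeader(user, password));
        return conn.getResponseCode();
    }
}
```

A site that handles login through an HTML form would typically ignore this header and serve the same page an anonymous visitor sees, which would be consistent with getting the front page back.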
 
Kari Nordmann
Ranch Hand
Posts: 38
Paul Clapham wrote:You wrote your initial code as if the website used basic HTTP authentication. It's pretty uncommon for websites to use that form of authentication these days -- are you sure that was the right thing to do?

Nope, I have no knowledge of different types of HTTP authentication :p
I've seen various code that is supposed to do the same thing, without telling me there are any crucial differences between them.
It wouldn't surprise me if there's some kind of vital bit of knowledge that I lack in order to do these things.
 
Paul Clapham
Sheriff
Posts: 20739
Kari Nordmann wrote:It wouldn't surprise me if there's some kind of vital bit of knowledge that I lack in order to do these things.


Well, yeah. There are several ways to authenticate yourself to a web site. Some of those ways are managed by the web server, some of them are managed by the application. Typically these days it's the application which manages the authentication, and if that's the case you just have to mimic the requests your browser sends to the application and handle the responses in the same way. HtmlUnit is a good way to do that. However, you might want to spend a while reviewing the possibilities before you make another guess at how it actually works.
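As a rough sketch of what "mimic the requests your browser sends" can look like for an application-managed (form-based) login: POST the form fields to the login URL, keep the session cookies the server sets, then request the protected page with the same client. This assumes Java 11 or later (`java.net.http`). The class name and the field names `username` and `password` in the usage note are placeholders; the real field names and login URL have to be read off the site's own login form, for example with a browser network monitor. Some sites also require hidden form fields or tokens, which this sketch does not handle.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
class FormLoginSketch {

    // Encodes form fields as application/x-www-form-urlencoded,
    // the format a browser uses when it submits an ordinary HTML form.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // A client that stores cookies between requests, so the session cookie
    // set by the login response is sent along with later requests.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    // Posts the login form, then fetches a page that requires the session.
    static String loginAndFetch(String loginUrl, String protectedUrl,
                                Map<String, String> fields) throws Exception {
        HttpClient client = newSessionClient();

        HttpRequest login = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
        client.send(login, HttpResponse.BodyHandlers.discarding());

        HttpRequest page = HttpRequest.newBuilder(URI.create(protectedUrl)).GET().build();
        return client.send(page, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

A call might look like `loginAndFetch(loginUrl, memberPageUrl, Map.of("username", "...", "password", "..."))`, with both URLs and the field names taken from the actual site. If the returned page still looks like the anonymous front page, comparing the request against what the browser sends usually shows which field, cookie, or header is missing.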
 