JavaRanch » Java Forums » Java » Java in General
Logging into website with Java

Kari Nordmann
Ranch Hand

Joined: Feb 12, 2007
Posts: 38
I have built a scraper, and I now want it to gather information that is only visible to logged in users.
I have a user name and password, full access to the information while logged in, but I'm having trouble logging in via my application.
So far I have the following code:

I got this from a tutorial about logging into websites with Java, and replaced the URL and login info with my own.
When I run this code, all I get is a printout of the front page - it doesn't log in.

Could anyone tell me what is wrong here?

PS: I have also tried Apache Commons and HtmlUnit, without getting even a single tutorial to work properly.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41872
    
Hm, too bad about your unsuccessful trials with HtmlUnit, because that is what I would have advocated. In my experience, that's hands down the easiest approach to programmatic web access in Java. But given that you did not get URL/URLConnection to work either, maybe you want to give HtmlUnit another shot? If you post your code, I can take a look at it.


Yuriy Fuksenko
Ranch Hand

Joined: Feb 02, 2001
Posts: 413
I actually used HttpClient in a lot of projects with no problem.

did you try this:
http://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java
John Vorwald
Ranch Hand

Joined: Sep 26, 2010
Posts: 139
I've found that using a monitor such as Firebug while logging in and browsing can help identify information needed for program login.
Kari Nordmann
Ranch Hand

Joined: Feb 12, 2007
Posts: 38
Yuriy Fuksenko wrote:I actually used HttpClient in a lot of projects with no problem.

did you try this:
http://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientAuthentication.java

I just tried that code. Here's what I get, after replacing the URL, username and pw:

I have no idea whether I am supposed to replace anything in this line: "new AuthScope("localhost", 443)"
I also have no idea what the output here means. I see that it didn't return any error, but nothing tells me it managed to really log in either.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

You wrote your initial code as if the website used basic HTTP authentication. It's pretty uncommon for websites to use that form of authentication these days -- are you sure that was the right thing to do?
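
For context, basic HTTP authentication means the client sends an Authorization header containing the Base64-encoded "user:password" pair with each request, and the web server (not the application) checks it. A minimal sketch of how that header is built, using placeholder credentials that are assumptions and not taken from the original post:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthSketch {
    // Builds the value of the Authorization header used by basic HTTP authentication:
    // the word "Basic", a space, then Base64 of "user:password".
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // "user" and "secret" are placeholder credentials (assumptions for illustration).
        System.out.println("Authorization: " + basicAuthHeader("user", "secret"));
    }
}
```

If the site shows an HTML login form rather than a browser password dialog, it is most likely not using this scheme, and sending this header would probably have no effect.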
Kari Nordmann
Ranch Hand

Joined: Feb 12, 2007
Posts: 38
Paul Clapham wrote:You wrote your initial code as if the website used basic HTTP authentication. It's pretty uncommon for websites to use that form of authentication these days -- are you sure that was the right thing to do?

Nope, I have no knowledge of different types of HTTP authentication :p
I've seen various code that is supposed to do the same thing, without telling me there are any crucial differences between them.
It wouldn't surprise me if there's some kind of vital bit of knowledge that I lack in order to do these things.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

Kari Nordmann wrote:It wouldn't surprise me if there's some kind of vital bit of knowledge that I lack in order to do these things.


Well, yeah. There are several ways to authenticate yourself to a web site. Some of those ways are managed by the web server, some of them are managed by the application. Typically these days it's the application which manages the authentication, and if that's the case you just have to mimic the requests your browser sends to the application and handle the responses in the same way. HtmlUnit is a good way to do that. However, you might want to spend a while reviewing the possibilities before you make another guess at how it actually works.
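
To illustrate what "mimic the requests your browser sends" can look like for an application-managed (form-based) login, here is a rough sketch using the java.net.http client from Java 11 or later, not HtmlUnit. Everything specific in it is an assumption: the field names "username" and "password" are hypothetical, and the login and protected URLs are passed on the command line. The real field names, the form's target URL, and any hidden fields the site expects have to be read from the login page or from a browser network monitor.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class FormLoginSketch {
    // Encodes form fields as application/x-www-form-urlencoded, like a browser form submit.
    static String formBody(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.out.println("usage: FormLoginSketch <loginUrl> <protectedUrl> <user> <password>");
            return;
        }
        // The CookieManager stores the session cookie set by the login response,
        // so later requests from the same client are sent as the logged-in user.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", args[2]); // hypothetical field name
        fields.put("password", args[3]); // hypothetical field name

        // Step 1: submit the login form.
        HttpRequest login = HttpRequest.newBuilder(URI.create(args[0]))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(fields)))
                .build();
        HttpResponse<String> loginResponse =
                client.send(login, HttpResponse.BodyHandlers.ofString());
        System.out.println("Login status: " + loginResponse.statusCode());

        // Step 2: fetch a page that is only visible to logged-in users.
        HttpRequest page = HttpRequest.newBuilder(URI.create(args[1])).GET().build();
        System.out.println(client.send(page, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```

A 200 status on the login request does not by itself prove the login worked, since many sites return the login page again on failure. Checking the second response for content that only logged-in users see is a more reliable test.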
 