How to login to javaranch with HtmlUnit?

Petar Thomas
Ranch Hand

Joined: Oct 11, 2009
Posts: 234

I am not sure this is the right forum for this post. My problem is mainly with understanding how the HTTP protocol works, but I am using the open-source HtmlUnit API. If HtmlUnit is not the problem, maybe someone could tell me what I am doing wrong?

I am making a small application just for myself, something like a notes organiser for learning, and I want a 'javaranch reader' as part of it so that I can customize how I read the javaranch forum. For example, I would like to keep my own notes next to some topics, hide topics that I don't want to read, and preload more pages so that I can read without delays between topics or pages while they load. Even though I don't strictly need it, I might also want to log in to the forum and send posts through my application. This is my first slightly more serious application; I am practising on it, and it will be of practical use to me while I am learning.

- I get a 404 when I submit the form, if I am submitting it at all...
I am not sure whether I am submitting the form the right way or whether my request looks correct. I don't know how to inspect the request that I am sending, and I don't know how to capture a correct request from a real browser to compare against.
(I plan to build a web browser with JDIC, and maybe I could capture what the request really looks like there, but I am not sure I will be able to do that with JDIC.)

- My program sends a suspicious request to the following address:

- I composed that address from the address of the login page and the action attribute of the form tag on the login page:

Q: If I am sending an incorrect request, but to the right address, is it logical that I get a 404?

- The next thing that confuses me: the action attribute of the form tag is "/forums/jforum", yet after I log in with a real browser I end up at an address that has something to do with:

Q: I don't understand at all how this happens. Is it my responsibility to handle it somehow? Am I sending my request to the wrong address?
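
On the address question, a minimal sketch of how a form's action attribute is normally resolved against the address of the page that contains it, using only the JDK's java.net.URI. The login page address below is a made-up placeholder (the real one is not shown in this post); only the action value "/forums/jforum" comes from the post. The class and method names are hypothetical. The point is that an action starting with "/" replaces the whole path of the page address, so gluing the two strings together gives a different address, which could plausibly end in a 404.

```java
import java.net.URI;

public class ActionResolver {

    // Resolves a form's action attribute against the URL of the page
    // that contains the form, following the usual relative-reference rules.
    static String resolveAction(String pageUrl, String action) {
        return URI.create(pageUrl).resolve(action).toString();
    }

    public static void main(String[] args) {
        // Hypothetical login page address, for illustration only.
        String loginPage = "http://www.example.com/forums/user/login";
        // Action attribute value quoted in the post.
        String action = "/forums/jforum";

        // Resolved address: http://www.example.com/forums/jforum
        System.out.println(resolveAction(loginPage, action));

        // Plain concatenation: http://www.example.com/forums/user/login/forums/jforum
        System.out.println(loginPage + action);
    }
}
```

Whether the address built in my program matches the resolved form or the concatenated form is something to check against the printed URLs.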

- The next thing is the name-value pairs. In my program they look like this before encoding:

And they look like this after encoding:

- I put the encoded string into the body of the request, and the only other thing I do with the request is set its method to POST.

Q: Is this how the body of the request should look if I want to log in to javaranch? Is it all right to have %3D instead of '='?
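
For comparison, a small sketch of how a form body is usually encoded (application/x-www-form-urlencoded), using only the JDK's URLEncoder. Each name and each value is encoded on its own, and the '=' and '&' separators are added afterwards without encoding. A %3D where a separator should be suggests that the whole string was encoded in one go. The field names and values here are invented for illustration and are not the real javaranch form fields; the class name is hypothetical too.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBody {

    // Encodes a single name or value as UTF-8 form data.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Builds a form body: names and values are encoded separately,
    // while the '=' and '&' separators stay literal.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(field.getKey())).append('=').append(enc(field.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names and values, for illustration only.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "petar");
        fields.put("password", "p&ss=word");

        // Separators kept literal: username=petar&password=p%26ss%3Dword
        System.out.println(encode(fields));

        // Whole string encoded at once: username%3Dpetar%26password%3Dx
        System.out.println(enc("username=petar&password=x"));
    }
}
```

In the first output %3D appears only inside the password value, where a literal '=' was part of the data. In the second output the separators themselves are escaped, so a server would most likely see one long field name with no value.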

- I don't know which other questions to ask.

- Source
The program and its output are below. The source is the simplest I could make it.

I load the page with the HtmlUnit API; the page consists of DomNode objects. I get the HtmlForm object easily, which is also a DomNode, and then I recursively iterate through its child DomNodes and keep only those that are also HtmlInput. This is how I get only the input elements.
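
The recursive walk described above can be sketched roughly as follows. This is not my HtmlUnit code: to keep it self-contained it uses the JDK's own org.w3c.dom classes in place of HtmlUnit's DomNode and HtmlInput, and it parses a small well-formed form written by hand. The form markup and the field names are made up for illustration, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class InputCollector {

    // Depth-first walk: keep the node if it is an <input> element,
    // then visit each of its children in document order.
    static void collectInputs(Node node, List<Element> found) {
        if (node instanceof Element && "input".equalsIgnoreCase(node.getNodeName())) {
            found.add((Element) node);
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectInputs(children.item(i), found);
        }
    }

    // Parses well-formed markup and returns the name attribute of every input found.
    static List<String> inputNames(String markup) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<Element> inputs = new ArrayList<>();
            collectInputs(root, inputs);
            List<String> names = new ArrayList<>();
            for (Element input : inputs) {
                names.add(input.getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written stand-in for a login form; field names are hypothetical.
        String form = "<form action=\"/forums/jforum\" method=\"post\">"
                + "<div><input type=\"text\" name=\"username\"/></div>"
                + "<input type=\"password\" name=\"password\"/>"
                + "<input type=\"hidden\" name=\"hiddenField\"/>"
                + "</form>";

        // Prints [username, password, hiddenField]
        System.out.println(inputNames(form));
    }
}
```

The nested input inside the div is found as well, which is the reason for recursing instead of looking only at the form's direct children.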

Then I set the values of the 'username' and 'password' elements, and I compose name-value pairs from all the input elements.

I create the request with a WebRequest object, and then I try to fetch a page with the WebClient method getPage(WebRequest). That is where I get the 404.

I have put System.out.println() calls everywhere in the source, so the output is not small.

First there is a bunch of HtmlUnit warnings from loading the first page, the login page. Those warnings are harmless, but I left them in the output anyway.

Then I print the things I am working with: the username, the password, and the HTML of the form as it appears on the login page, in case it is needed.

I also print the URLs I am working with, in case they are needed.

Then I get another bunch of HtmlUnit messages, including a dump of the 404 page and some of the usual warnings, and then somehow the same 404 page one more time. And then an exception is finally thrown.

I also print the URL after sending the request; I think it would have changed if there had been no error. After that there should be the text of the page I get after login, but since there was an error it is still the same page, so I get the output of the unchanged login page.

I really don't know how to thank you for going through this post with me. Thank you.



If you have read this, thank you once more.
Full respect.

Petar Tomicic

[edit] I have added some line breaks to the "output" [/edit]
Lester Burnham

Joined: Oct 14, 2008
Posts: 1337
Why don't you start by working on the "reader" part - which does not require a login? It seems you'd have a better understanding of how the site works and how HtmlUnit works once you have that.
Petar Thomas
Ranch Hand

Joined: Oct 11, 2009
Posts: 234
Thank you. You are right.

I am already on it. I stumbled upon the login and thought it would be easy. : ) I am analysing the javaranch HTML code right now, and anyway, I have enough work with designing the application as well. Thank you : ) for ... how to say (?).. persuading me not to do it now; I had already had enough of it, actually... : ))) bye...

I will mark this post resolved. Maybe it should even be deleted?

Bye : ))