
How to login to javaranch with HtmlUnit?

 
Petar Thomas
Ranch Hand
Posts: 234
Hi.

I am not sure whether this is the right forum for this post, because although my problem is with understanding how the HTTP protocol works, I am using the HtmlUnit open source API. If HtmlUnit is not the problem, maybe someone could tell me what I am doing wrong?

I am making a small application for myself only. It will be something like a notes organiser for learning, and I want to have a 'JavaRanch reader' as part of it, so that I can customize how I read the JavaRanch forum. For example, I would like to keep my own notes next to some topics, hide topics that I don't want to read, and preload more pages so that I can read without delays between topics or pages while they load. Although it is not strictly necessary, I might also like to log in to the forum and send posts through my application. This is my first slightly more serious application; I am practising on it, and it will be of practical use to me while I am learning.

- I am getting a 404 as the result when I submit the form, if I submitted it at all...
I am not sure whether I am submitting the form the right way or whether my request looks correct. I don't know how to inspect the request that I am sending, and I don't know how to capture a correct request from a real browser to compare against.
(I plan to make a web browser with JDIC, and maybe there I could capture what the request really looks like, but I am not sure whether I will be able to do that with JDIC.)


- My program sends a suspicious request to the following address:


- I composed that address from the address of the login page and from the action attribute of the form tag on the login page:


Q: If I am sending an incorrect request, but to the right address, is it logical that I get a 404?
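
A 404 normally means the server found nothing at the requested path, so the way the target address is composed is worth checking first. The following is a minimal sketch, using only the JDK's java.net.URI, of how a browser resolves a form's action attribute against the page URL. The host www.example.com and the path /forums/user/login are hypothetical placeholders, not the real JavaRanch addresses; only the action value "/forums/jforum" comes from the post.

```java
import java.net.URI;

class ActionUrlSketch {

    // Resolve a form's action attribute against the URL of the page that
    // contains the form, following the standard relative-reference rules.
    static URI resolveAction(String pageUrl, String action) {
        return URI.create(pageUrl).resolve(action);
    }

    public static void main(String[] args) {
        // Hypothetical login page URL, for illustration only.
        String loginPage = "http://www.example.com/forums/user/login";

        // An action starting with '/' replaces the whole path of the page URL.
        System.out.println(resolveAction(loginPage, "/forums/jforum"));
        // prints http://www.example.com/forums/jforum

        // Plain concatenation produces a different path, which a server
        // would most likely not recognise.
        System.out.println(loginPage + "/forums/jforum");
        // prints http://www.example.com/forums/user/login/forums/jforum
    }
}
```

If the address printed by the program differs from what resolveAction gives for the same two inputs, the 404 is probably caused by the address rather than by the request body.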

- The next thing that confuses me: in the action attribute of the form tag the address is "/forums/jforum", yet after I log in with a real browser I end up at an address that has something to do with:


Q: I don't understand at all how this happens. Is it my responsibility to do something about it? Am I sending my request to the wrong address?

- The next thing is the name-value pairs. In my program, the name-value pairs look like this before encoding:


And they look like this after encoding:


- I am putting the encoded string into the body of the request, and the only other thing that I do with the request is set its method to POST.

Q: Is this how the body of the request should look if I want to log in to JavaRanch? Is it all right to have %3D instead of '='?
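
On the %3D question: in a form-encoded body, the '=' between a name and its value and the '&' between pairs are separators and must stay literal. Only the names and values themselves are percent-encoded. A %3D in place of a separator usually means the whole string was encoded in one go. The sketch below shows the difference using the JDK's URLEncoder; the field names 'username' and 'password' come from the post, while the values are made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class FormEncodingSketch {

    // Encode each name and each value separately, then join them with
    // literal '=' and '&' so the separators survive.
    static String encodeForm(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "petar");   // made-up value
        fields.put("password", "a=b&c");   // made-up value with special characters

        // Separators stay literal; only the value's own '=' and '&' are escaped.
        System.out.println(encodeForm(fields));
        // prints username=petar&password=a%3Db%26c

        // Encoding the already-joined string escapes the separators too,
        // so the server can no longer split it into fields.
        System.out.println(URLEncoder.encode("username=petar&password=secret", StandardCharsets.UTF_8));
        // prints username%3Dpetar%26password%3Dsecret
    }
}
```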

- I don't know which other questions to ask.

- Source
The program and its output are below. The source is the simplest I could get it.

I am loading the page with the HtmlUnit API; the page consists of DomNode objects. I easily get an HtmlForm object, which is also a DomNode, and then I iterate recursively through its child DomNodes and extract only those that are also HtmlInput. This is how I get only the input elements.
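
The recursive walk described above can be sketched as follows. This is not the HtmlUnit code from the post: to stay runnable without extra libraries it uses the JDK's org.w3c.dom classes in place of DomNode and HtmlInput, and it parses a small, well-formed, made-up form rather than the real login page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class InputCollectorSketch {

    // Depth-first walk: keep the node if it is an <input> element,
    // then descend into every child.
    static void collectInputs(Node node, List<Element> found) {
        if (node instanceof Element && "input".equals(node.getNodeName())) {
            found.add((Element) node);
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectInputs(children.item(i), found);
        }
    }

    // Parse a well-formed XML/XHTML fragment and return its <input> elements
    // in document order.
    static List<Element> inputsOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Element> found = new ArrayList<>();
            collectInputs(doc.getDocumentElement(), found);
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        // Made-up form markup; only the action value is taken from the post.
        String form = "<form action=\"/forums/jforum\" method=\"post\">"
                + "<div><input type=\"text\" name=\"username\"/></div>"
                + "<input type=\"password\" name=\"password\"/>"
                + "<p>not an input</p>"
                + "</form>";
        for (Element input : inputsOf(form)) {
            System.out.println(input.getAttribute("name"));
        }
    }
}
```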

Then I set the values of the 'username' and 'password' elements, and from all the input elements I compose name-value pairs.

I am creating the request with a WebRequest object, and then I try to retrieve a page with the WebClient method getPage(WebRequest). After that I get the 404.
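
For comparison, here is a rough sketch of what a browser-style form POST consists of, built with the JDK's java.net.http.HttpRequest (Java 11 or later) instead of HtmlUnit's WebRequest. Besides the POST method and the encoded body, a browser also sends a Content-Type of application/x-www-form-urlencoded. The host and the field values are hypothetical, and the request is only built and printed, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class LoginRequestSketch {

    // Build (but do not send) a form POST: method, target URL,
    // form content type, and the already-encoded body.
    static HttpRequest buildLoginRequest(String url, String encodedBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodedBody))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical address and credentials, for illustration only.
        HttpRequest request = buildLoginRequest(
                "http://www.example.com/forums/jforum",
                "username=petar&password=secret");

        // Printing these is a simple way to inspect what would be sent.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

Printing the method, URL, and headers like this gives something concrete to compare against a request captured from a real browser.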

- Output
I have put System.out.println() calls everywhere in the source, so the output is not small.

First there is a bunch of HtmlUnit warnings from loading the first page, the login page. Those warnings are harmless, but I left them in the output.

Then I print out the things I am working with: the username, the password, and the HTML code of the form as it appears on the login page, in case it is needed.

I also print out the URLs that I am working with, in case they are needed.

Then comes another bunch of HtmlUnit output, including a dump of the 404 page, some of the usual warnings, and then somehow the same 404 page one more time. Finally an exception is thrown.

I also print the URL after sending the request; I think it would have changed if there had been no error. After that there should be the text of the page that I get after logging in, but since there was an error, it is still the same page, so I get the output of the unchanged login page.



I really don't know how to thank you for going through this post with me. Thank you.


SOURCE:






OUTPUT:




If you have read this, thank you once more.
Full respect.

Petar Tomicic


[edit] I have added some line breaks to the "output" [/edit]
 
Lester Burnham
Rancher
Posts: 1337
Why don't you start by working on the "reader" part - which does not require a login? It seems you'd have a better understanding of how the site works and how HtmlUnit works once you have that.
 
Petar Thomas
Ranch Hand
Posts: 234
Thank you. You are right.

I am already on it. I stumbled upon the login and thought it would be easy. : ) I am analysing the JavaRanch HTML code right now, and anyway, I have enough work with designing the application as well. Thank you : ) for ... how to say (?).. persuading me not to do it now. I had already had enough of it, actually... : ))) bye...

I will mark this post as resolved. Maybe it should even be deleted?

Bye : ))
 