
Not able to retrieve data for Secured site

 
Greenhorn
Posts: 9
Hi,
There is a secure site:
https://farm3.sat.gob.gt/saqbe-arancel-publico/aduana/arancel/consulta/consulta.jsf
After the landing page loads, enter 01 in the "Posición arancelaria" text field and click "buscar". I am able to retrieve the resulting CLASIFICACIÓN ARANCELARIA page via the HttpUnit API, but when I then click 0101.10.90 on that page, I am not able to retrieve the next page from Java code.
Please help me as soon as possible.
I have tried the following approaches to retrieve the data:

1) Setting parameters on the WebRequest:

// Retrieve the form from the landing-page response of the secured site.
WebForm webForm = response.getFormWithID("formar");

// Get the request from the form and set the parameters for a particular HS code.

WebRequest req = webForm.getRequest();
req.setParameter("formar_SUBMIT","1");
req.setParameter("jsf_sequence","3");
req.setParameter("formar:_link_hidden_","");
req.setParameter("pCodigo","01019000");
req.setParameter("formar:_idcl","formar:tree:0:0:1:parter2");
WebResponse resp = webForm.submit();

But we are redirected back to the same landing page instead of receiving the subsequent page for the HS code.

We also tested it after setting the following properties:

ClientProperties props = webConversation.getClientProperties();
props.setUserAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7");
props.setAcceptCookies(true);
props.setAutoRedirect(false);
props.setAutoRefresh(false);

But we still do not get the desired response.

2) Retrieving data via WebLink:

// Retrieve the link for a particular HS code from the response.
WebLink link = response.getLinkWith("0101.10.10");

// Get the WebRequest for that link.
WebRequest req = link.getRequest();

// The WebConversation should now return the response for that request.
WebResponse resp = webConversation.getResponse(req);

But the response does not contain the subsequent page for that HS code.

3) Setting request headers with session fields and other parameters:

First we retrieved the header fields of the request and response in Java code and compared them with the header fields shown by HttpAnalyzer.
Then we set the fields that differed or did not exist in our request.

req.setHeaderField("ORA_WX_SESSION", "10.1.0.34:47873-2#3");
req.setHeaderField("Keep-Alive","300");
req.setHeaderField("Connection","keep-alive");
req.setHeaderField("JSESSIONID", "0a01002230d7760c3064f5d54d71b35e84bd606926fc.e3aOaNuPbhiQe3uMc3mPbNaRbO0");

But we still get the same response.

4) Using HttpsURLConnection:

We opened a URL connection and then read the response:

HttpsURLConnection connection = (HttpsURLConnection) url.openConnection();
connection.setDoInput(true);
connection.setDoOutput(true);
String response = showText(connection.getInputStream());

This way we are able to retrieve the response for the landing page, but retrieving data from the subsequent pages is still a problem.

We also tried setting some other options on HttpsURLConnection, as in the example at the following link:
http://www.java-samples.com/java/POST-toHTTPS-url-free-java-sample-program.htm

System.setProperty("java.protocol.handler.pkgs","com.sun.net.ssl.internal.www.protocol");
java.security.Security.addProvider(new com.sun.net.ssl.internal.ssl.Provider());
SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
connection.setSSLSocketFactory(factory);
connection.setFollowRedirects(true);
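
A note on the HttpsURLConnection attempt: as posted, the snippet calls setDoOutput(true) but never writes a request body, so no form parameters would reach the server. The sketch below shows one way to encode and post form parameters with the JDK alone. The class name FormPost and its methods are made up for illustration, the parameter names are copied from attempt 1, and whether the server accepts them without matching session state is not established in this thread.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.net.ssl.HttpsURLConnection;

// Hypothetical helper, not part of HttpUnit or of the poster's code.
final class FormPost {

    // Encodes name/value pairs as application/x-www-form-urlencoded,
    // preserving the iteration order of the map.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> entry : params.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(entry.getKey(), "UTF-8"));
                body.append('=');
                body.append(URLEncoder.encode(entry.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    // Sends the encoded form as the POST body and returns the HTTP status code.
    static int post(URL url, Map<String, String> params) throws IOException {
        byte[] body = encode(params).getBytes("UTF-8");
        HttpsURLConnection connection = (HttpsURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        OutputStream out = connection.getOutputStream();
        try {
            out.write(body);
        } finally {
            out.close();
        }
        return connection.getResponseCode();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("formar_SUBMIT", "1");
        params.put("pCodigo", "01019000");
        params.put("formar:_idcl", "formar:tree:0:0:1:parter2");
        // Print the body only; nothing is sent over the network here.
        System.out.println(encode(params));
    }
}
```

encode is kept separate from post so that the body can be inspected, or compared with a request captured from the browser, before anything is sent.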

5) Using HttpClient:

HttpClient client = new HttpClient();
HttpMethod method = new PostMethod("https://farm3.sat.gob.gt/saqbe-arancel-publico/aduana/arancel/consulta/consulta.jsf");
client.executeMethod(method);
byte[] body = method.getResponseBody();
InputStream is = new ByteArrayInputStream(body);
String resp = showText(is);
method.releaseConnection();

6) Using an SSL implementation:
As described at http://httpunit.sourceforge.net/doc/sslfaq.html,
we downloaded the JSSE package from the Sun URL, placed the key JAR files into the JVM ext directory, and updated the java.security file.
We then retried all the options described above, but the result was the same.

7) Using HttpAnalyzer:
We also used HttpAnalyzer and some other tools to analyze the web content, but HttpAnalyzer is not able to trace the data when the connection to a secured site is made from Eclipse.

So retrieving the tariff data for GT remains an unsolved problem.
 
Rancher
Posts: 43081
I would use an HTTP proxy like tcpmon to study how accesses through the browser differ from accesses done programmatically by your code; there must be a difference somewhere.
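
The comparison suggested here can be partly automated once both requests have been captured. A minimal sketch, assuming the parameters (or headers) of each request have been copied by hand from the proxy output into two maps; RequestDiff and the sample values in main are hypothetical and not taken from the site.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for comparing two captured requests field by field.
final class RequestDiff {

    // Returns one line per field that is missing on either side or has a
    // different value, ordered alphabetically by field name.
    static List<String> diff(Map<String, String> browser, Map<String, String> program) {
        List<String> differences = new ArrayList<String>();
        TreeSet<String> names = new TreeSet<String>(browser.keySet());
        names.addAll(program.keySet());
        for (String name : names) {
            if (!browser.containsKey(name)) {
                differences.add("only sent by program: " + name + "=" + program.get(name));
            } else if (!program.containsKey(name)) {
                differences.add("only sent by browser: " + name + "=" + browser.get(name));
            } else if (!String.valueOf(browser.get(name)).equals(String.valueOf(program.get(name)))) {
                differences.add("differs: " + name
                        + " browser=" + browser.get(name)
                        + " program=" + program.get(name));
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        // Sample values are invented for the demonstration.
        Map<String, String> browser = new LinkedHashMap<String, String>();
        browser.put("jsf_sequence", "2");
        browser.put("pCodigo", "01019000");
        Map<String, String> program = new LinkedHashMap<String, String>();
        program.put("jsf_sequence", "3");
        program.put("pCodigo", "01019000");
        for (String line : diff(browser, program)) {
            System.out.println(line);
        }
    }
}
```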
 
Himanshu Agrawal
Greenhorn
Posts: 9
Hi,
Thanks for your response. We have used many tools, such as HttpAnalyzer, but they didn't help.
 
Ulf Dittmer
Rancher
Posts: 43081
What does "it didn't help" mean? Are you saying that the requests were bit for bit identical? I find that hard to believe.
 
Himanshu Agrawal
Greenhorn
Posts: 9

Ulf Dittmer wrote:What does "it didn't help" mean? Are you saying that the requests were bit for bit identical? I find that hard to believe.



Hi Ulf,
I have written the following code as per your suggestion, i.e. sending a bit-for-bit identical request:

ClientProperties props = webConversation.getClientProperties();

props.setUserAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7");
props.setAcceptCookies(true);

String url = "https://farm3.sat.gob.gt/saqbe-arancel-publico/aduana/arancel/consulta/consultaCapitulo.jsf";
WebRequest req = new PostMethodWebRequest(url);

req.setHeaderField("Host", "farm3.sat.gob.gt");
req.setHeaderField("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7");
req.setHeaderField("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
req.setHeaderField("Accept-Language", "en-us");
req.setHeaderField("Accept-Encoding", "gzip,deflate");
req.setHeaderField("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
req.setHeaderField("Keep-Alive","300");
req.setHeaderField("Connection","keep-alive");
req.setHeaderField("Referer","https://farm3.sat.gob.gt/saqbe-arancel-publico/aduana/arancel/consulta/consulta.jsf");
req.setHeaderField("Cookie","ORA_WX_SESSION=10.1.0.34:47873-2#3; tree=0%3A0%3Dx%3B0%3A0%3A0%3Dx; JSESSIONID=0a01002230d7ff823b331f984a638e24cdee6e54c0d3.e3aOaNuPbhiQe3uMc3mPbNeSbO0");
req.setHeaderField("Content-Type","application/x-www-form-urlencoded");
req.setHeaderField("Content-Length","122");

req.setParameter("formar_SUBMIT","1");
req.setParameter("jsf_sequence","5");
req.setParameter("formar:_link_hidden_","");
req.setParameter("pCodigo","02011000");
req.setParameter("formar:_idcl","formar:tree:0:0:0:parter2");

WebResponse resp = webConversation.getResponse(req);
saveAsHtml(resp.getText());

But still no luck retrieving the response.
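
One possible reason a replayed request is rejected is that some values, such as jsf_sequence or the JSESSIONID cookie, are issued by the server per session and per page, so values copied from an earlier browser capture may be stale. The thread does not confirm that this was the cause. Under that assumption, here is a sketch of reading the current hidden-field values from the page that was just fetched instead of hard-coding them. HiddenFields is a hypothetical helper, and the regular expressions handle only simple double-quoted attributes without HTML entities.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a real HTML parser would be more robust than regexes.
final class HiddenFields {

    // Matches a whole <input ...> tag (breaks if an attribute value contains '>').
    private static final Pattern INPUT =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches one attribute of the form name="value" inside a tag.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([\\w][\\w:-]*)\\s*=\\s*\"([^\"]*)\"");

    // Returns the name/value pairs of all <input type="hidden"> elements,
    // in document order.
    static Map<String, String> extract(String html) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        Matcher input = INPUT.matcher(html);
        while (input.find()) {
            Map<String, String> attributes = new LinkedHashMap<String, String>();
            Matcher attribute = ATTRIBUTE.matcher(input.group());
            while (attribute.find()) {
                attributes.put(attribute.group(1).toLowerCase(), attribute.group(2));
            }
            String name = attributes.get("name");
            if (name != null && "hidden".equalsIgnoreCase(attributes.get("type"))) {
                String value = attributes.get("value");
                fields.put(name, value == null ? "" : value);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Invented markup for the demonstration, not copied from the site.
        String html = "<form id=\"formar\">"
                + "<input type=\"hidden\" name=\"jsf_sequence\" value=\"7\"/>"
                + "<input type=\"text\" name=\"q\" value=\"01\"/>"
                + "</form>";
        System.out.println(extract(html));
    }
}
```

The extracted map could then be merged with the parameters that identify the clicked link before the request is submitted.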
 
Ulf Dittmer
Rancher
Posts: 43081
How are you making sure that the requests are, in fact, bit for bit identical? Which HTTP proxy/monitor are you using?
 
Himanshu Agrawal
Greenhorn
Posts: 9

Ulf Dittmer wrote:How are you making sure that the requests are, in fact, bit for bit identical? Which HTTP proxy/monitor are you using?


I am using HttpAnalyzer, available at
http://www.ieinspector.com/
 
Ulf Dittmer
Rancher
Posts: 43081
Does the page you're getting back contain an error message, or any other indication of what might be going wrong?

But fundamentally, I'd use a library like HtmlUnit to go from page to page; that way you don't need to handle all the parameters and cookies in your code, but can let the library do it for you.
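
The idea of letting a library track session state can be shown on a small scale with the JDK's own java.net.CookieManager (Java 6 and later). This is not HtmlUnit and not the poster's eventual solution; it only illustrates cookies being stored and replayed automatically rather than hard-coded. SessionCookies, the example.com URLs, and the cookie value are hypothetical.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper that feeds response headers into a CookieManager
// and asks it which cookies belong on a later request.
final class SessionCookies {

    private final CookieManager manager =
            new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);

    // Records the Set-Cookie headers received in a response from the given URI.
    void remember(URI uri, String... setCookieHeaders) {
        Map<String, List<String>> headers = new HashMap<String, List<String>>();
        headers.put("Set-Cookie", Arrays.asList(setCookieHeaders));
        try {
            manager.put(uri, headers);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the cookie strings the manager would attach to a request for the URI.
    List<String> cookiesFor(URI uri) {
        try {
            Map<String, List<String>> requestHeaders =
                    manager.get(uri, Collections.<String, List<String>>emptyMap());
            List<String> cookies = requestHeaders.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SessionCookies jar = new SessionCookies();
        // Pretend the landing page answered with a session cookie.
        jar.remember(URI.create("https://example.com/app/consulta.jsf"),
                "JSESSIONID=abc123; Path=/app");
        // Ask which cookies would accompany a request for the next page.
        System.out.println(jar.cookiesFor(URI.create("https://example.com/app/detalle.jsf")));
    }
}
```

Calling CookieHandler.setDefault(new CookieManager()) once at startup makes HttpURLConnection and HttpsURLConnection apply the same mechanism to every request, so the session cookie never has to be copied by hand.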
 
Sheriff
Posts: 22783
Himanshu, could you please UseCodeTags next time?
 
Himanshu Agrawal
Greenhorn
Posts: 9
Hi Ulf,
Thanks for your suggestion. The problem was solved using HtmlUnit.
 
Himanshu Agrawal
Greenhorn
Posts: 9
Hi,
I was able to use HtmlUnit 2.6 with JDK 1.6 to download data from the website, but a new restriction now requires us to use JDK 1.4 or below. I therefore had to port my application to JDK 1.4, and since HtmlUnit 2.6 doesn't support JDK 1.4, I switched to an earlier version, HtmlUnit 1.14.
But when running HtmlUnit 1.14 on JDK 1.4, it throws the following error:

[27 Feb 2010 00:07:25,289] - INFO [main]: Requesting given URL. =https://farm3.sat.gob.gt/saqbe-arancel-publico/aduana/arancel/consulta/consulta.jsf
[27 Feb 2010 00:09:39,997] - ERROR [main]: Exception while initializing JavaScript for the page
org.mozilla.javascript.EvaluatorException: Can not get field 'ELEMENT_NODE' for type: Node
at org.mozilla.javascript.DefaultErrorReporter.runtimeError(DefaultErrorReporter.java:98)
at org.mozilla.javascript.Context.reportRuntimeError(Context.java:966)
at org.mozilla.javascript.Context.reportRuntimeError(Context.java:1022)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.configureConstantsPropertiesAndFunctions(JavaScriptEngine.java:314)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.configureClass(JavaScriptEngine.java:292)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.init(JavaScriptEngine.java:212)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.access$000(JavaScriptEngine.java:96)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$1.run(JavaScriptEngine.java:156)
at org.mozilla.javascript.Context.call(Context.java:519)
at org.mozilla.javascript.Context.call(Context.java:450)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.initialize(JavaScriptEngine.java:167)
at com.gargoylesoftware.htmlunit.WebClient.initialize(WebClient.java:1084)
at com.gargoylesoftware.htmlunit.WebWindowImpl.setEnclosedPage(WebWindowImpl.java:115)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:238)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:116)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:89)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:450)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:359)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:407)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:395)
at com.mycustoms.dataload.tools.crawl.AbstractWebCrawler.getResponseHtml(AbstractWebCrawler.java:221)
at com.mycustoms.dataload.tools.gt.GTWebCrawler.crawl(GTWebCrawler.java:86)
at com.mycustoms.dataload.tools.crawl.AbstractWebCrawler.run(AbstractWebCrawler.java:88)
at com.mycustoms.dataload.tools.gt.GTWebCrawler.main(GTWebCrawler.java:56)

Could you please suggest the reason?
 
Himanshu Agrawal
Greenhorn
Posts: 9
Hi,
I am using the following code for this:

As soon as it tries to retrieve the response for the URL, it throws the exception shown in my previous post.
 
Ulf Dittmer
Rancher
Posts: 43081
I'd advise pushing back on the Java 1.4 requirement. That's been obsolete for years. Heck, even Java 5 is EOL'd now. Besides the fact that you won't get security fixes, it's becoming less and less likely that you'll be able to get help online, since fewer and fewer developers use it.
 