I've got a situation where two sites on the same server, with the same configuration and code, are behaving differently. I'm thinking it must have to do with network setup, but I'm not sure where to begin looking.
We have a JSP page that is using a class we developed to open an HttpURLConnection. The connection is to "itself".
That is, www.abc.com/foo.jsp -> foo.jsp is using our "scrape" library to open another page on www.abc.com, 'scrape' the screen, stick it in the database, where it enters an email queue, and is eventually emailed.
This all works fine on *one* of our deployed sites, but pretty much none of the others... except all the other sites are exactly the same code, and configured in the same way, running on the same instances of apache, tomcat, etc.
The platform is Slackware 9.
In my /etc/hosts file, I don't have any entries for the sites in question (so it's not getting any help there). From the shell of the server that is running tomcat, I can ping and telnet to all of the sites.
I can also access the scraped JSP page directly from a browser (so there is no trouble with the page being scraped). But when a JSP page tries to access the JSP page through our scraping library, we get the following trace:
The next line in this stack trace is the JSP page that is attempting to scrape another JSP page. The code in the calling JSP page looks like:
The first parameter is so I can retrieve the jsessionid, the second parameter is the url to scrape. 3rd and 4th parameters are the begin and end of scraping, and if null, then "whole page". The code in our library is, in part: (line 142 is the last one)
I think my real question is: What does HttpURLConnection use to resolve "www.abc.com"? Because in one of our sites, it appears to find it, and in pretty much none of our other sites (using the same code) does it find it. Or, finding it, it is "refused".
I think where I'm really stuck is that I can telnet to *all* sites, but the code can't seem to do the same. [ January 26, 2005: Message edited by: Mike Curwen ]
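One way to repeat the telnet test from inside the JVM is a small check like the one below. As far as I know, HttpURLConnection resolves host names through java.net.InetAddress, which hands off to the operating system's resolver and caches results inside the JVM, so the shell and a long-running Tomcat can disagree about a host. This is only a sketch: the class name ResolveCheck, the method check, and the result strings are made up for illustration and are not part of the scrape library discussed above. It separates "host name did not resolve" from "resolved, but the connection failed".

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper; names and result strings are illustrative only.
class ResolveCheck {

    // Resolves the host the same way java.net does, then attempts a plain TCP connect.
    static String check(String host, int port, int timeoutMillis) {
        InetAddress address;
        try {
            // Step 1: name resolution (system resolver plus the JVM's own cache).
            address = InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            return "UNRESOLVED";
        }
        try (Socket socket = new Socket()) {
            // Step 2: the equivalent of "telnet host port".
            socket.connect(new InetSocketAddress(address, port), timeoutMillis);
            return "CONNECTED " + address.getHostAddress();
        } catch (IOException e) {
            // Covers "connection refused", timeouts, and unreachable networks.
            return "CONNECT_FAILED " + address.getHostAddress() + " (" + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 80;
        System.out.println(host + ":" + port + " -> " + check(host, port, 5000));
    }
}
```

Running the same check from a JSP inside each site, rather than from the shell, would show what the Tomcat JVM itself sees, which may differ from what ping and telnet report.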
The first thing I would do to debug this is run a packet sniffer like tcpflow or Ethereal, catch the headers from both apps and compare them. That will show you exactly what's different between the two.
at java.net.PlainSocketImpl.socketConnect(Native Method)
Also, I see "native method" in that stack trace. Do any of those sun packages use JNI? I know (with Tomcat anyway) there are some issues with libraries that use JNI. The issues sound similar to what you're describing.
If you're using JNI, which those Sun packages might be doing (I can send you a link explaining why you shouldn't use them), then there is an issue in Tomcat where the first app that links to the libraries locks the others out.
I had to deal with that using some JNI based middleware packages. Putting the jar files in a common area (TOMCAT_HOME/shared/lib) instead of under the WEB-INF dir of each app fixed it.
From the Tomcat release notes:
=======================
JNI Based Applications:
=======================
Applications that require native libraries must ensure that the libraries have been loaded prior to use. Typically, this is done with a call like:
in some class. However, the application must also ensure that the library is not loaded more than once. If the above code were placed in a class inside the web application (i.e. under /WEB-INF/classes or /WEB-INF/lib), and the application were reloaded, the loadLibrary() call would be attempted a second time.
To avoid this problem, place classes that load native libraries outside of the web application, and ensure that the loadLibrary() call is executed only once during the lifetime of a particular JVM.
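The "only once per JVM" advice in those notes could be sketched roughly as follows. This is an illustration, not code from Tomcat or from the middleware mentioned above: the class name NativeLibraryGuard and its loadOnce methods are hypothetical, and the overload that takes a loader function exists only so the guard can be exercised without a real native library. The assumption is that the class is deployed outside the web applications (for example in a jar under shared/lib), so that every webapp sees the same copy of it.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical guard class; intended to live outside WEB-INF so it is loaded once per JVM.
final class NativeLibraryGuard {

    // Names of libraries this class has already loaded.
    private static final Set<String> LOADED = new HashSet<>();

    private NativeLibraryGuard() {}

    // Loads the named native library via System.loadLibrary unless already loaded.
    static boolean loadOnce(String libraryName) {
        return loadOnce(libraryName, System::loadLibrary);
    }

    // Same guard with a pluggable loader; returns true only when the loader actually ran.
    static synchronized boolean loadOnce(String libraryName, Consumer<String> loader) {
        if (LOADED.contains(libraryName)) {
            return false; // already loaded earlier in this JVM
        }
        loader.accept(libraryName); // may throw UnsatisfiedLinkError; nothing is recorded then
        LOADED.add(libraryName);
        return true;
    }
}
```

The guard only helps if there is a single copy of the class. If it is packaged under WEB-INF/lib of each application, each webapp class loader gets its own copy with its own set, and a reload would attempt the load again, which is the situation the release notes warn about.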
Well... with a certain amount of chagrin, I can report the following:
I decided to stamp out Ben's possibility. I took the lib.jar file that our com.acme.web.Util class was in, and snipped out that class. We have other classes in the jar file which are used by context listeners (in WEB-INF/classes), so I can't just move the whole jar file to common/lib (as I discovered, though I think that's a bit borked).
Anyways.. then I put the one com.acme.web class into common/classes and restarted Tomcat. It magically all worked. WOW! Maybe Ben was right about the JNI thing.
But I really can't see how I was running into the problem described; after all, I wasn't calling sun.* classes directly. So to satisfy myself that I could reproduce the error, I put the one class back into WEB-INF/classes for all sites and restarted Tomcat. Now I was getting "Host Not Found" exceptions on ALL sites. **all** sites. WTF??
So then in this sequence:
stop tomcat
restart httpd
start tomcat
Everything now works. ie: I'm back to the com.acme.web class under WEB-INF/classes, and it all works. I think it must have been some Apache/JK unhappiness all along.
I wish I had noticed that you were fronting with Apache and JK. Usually my first tip is to try with Tomcat as a standalone to rule out connector obstinance. I remember from when I used to use Apache in front of Tomcat, there was a particular order you had to follow when starting the two servers, but I can't remember what it was.
I agree, if you're not calling the sun.* classes directly, you shouldn't have to worry about what's in them.
It was the combination of "Native Method" in the stack trace and you saying that only one app would work under Tomcat, even though it was the exact same code that made me think it was a JNI issue.
I hope you've reached the bottom of it.
-Ben [ January 26, 2005: Message edited by: Ben Souther ]