Pooling outgoing HTTP connections

Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
I'm going to need a utility that lets me make XML requests over HTTP POST to several "subsystems", and I have a question about pooling the physical connections.
First of all, I believe I will need to pool them somehow because (at least on Windows) making more than 20 simultaneous requests causes "Connection refused: connect" errors, apparently because the system won't hand out more than 20 sockets (or something similar).
Because of this, I thought of writing a trivial "token pool" whose size can be configured. The "client" would first acquire a token from this pool (a blocking operation), then make the request, and afterwards return the token to the pool.
1) Does this sound alright?
2) If I use HttpUnit or Jakarta Commons HttpClient, do they provide such a pooling utility off the shelf?


Author of Test Driven (2007) and Effective Unit Testing (2013) [Blog] [HowToAskQuestionsOnJavaRanch]
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8927
    

Originally posted by Lasse Koskela:

1) Does this sound alright?

Nope. How would your client ask the server for a token unless it opened a socket to the server? And what would that token be, other than a Socket? Without a Socket, how would the server know how to connect to the client? Create a reverse server where a client binds to an address, sends that address to the server and waits for the server to make a connection? Yuck. And what if that server gets 20 simultaneous connections? Back where you started.
Database connection pools are a good idea because there are a small number of connections and a large amount of resources goes into each connection (~700 on my instance of Oracle for a staff of thousands). The connection pool and its users are on the same machine, so making a pool is pretty much wrapping an ArrayList with some access control.
Sockets are small and plentiful (64k per server, and you can load balance easily to multiply that). For a server to control access to its ports is non-trivial (as I alluded to above).
There is obviously something wrong with Windows (I mean above and beyond normal). What version are you using? Are you running a firewall (especially one configured to prevent DoS attacks)? Is your server properly multithreaded (i.e. ServerSocket accepts a connection and hands it off to a processor thread so it can get back to accepting connections)?
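To make the "properly multithreaded" question concrete, here is a minimal sketch of the accept-and-hand-off pattern, not anyone's actual server. The class name HandOffServer, the fixed-size thread pool and the one-line "hello" reply are all assumptions made for illustration; only ServerSocket, Socket and the java.util.concurrent calls are real JDK APIs.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical server: one acceptor thread, a pool of processor threads. */
final class HandOffServer implements AutoCloseable {
    private final ServerSocket listener;
    private final ExecutorService processors;

    HandOffServer(int port, int processorThreads) throws IOException {
        listener = new ServerSocket(port); // port 0 picks any free port
        processors = Executors.newFixedThreadPool(processorThreads);
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return listener.getLocalPort();
    }

    /** Accepts a connection, hands it off, and goes straight back to accepting. */
    private void acceptLoop() {
        while (!listener.isClosed()) {
            try {
                Socket client = listener.accept();
                processors.execute(() -> handle(client));
            } catch (IOException | RejectedExecutionException e) {
                // Expected while close() is running; the loop condition then ends the thread.
            }
        }
    }

    /** Runs on a processor thread, so slow clients never block accept(). */
    private void handle(Socket client) {
        try (Socket s = client; OutputStream out = s.getOutputStream()) {
            out.write("hello\n".getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // A failed client must not take the server down.
        }
    }

    @Override
    public void close() throws IOException {
        listener.close();
        processors.shutdown();
    }
}
```

The point of the design is that the acceptor thread does nothing except accept() and execute(), so a burst of incoming connections is limited by the listen backlog and the pool size rather than by how long each request takes.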


"blabbing like a narcissistic fool with a superiority complex" ~ N.A.
[How To Ask Questions On JavaRanch]
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Ok. I might have abused the English language a bit
The actual deployment structure will be like this:
- A billing integration "engine" is running on one server
- A couple of dozen integration modules are running on shared and dedicated servers somewhere (accessible over HTTP)
- The "engine" is the client for the "modules"
- There are a handful of external systems acting as clients for the "engine"
With a "token pool", I meant the following (an example):
- The "engine" receives 30 incoming requests from external systems
- 10 requests are blocking because the token pool only contains 20 tokens
- The requests that hold a token each do their magic using URL.openConnection() or something similar
- The requests return their tokens back into the pool after receiving the HTTP response for their URLConnection
- The 10 blocking requests finally get their turn on creating URLConnections as the first 20 requests finish one at a time
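For readers following along, the token pool described in the list above can be sketched with java.util.concurrent.Semaphore. This is only an illustration of the idea under discussion, not something HttpUnit or Commons HttpClient provides; the names TokenPool, withToken and availableTokens are made up for the sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Hypothetical token pool: at most `size` callers run a request at the same time. */
final class TokenPool {
    private final Semaphore tokens;

    TokenPool(int size) {
        // fair = true: blocked requests receive tokens in arrival order
        this.tokens = new Semaphore(size, true);
    }

    /** Blocks until a token is free, runs the request, and always returns the token. */
    <T> T withToken(Callable<T> request) throws Exception {
        tokens.acquire();
        try {
            return request.call();
        } finally {
            tokens.release();
        }
    }

    /** Number of tokens currently sitting in the pool. */
    int availableTokens() {
        return tokens.availablePermits();
    }
}
```

With a pool of 20, the 21st caller of withToken(...) simply waits inside acquire() until one of the first 20 finishes. The try/finally matters: a request that throws still returns its token, otherwise the pool would slowly drain.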
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Oh yeah, regarding the Windows limit of approximately 20 simultaneous connections: it might or might not be a problem in the production environment, as I don't yet know which OS the system will be deployed on (it could be a Windows server, Solaris, Linux, or HP-UX, for example), so I thought I'd better have a plan for tackling the problem if there is one. If the limit is somewhere in the thousands, I could just configure the token pool size to match.
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8927
    

Originally posted by Lasse Koskela:
Ok. I might have abused the English language a bit

I didn't get my tag line for making things clear, either.


With a "token pool", I meant the following (an example):

Much clearer. Limiting outgoing connections is a Bad Idea because it doesn't solve the problem and it limits concurrent access to the numerous data servers. You are introducing an artificial bottleneck in order to compensate for a misconfigured server. Your application won't scale. Imagine what happens when 100 users start hitting "refresh" at the same time. And nothing happens. So they all hit "refresh" again. And the connection queue has 200 requests in it. And they all hit "refresh" again. . . What happens when another app takes 10 of the connections on a server and you try to take 12? Kaboom! Just because you took great care not to overload the server doesn't mean everyone else will.
I vote to Solve The Real Problem. Get your Admin on it. Is it only one of the data servers that's choking?
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Hmm. I think you've got me convinced. However, the users won't be able to "refresh" because doing so in this scenario would cost them money (access through SMS) or time (redial and queue in an IVR application) or both.
Thanks for the insight, Joe.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Note that on Windows, I think it depends on what version you're using. I believe you can only get 20 or so simultaneous sockets using regular XP and maybe also XP Pro - but if you're using Windows Server 2003 or some other server version, the limit should be considerably higher. Dunno details; that's just my general recollection. Generally, if you're using a machine as a server, you want it running a real OS, or, failing that, Windows Server rather than XP/Me/98/whatever.


"I'm not back." - Bill Harding, Twister
Lee Schwartz
Greenhorn

Joined: Dec 11, 2003
Posts: 3
Hi,
I may be overlooking something here, but why even bother opening more than one connection? It seems that your problem of over 20 connections is for a single host. So why not use an HTTP pipeline and send as many requests over this pipeline as you'd like... remember HTTP is stateless, so semantically speaking using 20 connections is no different from using one.
On the other hand, there are various HTTP servers such as Simple [1] which can handle an unbounded number of connections and an unlimited number of requests per HTTP pipeline.
[1] http://simpleweb.sourceforge.net
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Thanks for the suggestion. However, there will always be more than one server to talk to using HTTP, and they are added/removed/relocated on the fly, which would probably cause some complexity in the "tunneling" component. How would a typical web container react if a single TCP connection is used for hours and hours to make a number of HTTP requests?
By the way, both sides (the client and all the servers) will run a J2EE compliant web container (if not a full application server) so we can't use any particular HTTP server.
Lee Schwartz
Greenhorn

Joined: Dec 11, 2003
Posts: 3
Hi Again,
I am assuming this still stands
- A billing integration "engine" is running on ONE server
- MANY integration "modules" accessible over HTTP
- The "engine" is the client for the "modules"
- There are a handful of external systems acting as clients for the "engine"
You seem to be using a "push" communication model. There are numerous examples of why this does not scale well. Why not try a "pull" communication model? This solves all your problems... I think.
For example, let's say the "engine" (as you have referred to it) is the single portal for the "modules", which as far as I have understood your explanation are numerous. So if there are numerous "modules" and all are moving on the fly, why not have them contact the "engine" like the SMS clients do? I'll explain... don't want to get too deep here.
1) If you used Jini as a means to discover the "modules" and the "engine", then the movement of the "modules" is not a problem. This means that both your "modules" and "engine" are Jini services of some kind.
2) Once a Jini infrastructure is established you need the "modules" to actively discover the "engine". This means that there is no need for dedicated servers; all are free to move at any time.
3) Once a "module" has contacted the "engine", the engine passes on a request and closes the connection. The module now has the request and is processing it ASYNCHRONOUSLY, that is, there is no open socket at this time.
4) When the "module" has finished processing the request it contacts the "engine" with the result.
5) The "engine" can then build up the results it receives from the "modules" as replies to the SMS clients.
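Steps 3 to 5 above can be sketched on the "engine" side roughly as follows. This leaves out Jini and HTTP entirely and only shows the bookkeeping: requests wait in a queue until a module pulls one, and results are matched back to the waiting caller by id. The names PullEngine, Job, submit, nextJob and complete are hypothetical, chosen just for this sketch.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical engine-side bookkeeping for a pull model. */
final class PullEngine {
    /** A request waiting for some module to pick it up. */
    static final class Job {
        final long id;
        final String payload;

        Job(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    private final BlockingQueue<Job> pending = new LinkedBlockingQueue<>();
    private final Map<Long, CompletableFuture<String>> waiting = new ConcurrentHashMap<>();
    private final AtomicLong ids = new AtomicLong();

    /** An external client (e.g. the SMS side) hands in a request and gets a future reply. */
    CompletableFuture<String> submit(String payload) {
        long id = ids.incrementAndGet();
        CompletableFuture<String> reply = new CompletableFuture<>();
        waiting.put(id, reply);
        pending.add(new Job(id, payload));
        return reply;
    }

    /** Step 3: a module pulls the next job; returns null if none arrives within the timeout. */
    Job nextJob(long timeoutMillis) throws InterruptedException {
        return pending.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Step 4: a module reports a result; false if the id is unknown or already completed. */
    boolean complete(long id, String result) {
        CompletableFuture<String> reply = waiting.remove(id);
        return reply != null && reply.complete(result);
    }
}
```

In a servlet setup, nextJob(...) would sit behind whatever URL the modules poll and complete(...) behind the URL they post results to; no socket stays open between those two calls, which is the point of step 3.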
As far as I know SMS is connectionless, so open sockets are no longer a problem. If XP can't handle requests like this then there is something seriously wrong! By the way, try Solaris: it is built especially for this, and although I'm not a big fan of it, it is seriously good at it.

That would certainly solve the movement of the "modules". So when the "modules" change address or start up on a new host they use Jini discovery to discover where the "engine" is located. Once t
Lasse Koskela
author
Sheriff

Joined: Jan 23, 2002
Posts: 11962
    
Thanks again. Using Jini would be an interesting exercise. Still, the restrictions are that the applications must run on a web container and that all communication between nodes must happen over HTTP (one reason is the network infrastructure, another is that some parts of the system are written in PHP). The push/pull question is something I'll definitely have to look into.
Lee Schwartz
Greenhorn

Joined: Dec 11, 2003
Posts: 3
I see... Jini really just removes headaches. If the "engine" is static you could simply have the "modules" do something like this:
new URL("http://www.engine.com/PullServlet").openConnection(); // in Java (the URL needs its protocol, or the constructor throws MalformedURLException)
fsockopen("www.engine.com", 80); # in PHP (fsockopen takes a host name only; the request for /PullServlet is then written to the socket)
This could send a POST to the "engine" and retrieve an HTTP response. If you set the connection to have the HTTP connection close semantics, the socket will close once the transaction has finished. Or better still, you could send several SMS requests within the response from the "engine" to save the cost of opening a TCP connection.
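As a rough illustration of the POST with connection-close semantics, here is a sketch using the JDK's HttpURLConnection. The class name PullClient, the method name post and the text/xml content type are assumptions for this example; the URL passed in would be whatever the "engine" actually exposes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper a "module" could use to POST XML to the "engine". */
final class PullClient {
    private PullClient() {
    }

    /** Sends the XML as a POST body and returns the response body as a string. */
    static String post(String url, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // we are going to write a request body
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        // Ask for the TCP connection to be closed after this one exchange
        conn.setRequestProperty("Connection", "close");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        // getInputStream() throws IOException for 4xx/5xx responses
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                body.write(buf, 0, n);
            }
            return new String(body.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Without the Connection: close header, HttpURLConnection normally keeps the socket alive for reuse against the same host, which may or may not be what you want when the "modules" move around.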
If it is servlets you're using, they can use the ServletContext as an IPC mechanism to pass information back and forth, so that the SMS clients do something like this:
new URL("http://www.engine.com/PushServlet").openConnection();
 