John, how can we accomplish that? Is it by enabling clustering in the web server? And if I want to call a servlet MyServlet on that server, how does the server choose which container on which JVM will respond?
I'm not John, but I'll try to answer anyway.
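For Tomcat, you can simply install a second instance, then edit server.xml and change the port numbers so the two instances don't conflict. Look for the port attribute on the Connector element. The second instance also needs a different shutdown port (the port attribute on the Server element), or it won't start alongside the first. server.xml is in the conf folder under the directory where you installed Tomcat.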
At the servlet level, servlets don't choose where to respond. Each one simply handles a request/response on the connection it was given. In other words, at the application level there's no need to think about this.
For a servlet container, what clustering does is replicate the user's session object on each web server. Say you had 4 web servers and you set up clustering using Tomcat's built-in clustering. What happens is that whenever your code sets an attribute on the session object, it gets serialized and sent to each web server, so each web server has a copy of that session object. If a web server goes down, the load balancer (another piece of this puzzle) will redirect the request to another web server. Since that second web server has a copy of the session object, the user won't notice any problem. The request will hit the new server and the session will be there already.
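To make the "serialized and sent" part concrete, here is a minimal sketch in plain Java of what has to happen to a session attribute before another node can hold a copy of it. It is not Tomcat's actual replication code. ShoppingCart and SessionCopyDemo are made-up names for illustration, and the byte array stands in for the network hop between servers. The practical takeaway is that anything you put in the session should implement java.io.Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical session attribute; the class name and fields are illustrative only.
class ShoppingCart implements Serializable {
    private static final long serialVersionUID = 1L;
    final String userId;
    final int itemCount;

    ShoppingCart(String userId, int itemCount) {
        this.userId = userId;
        this.itemCount = itemCount;
    }
}

class SessionCopyDemo {
    // Turn the attribute into bytes, as must happen before it can leave the JVM.
    static byte[] serialize(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Rebuild the attribute from bytes, as the receiving node would.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ShoppingCart original = new ShoppingCart("alice", 3);
        // The byte array plays the role of the message sent to the other web server.
        ShoppingCart copy = (ShoppingCart) deserialize(serialize(original));
        System.out.println(copy.userId + " has " + copy.itemCount + " items");
    }
}
```

In a real servlet the equivalent step is just session.setAttribute("cart", cart); the container takes care of the copying. If ShoppingCart did not implement Serializable, serialize() above would throw NotSerializableException, which is the same kind of failure you would run into when the container tries to replicate the session.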
Load balancing is the other part of this. You can use either a software load balancer (Apache can be used for this) or a hardware load balancer.
All the load balancer basically does is check whether a particular web server can respond to the request. If the web server it's trying to talk to is not responding, it redirects the request to another web server.
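The failover behaviour just described can be sketched in a few lines of Java. This is only a toy model of the idea, not how Apache or a hardware balancer is implemented. FailoverBalancer, the backend names, and the isResponding check are all assumptions made up for the example. A real balancer would probe the servers over the network and would normally spread load across them as well.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

// Toy failover picker: returns the first backend that passes the health check.
class FailoverBalancer {
    private final List<String> backends;
    private final Predicate<String> isResponding; // hypothetical health check

    FailoverBalancer(List<String> backends, Predicate<String> isResponding) {
        this.backends = backends;
        this.isResponding = isResponding;
    }

    // Walk the list in order and skip any server that is not responding.
    Optional<String> choose() {
        for (String backend : backends) {
            if (isResponding.test(backend)) {
                return Optional.of(backend);
            }
        }
        return Optional.empty(); // every server is down
    }
}

class FailoverDemo {
    public static void main(String[] args) {
        Set<String> down = new HashSet<>();
        FailoverBalancer balancer = new FailoverBalancer(
                List.of("web1:8080", "web2:8081"),
                backend -> !down.contains(backend)); // stand-in for a real probe

        System.out.println(balancer.choose()); // web1 is up, so it is chosen
        down.add("web1:8080");                 // simulate web1 going down
        System.out.println(balancer.choose()); // the request now goes to web2
    }
}
```

Combined with session replication, this is why the user doesn't notice the switch: whichever server choose() lands on already has a copy of the session.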
Hope this helps.