If there are 100 requests but the server can handle only 10 at a time, how are the remaining requests handled? Every request should be processed. What are the different ways to achieve this? (Is it handled by the application server itself?)
Typically the web server handles this, but the lines have blurred, and it is possible for the application to manage it too.
If supported, the server has a configurable number of worker threads and queue threads. Workers are the active threads that serve requests. When all of them are busy, incoming requests are passed to the queue threads, which hold them until a worker becomes available. Once the workers and the queue threads are all in use, further requests are rejected until the system catches up. A request can also leave the queue before it reaches a worker if the client closes the connection.
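To make the worker/queue/reject behaviour concrete, here is a minimal sketch in Python. It is a toy model, not how any particular web server implements it: the names `BoundedServer`, `simulate` and `handler` are made up for illustration, and the sizes (10 workers, 20 queue slots) are assumptions chosen to match the numbers in the question. A semaphore counts how many requests the server is holding in total (running plus waiting), and anything beyond that is rejected.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedServer:
    """Toy model: `workers` active threads plus `queue_slots` waiting places."""

    def __init__(self, workers, queue_slots):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        # One permit per request the server may hold (running or queued).
        self._permits = threading.BoundedSemaphore(workers + queue_slots)

    def submit(self, handler, *args):
        """Return a Future if the request is accepted, or None if rejected."""
        if not self._permits.acquire(blocking=False):
            return None  # workers and queue are both full
        try:
            future = self._pool.submit(handler, *args)
        except BaseException:
            self._permits.release()
            raise
        # Free the permit when the request finishes (or is cancelled).
        future.add_done_callback(lambda _: self._permits.release())
        return future

    def shutdown(self):
        self._pool.shutdown(wait=True)


def simulate(total=100, workers=10, queue_slots=20):
    """Send `total` requests at once; return (processed, rejected)."""
    gate = threading.Event()  # keeps every handler busy until released

    def handler(i):
        gate.wait()
        return i

    server = BoundedServer(workers, queue_slots)
    futures = [server.submit(handler, i) for i in range(total)]
    accepted = [f for f in futures if f is not None]
    rejected = total - len(accepted)
    gate.set()  # let the system catch up
    results = [f.result() for f in accepted]
    server.shutdown()
    return len(results), rejected


if __name__ == "__main__":
    print(simulate())
```

With these assumed sizes, 10 requests run, 20 wait, and the other 70 are rejected. If every request really must be processed, one option is to make `acquire` blocking so the caller waits instead of being turned away, at the cost of slower clients. Calling `cancel()` on a Future that is still waiting roughly models a client closing the connection before a worker picks the request up.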