What is the ideal design for a server process in Linux that handles concurrent socket I/O? How do you handle incoming requests? What if multiple clients start requesting at the same time? How can I optimize for scalability?
If you distill this down to its essence (what are my threading and I/O model options for a scalable socket-based server?), this is a great question. Engineers tend to look at this as a threading design question, i.e. how many threads, but the real question is how you want to do I/O. Everything else falls out from that.

You have four basic I/O models:

1. A process per request, created via fork or handed off to a process pool, with blocking I/O.
2. A thread per connection, with blocking I/O.
3. A pool of threads, each thread handling multiple connections, with asynchronous I/O.
4. A pool of threads, each thread handling multiple connections, event-driven: I/O multiplexed via select or poll, with nonblocking sockets.

Let's go through each of these.

In the beginning, there was process-per-request. This is the classic Apache model: Apache waits on a listener socket and, for each accepted connection, calls fork; the new process handles the request from start to finish, performing blocking I/O. This model is simple to understand and easy to program. Blocking I/O is intuitive; it is what most people do. The downside of this model is the overhead of creating and maintaining all of those processes. A process pool, instead of calling fork on each accept, mitigates this somewhat, but you still waste a lot of memory. Worse, it doesn't scale: a modern machine can easily handle thousands or even tens of thousands of requests in flight, and that is a lot of processes to create and manage.

Threads are an obvious way to improve the scalability of the process-per-request model. Enter the thread-per-connection model. With this approach, you create a new thread for each incoming connection (or serve it with an existing thread from a pool), and that thread processes the connection from start to finish. As with the process model, it may issue blocking I/O and generally do whatever it wants. Aside from the new risk of race conditions on shared data, the threading model is just like the process model.
It is simple to understand and easy to program. Unfortunately, it still doesn't scale well. Managing thousands or tens of thousands of threads is more palatable than managing the same number of processes, but that is still a lot of threads.

What do we do? How do we scale to the very large number of connections a modern machine can handle without using so many damn threads or processes? The key is an obvious observation: the threads (or processes) spend most of their time waiting on I/O, reading and writing the socket, reading files (for static serving), or waiting for backend RPCs or database queries to return (for dynamic serving). With nonblocking I/O and an event-driven model that breaks request processing into a series of callbacks, each thread can service many connections. Indeed, with good control flow, a well-designed server needs no more threads than the number of processors on the system: instead of N threads for N requests in flight, a server might need only 8 or 12 threads.

(History skipped over the third option above, the asynchronous I/O solution. Most designs eschew it: many engineers despise asynchronous I/O, and the asynchronous I/O support in Linux doesn't win any awards. Asynchronous I/O works best with language-level support; Go is a promising contender there.)

Enter the preferred solution: a pool of threads, each thread handling multiple connections, with I/O multiplexed via select or poll and made nonblocking, and request processing broken up into a series of callbacks. This is the so-called event-driven model that is quite popular right now. Web servers and frameworks such as Nginx, node.js, and Tornado use this model.

If writing a server today, I'd first look at systems such as node.js to see if they satisfy your needs. If not, build an event-driven, nonblocking server with a pool of threads, each thread handling multiple connections. You'll scale best with such an approach.