Scaling with Single Threading
The free lunch is over. To speed up applications, we are told we must write multithreaded programs and avoid mutable state. Functional programming can help with its emphasis on immutable state. There’s also Erlang with the Actor model, or Clojure with its software transactional memory. One other option to consider is single threading your code. Following are a few examples where single threading was used to scale applications.
When Sun built their new GUI toolkit, they chose to single thread all GUI events through an event dispatch thread. One of the reasons they gave for choosing this architecture was low overhead: toolkits that use threads can spend a substantial amount of time and space managing locks.
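To make the dispatch-thread idea concrete, here is a minimal sketch in plain Java. It is not the toolkit's own code: the class DispatchThreadSketch and its methods are hypothetical names, and a single-thread executor stands in for the event dispatch thread. Many threads post events, but only one thread ever touches the state, so the list needs no locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the dispatch-thread idea; this is not the toolkit's own code.
class DispatchThreadSketch {
    // Plain, unsynchronized state: only the dispatch thread ever reads or writes it.
    private final List<String> handled = new ArrayList<>();
    // A single-thread executor plays the role of the event dispatch thread.
    private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();

    // Any thread may post an event; the handler itself runs on the dispatch thread.
    void post(String event) {
        dispatcher.execute(() -> handled.add(event));
    }

    // Reads also go through the dispatch thread, so no lock guards the list.
    int handledCount() {
        Callable<Integer> read = handled::size;
        try {
            return dispatcher.submit(read).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Starts several producer threads, waits for them, and returns the number of events handled.
    static int run(int producers, int eventsPerProducer) {
        DispatchThreadSketch sketch = new DispatchThreadSketch();
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                for (int i = 0; i < eventsPerProducer; i++) {
                    sketch.post("producer-" + id + "-event-" + i);
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            return sketch.handledCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            sketch.dispatcher.shutdown();
        }
    }
}
```

Calling DispatchThreadSketch.run(4, 1000) should report every posted event as handled, even though the ArrayList is never synchronized. The only coordination is the executor's internal queue.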
Node.js is the latest hot technology. Most web and application servers spawn a new thread for each connection. Each of those threads has memory overhead, which limits the number of connections a server can handle. Instead of threads, Node.js uses a single I/O event queue and fires an event handler for each connection. All user code runs in a single thread on the event queue. This limits the memory footprint of the server and minimizes the chances of deadlock. Nginx is a web server built on a similar event-driven architecture, and it has shown some amazing performance with little overhead.
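The event-queue model can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the connections are simulated (no real sockets or I/O), and EventLoopSketch, Connection, and serve are hypothetical names, not part of Node.js or any library. Each connection is a small object rather than a thread, and handlers that have more work to do put a follow-up event back on the queue instead of blocking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of a single-threaded event loop; connections are simulated, no real I/O.
class EventLoopSketch {
    // Per-connection state is one small object instead of a whole thread stack.
    static final class Connection {
        final int id;
        int chunksRemaining;

        Connection(int id, int chunks) {
            this.id = id;
            this.chunksRemaining = chunks;
        }
    }

    // The single event queue; only the loop thread touches it, so no locking is needed.
    private final Queue<Runnable> events = new ArrayDeque<>();
    // Connection ids in the order they finished.
    private final List<Integer> completed = new ArrayList<>();

    // Registers a connection by queueing its first "data" event.
    void accept(Connection c) {
        events.add(() -> onData(c));
    }

    // Handles one chunk, then either re-queues the connection or marks it done.
    private void onData(Connection c) {
        c.chunksRemaining--;
        if (c.chunksRemaining > 0) {
            events.add(() -> onData(c));
        } else {
            completed.add(c.id);
        }
    }

    // The loop itself: run handlers one at a time until nothing is left.
    void runUntilIdle() {
        Runnable event;
        while ((event = events.poll()) != null) {
            event.run();
        }
    }

    // Serves one simulated connection per array entry and returns the completion order.
    static List<Integer> serve(int[] chunksPerConnection) {
        EventLoopSketch loop = new EventLoopSketch();
        for (int id = 0; id < chunksPerConnection.length; id++) {
            loop.accept(new Connection(id, chunksPerConnection[id]));
        }
        loop.runUntilIdle();
        return loop.completed;
    }
}
```

With serve(new int[] {3, 1, 2}), the connections finish in the order 1, 2, 0: the short requests are not stuck behind the long one, yet only one thread ever runs. The flip side is that a handler that blocks or computes for a long time stalls every other connection.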
For those who prefer sticking with Java, there is an attempt to port Node.js to Java. Another alternative is to build your server using Netty, which is an asynchronous event-driven network application framework built using NIO.
LMAX is a retail financial trading platform built using Java. A trading application like this needs very low latency, because a large volume of trades must be processed quickly. The LMAX team ended up running all the business logic on a single thread with all of the data in memory. This single thread processes 6 million orders per second using commodity hardware. Disruptors receive the input messages, then unmarshal, replicate, and journal them before passing them along to the single-threaded business logic processor. When the business logic processor finishes with a message, it passes the result to an output disruptor that publishes it. Martin Fowler wrote an excellent article on this architecture. I found the following footnote interesting:
An interesting side-note. While the LMAX team shares much of the current interest in functional programming, they believe that the OO approach provides a better approach for this kind of problem. They’ve noticed that as they work to write faster code, they move away from a functional style towards OO style. Partly this because of the copying of data that functional styles require to maintain immutability. But it’s also because objects provide a better model of a complex domain with a richer choice of data structures.
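The overall shape of the LMAX pipeline (input queue, single-threaded business logic with in-memory state, output queue) can be sketched as follows. This is a rough illustration, not LMAX's code: PipelineSketch, Order, and process are hypothetical names, and standard java.util.concurrent queues stand in for the Disruptor, which is a different structure (a ring buffer) that is not reproduced here. Unmarshalling, replication, and journaling are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the input -> business logic -> output pipeline.
// Standard blocking queues stand in for the Disruptor; they are NOT how the Disruptor works.
class PipelineSketch {
    static final class Order {
        final String symbol;
        final long quantity;

        Order(String symbol, long quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }
    }

    // Sentinel that tells the business logic thread to stop (compared by identity).
    private static final Order STOP = new Order("", 0);

    // Feeds the orders through a single business logic thread and returns its output messages.
    static List<String> process(List<Order> orders) {
        BlockingQueue<Order> input = new ArrayBlockingQueue<>(1024);
        BlockingQueue<String> output = new LinkedBlockingQueue<>();

        Thread businessLogic = new Thread(() -> {
            // All state lives in memory and is touched by this one thread only,
            // so it can be a plain mutable HashMap.
            Map<String, Long> positions = new HashMap<>();
            try {
                for (Order o = input.take(); o != STOP; o = input.take()) {
                    long total = positions.merge(o.symbol, o.quantity, Long::sum);
                    output.put(o.symbol + "=" + total);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        businessLogic.start();

        try {
            for (Order o : orders) {
                input.put(o);
            }
            input.put(STOP);
            businessLogic.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(output);
    }
}
```

Note that the business logic mutates its HashMap in place rather than copying an immutable structure on every order, which is the trade-off the footnote describes. Because a single thread owns the map, that mutation is safe without locks.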
While a single threaded architecture may not fit every application, it is an important architecture to keep in your tool belt. If you are having trouble scaling your application, it is worth considering these examples.