Development

It seems that C10K is old hat these days and that people are aiming a little higher. I’ve seen several questions on StackOverflow.com (and had an equal number of direct emails) asking how to achieve one million active TCP connections on a single Windows Server box. Or, in a more roundabout way, what is the theoretical maximum number of active TCP connections that a Windows Server box can handle?

I’ve spent the last few days implementing the WebSocket protocol (well, two versions of the draft standard actually) and integrating it into an existing server for one of our clients. This has proved to be an interesting exercise. The protocol itself is pretty simple but, as ever, the devil is in the detail. I now have server-side code that deals with both the Hixie 76 draft and the HyBi 03 draft of the protocol.

As the recent spate of bug fix and patch releases shows, I’m not scared of talking about the bugs that I find in the code of The Server Framework and pushing fixes out quickly. It’s my belief that the most important thing to get out of a bug report is an improved process which will help prevent similar bugs from occurring in future. The only way to achieve that is to be open about the bugs you find and equally open about how you then address them and try to prevent similar issues.

As I mentioned last time, supporting a large number of concurrent connections on a modern Windows operating system is reasonably straightforward if you get your initial design right: use an I/O Completion Port based design; minimise context switches, data copies and memory allocation; and avoid lock contention. The Server Framework gives you this as a starting point, and you can often use one of the many complete and fully functional real-world example servers to provide you with a whole server shell, complete with easy performance monitoring and SSL security, where you simply have to fill in your business logic.

Using a modern Windows operating system it’s pretty easy to build a server system that can support many thousands of connections if you design the system to use the correct Windows APIs. The key to server scalability is to always keep in mind the Four Horsemen of Poor Performance as described by Jeff Darcy in his document on High Performance Server Architecture. These are: data copies, context switches, memory allocation and lock contention. I’ll look at context switches first, as IMHO this is where outdated designs often rear their head first.

As I mentioned in the release notes for v6.3, I’ve added some code to prevent potential recursion issues if certain performance improvements are enabled. In Windows Vista and later it’s possible to set the FILE_SKIP_COMPLETION_PORT_ON_SUCCESS flag on a socket using SetFileCompletionNotificationModes(). When this flag is set, an overlapped operation can complete “in-line” and the completion can be handled on the thread that issued the operation rather than on one of the threads that is servicing the I/O completion port that is associated with the socket.
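To make the recursion risk concrete: if every completion handler issues the next read, and every read completes in-line, each handler runs on top of the previous one and the stack grows without bound. The following is a minimal, portable sketch of one possible guard, not The Server Framework's actual mechanism. It makes no Windows calls; `CompletionDispatcher`, `simulate_reads` and `ReadStats` are hypothetical names invented for illustration, and a real server would more likely keep the depth counter per thread and post deferred work back to the completion port rather than to a local queue.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical dispatcher: runs in-line completions directly until a depth
// limit is reached, then queues them so the call stack can unwind first.
class CompletionDispatcher {
public:
    explicit CompletionDispatcher(int maxInlineDepth) : maxDepth_(maxInlineDepth) {
        assert(maxInlineDepth > 0);
    }

    // Called when an operation has completed synchronously on this thread.
    void complete(std::function<void()> handler) {
        if (depth_ >= maxDepth_) {
            deferred_.push_back(std::move(handler));  // too deep: defer
            return;
        }
        run(handler);
        if (depth_ == 0) {
            // Back at the outermost call: work through the deferred handlers.
            while (!deferred_.empty()) {
                std::function<void()> next = std::move(deferred_.front());
                deferred_.pop_front();
                run(next);
            }
        }
    }

    int peakDepth() const { return peakDepth_; }

private:
    void run(const std::function<void()>& handler) {
        ++depth_;
        if (depth_ > peakDepth_) peakDepth_ = depth_;
        handler();
        --depth_;
    }

    int maxDepth_;
    int depth_ = 0;
    int peakDepth_ = 0;
    std::deque<std::function<void()>> deferred_;
};

struct ReadStats {
    int handled;    // completions processed
    int peakDepth;  // deepest nesting of handlers observed
};

// Simulates a read loop in which every read completes in-line and the
// handler immediately issues the next read.
ReadStats simulate_reads(int totalReads, int maxInlineDepth) {
    CompletionDispatcher dispatcher(maxInlineDepth);
    int handled = 0;
    std::function<void()> onRead = [&] {
        ++handled;
        if (handled < totalReads) dispatcher.complete(onRead);
    };
    dispatcher.complete(onRead);
    return ReadStats{handled, dispatcher.peakDepth()};
}
```

Calling `simulate_reads(100000, 4)` processes all of the simulated reads while never nesting handlers more than four deep; without the depth check the same loop would recurse once per read.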

I’m currently re-reading “High Performance Server Architecture” by Jeff Darcy and he has a lot of sensible stuff to say about avoiding context switches and how my multiple thread pool design, whilst conceptually good, is practically not so good. In general I agree with him, but often the design provides good enough performance and it’s easy to compose from the various classes in The Server Framework. Explicitly managing the threads that could run, using a semaphore that only allows a number of threads equal to or less than your number of cores to do work at once, is a nice idea but one that adds complexity to the workflow, as you need to explicitly acquire and release the semaphore around your blocking operations.
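The semaphore idea can be sketched in portable C++ as follows. This is an illustrative sketch under stated assumptions rather than code from The Server Framework or from Darcy's article: `Semaphore`, `run_blocking` and `run_limited` are hypothetical names, the "work" is a stand-in, and a production version would want an exception-safe guard around the release and re-acquire.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (hypothetical helper, not a framework class).
class Semaphore {
public:
    explicit Semaphore(int count) : count_(count) {}
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

// Gives up the run slot for the duration of a blocking call, then takes it
// back. This is the extra bookkeeping the paragraph above refers to.
template <typename Fn>
void run_blocking(Semaphore& slots, Fn&& blockingOp) {
    slots.release();
    blockingOp();
    slots.acquire();
}

// Starts `tasks` threads but lets at most `maxActive` of them compute at once.
// Returns the peak number of threads observed computing simultaneously.
int run_limited(int tasks, int maxActive) {
    Semaphore slots(maxActive);
    std::atomic<int> active{0};
    std::atomic<int> peak{0};

    auto compute = [&] {
        const int now = ++active;
        assert(now <= maxActive);  // only slot holders ever get here
        int seen = peak.load();
        while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
        std::this_thread::yield();  // stand-in for CPU-bound work
        --active;
    };

    std::vector<std::thread> workers;
    for (int i = 0; i < tasks; ++i) {
        workers.emplace_back([&] {
            slots.acquire();          // wait for permission to run
            compute();
            run_blocking(slots, [] {  // do not hold a slot while blocked
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            });
            compute();
            slots.release();
        });
    }
    for (auto& t : workers) t.join();
    return peak.load();
}
```

With `run_limited(16, 2)`, sixteen threads exist but no more than two are ever inside `compute()` at the same moment. The cost is visible in the worker body: every blocking call has to be wrapped so that the slot is handed back while the thread waits.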

OpenSSL is an open source implementation of the SSL and TLS protocols. Unfortunately it doesn’t play well with Windows-style asynchronous sockets. This article, previously published in Windows Developer Magazine and now available on the Dr. Dobb’s site, provides a simple connector that enables you to use OpenSSL asynchronously. Integrating OpenSSL with asynchronous sockets is similar to integrating it with overlapped I/O and I/O completion port based designs, and so the ideas behind the code discussed in the article were then used as part of the original design for The Server Framework’s OpenSSL option pack.

One of my clients has recently required .Net 4.0 hosting support and so most of the changes in the CLR Hosting Tools library in 6.3 have been driven by them. The main new feature is the optional use of the .Net 4.0 hosting API. This allows us to host .Net 4.0 as well as earlier CLRs and also allows us to host multiple different CLRs in a single process. The new hosting API is supported via the CCLRHostFactory object which uses the CCLRMetaHost and CCLRRuntimes objects to present a consistent interface to both the .

The development of WASP has been acting as a bit of an internal driver for new feature development in the 6.3 release of The Server Framework. Sitting down to develop a service that was easy to use for a mass market exposed some small holes in the 6.2 release. Nothing too serious, but pretty soon after putting together the first service shell of the WASP application I had a list of nice-to-have additions for the Service Tools Library.

One Million TCP Connections...

The WebSocket protocol

My approach to bugs

How to support 10,000 or more concurrent TCP connections - Part 2 - Perf tests from Day 0

How to support 10,000 or more concurrent TCP connections

Testing complex server code

Some thoughts on that two thread pool server design

Using OpenSSL with Asynchronous Sockets

Changes to the CLR Hosting Tools library in 6.3

Changes to the Service Tools library in 6.3