Asynchronous Events

How to support 10,000 or more concurrent TCP connections - Part 2 - Perf tests from Day 0

As I mentioned last time, supporting a large number of concurrent connections on a modern Windows operating system is reasonably straightforward if you get your initial design right: use an I/O Completion Port based design, minimise context switches, data copies and memory allocation, and avoid lock contention… The Server Framework gives you this as a starting point, and you can often use one of the many complete and fully functional real-world example servers to provide you with a whole server shell, complete with easy performance monitoring and SSL security, where you simply have to fill in your business logic.

How to support 10,000 or more concurrent TCP connections

Using a modern Windows operating system it’s pretty easy to build a server system that can support many thousands of connections if you design the system to use the correct Windows APIs. The key to server scalability is to always keep in mind the Four Horsemen of Poor Performance as described by Jeff Darcy in his document on High Performance Server Architecture. These are: data copies, context switches, memory allocation and lock contention. I’ll look at context switches first, as IMHO this is where outdated designs often rear their head first.

WASP command line options

As you’ve seen from some of the earlier tutorials, WASP has quite a few command line parameters that can change how it runs. You can run WASP as a normal executable or install it as a Windows Service. The complete set of command line options is displayed if you run WASP with /help or with an option that it doesn’t understand, but I thought I’d list them all here for completeness and so that I can explain in a little more detail what each one does.

More complex message framing

So far the tutorials have focused on a simple length prefixed message type. This is probably the easiest message in the world to process; the message framing is very simple and there’s hardly anything to do in your message framing DLL. Unfortunately, not all protocols are this simple to parse. Another common real-world protocol is a line-based protocol that is delimited by a terminating character, or characters. One such protocol is the POP3 protocol, which works in terms of commands that are delimited by the CR LF sequence.
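To make the CR LF framing idea concrete, here is a minimal sketch of a line framer in portable C++. It is not WASP SDK code and it does not use the WASP entry points; the class name LineFramer and its Append() and PendingBytes() methods are hypothetical names chosen for illustration. It simply accumulates bytes as they arrive from the stream and hands back each complete line once its terminator has been seen.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper, not part of the WASP SDK: buffers stream data and
// extracts complete commands that are terminated by CR LF.
class LineFramer
{
public:
    // Append newly received bytes and return every complete line found so
    // far, without the trailing CR LF. Incomplete data stays buffered until
    // the rest of the line arrives in a later call.
    std::vector<std::string> Append(std::string_view data)
    {
        buffer_.append(data);

        std::vector<std::string> lines;
        std::size_t start = 0;

        for (;;)
        {
            const std::size_t end = buffer_.find("\r\n", start);

            if (end == std::string::npos)
            {
                break; // no complete line left in the buffer
            }

            lines.emplace_back(buffer_, start, end - start);

            start = end + 2; // skip over the CR LF terminator
        }

        buffer_.erase(0, start); // keep only the unfinished tail

        return lines;
    }

    // Number of bytes that are waiting for a terminator.
    std::size_t PendingBytes() const { return buffer_.size(); }

private:
    std::string buffer_;
};
```

Because TCP is a byte stream, a single read may contain half a command, or several commands at once, so the framer has to cope with a terminator that is split across two reads. A real server would probably also cap the amount of pending data so that a client that never sends CR LF cannot grow the buffer without limit.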

WASP's config file

As you have discovered if you’ve been following the tutorials, WASP is configured using an XML file. This file can either live in the same directory as the WASP executable or, for when you’re running WASP as a Windows Service, it can live in a place that is configured in the registry. The file is pretty simple and we’ve covered most of the options in the various tutorials but there are some configuration options that we haven’t touched on yet and it seems sensible to have one place to look for details of all of the options that you can configure in the config file.

WASP plugin entry points

By now you’ve probably taken a look inside the WASP SDK header, WASPDLLEntryPoints.h, and seen all of the various plugin entry points that you can export from your plugin. This tutorial will explain what each of them is for and how you use them, and will present a simple plugin which uses all of the entry points and logs its actions to WASP’s debug log. As you’ve seen from the previous tutorials, a WASP plugin can be either a message framing DLL or a message handling DLL, or both, depending on the entry points that it exports.

WASP Server instances

A single WASP plugin can be loaded by multiple end points to provide the same server on multiple ports. A plugin could, for example, be configured on one end point to provide services to the internal network and on another end point to provide services to the internet. Alternatively, in later WASP releases, a single plugin may be used to provide services over an insecure link on one end point and via an SSL protected link on another.

Testing complex server code

As I mentioned in the release notes for v6.3 here, I’ve added some code to prevent potential recursion issues if certain performance improvements are enabled. In Windows Vista and later it’s possible to set the FILE_SKIP_COMPLETION_PORT_ON_SUCCESS flag on a socket using SetFileCompletionNotificationModes(). When this flag is set an overlapped operation can complete “in-line” and the completion operation can be handled on the thread that issued the operation rather than on one of the threads that is servicing the IO completion port that is associated with the socket.

Calling WASP functions from your plugin DLL

So far our simple example WASP plugins have all used OnReadCompletedEx() which gives you both an input and an output buffer and assumes that you generate a single response to each inbound message. It also assumes that you won’t write more data than will fit in a single I/O buffer. Whilst this is suitable for some server designs, it’s quite restrictive. Most plugins will probably use a combination of OnReadCompleted() and the WASP callback function writeToConnection().

Some thoughts on that two thread pool server design

I’m currently re-reading “High Performance Server Architecture” by Jeff Darcy and he has a lot of sensible stuff to say about avoiding context switches and how my multiple thread pool design, whilst conceptually good, is practically not so good. In general I agree with him, but often the design provides good enough performance and it’s easy to compose from the various classes in The Server Framework. Explicitly managing the threads that could run, using a semaphore that only allows a number of threads that is equal to or less than your number of cores to do work at once, is a nice idea, but one that adds complexity to the workflow as you need to explicitly acquire and release the semaphore as you perform your blocking operations.
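As a rough illustration of the semaphore idea, here is a minimal sketch in portable C++. None of this is code from The Server Framework or from Jeff Darcy’s document; RunnableSlots and PeakRunnableWorkers() are hypothetical names, the sleep stands in for a real blocking operation, and the semaphore is hand-rolled from a mutex and a condition variable (in C++20 std::counting_semaphore could be used instead). Each worker acquires a slot before doing CPU work and gives it up before it blocks, and the function reports the largest number of workers that were ever inside the CPU section at the same time.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a counting semaphore whose count is the number of
// threads that are allowed to be doing CPU work at once.
class RunnableSlots
{
public:
    explicit RunnableSlots(int slots) : available_(slots) {}

    // Block until a slot is free, then take it.
    void Acquire()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return available_ > 0; });
        --available_;
    }

    // Hand the slot back and wake one waiting thread.
    void Release()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++available_;
        }
        changed_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    int available_;
};

// Runs `workers` threads that alternate between CPU work and a blocking
// call, and returns the peak number of threads seen in the CPU section.
int PeakRunnableWorkers(int workers, int iterations, int slots)
{
    RunnableSlots limiter(slots);
    std::atomic<int> running{0};
    std::atomic<int> peak{0};

    auto work = [&] {
        for (int i = 0; i < iterations; ++i)
        {
            limiter.Acquire(); // wait for permission to use a core

            const int now = ++running;
            int seen = peak.load();
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {}

            // ... CPU bound work would go here ...

            --running;
            limiter.Release(); // give up the slot before blocking

            // Stand-in for a blocking operation such as a database call.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    };

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i)
    {
        pool.emplace_back(work);
    }
    for (auto &thread : pool)
    {
        thread.join();
    }

    return peak.load();
}
```

Calling PeakRunnableWorkers(8, 20, 2) runs eight workers but should never report more than two in the CPU section at once. The extra complexity that the excerpt mentions is visible even in this toy: every blocking call has to be bracketed by a Release() and a later Acquire(), and forgetting either one would leak or starve slots.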