16.8.11

Concurrency & scalability

If you are working on a concurrent application you have to keep in mind one very important non-functional requirement: scalability.

If you are writing a multi-threaded application that handles one job per thread, for example a server that spawns a thread per connection, then once the server is handling a few thousand connections your OS will spend a significant portion of its time context switching rather than processing the connections. This causes performance to plateau after a few thousand concurrent threads. If you want your application to be both scalable and concurrent, this article is a good read: the C10K problem

If your application is simple, in the sense that there is very little processing on each connection between requests and responses, you can choose between a simple reactor and a proactor.

If your application does a lot of processing on each connection and is going to handle thousands of such connections, you can go in for a proactor infrastructure, with your connection handling logic expressed as a finite state machine such as the Boost Meta State Machine. For potentially slow activities like disk reads, Boost ASIO gives you asynchronous read and write from within your state machine, since you cannot make blocking calls inside it. The Boost MSM documentation has a lot of information on how such a state machine works.
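
As a rough sketch of the read side of such a connection (connection_fsm and data_received are placeholder names, not library types; the object is meant to be owned by a boost::shared_ptr created by the acceptor): the read never blocks, and the completion handler simply feeds an event into the state machine.

    #include <boost/asio.hpp>
    #include <boost/array.hpp>
    #include <boost/bind.hpp>
    #include <boost/enable_shared_from_this.hpp>

    class connection : public boost::enable_shared_from_this<connection>
    {
    public:
        explicit connection(boost::asio::io_service& io) : socket_(io) {}

        boost::asio::ip::tcp::socket& socket() { return socket_; }

        void start()
        {
            // Asynchronous read: returns immediately, the handler runs
            // later on the io_service thread when data is available.
            socket_.async_read_some(
                boost::asio::buffer(buffer_),
                boost::bind(&connection::on_read, shared_from_this(),
                            boost::asio::placeholders::error,
                            boost::asio::placeholders::bytes_transferred));
        }

    private:
        void on_read(const boost::system::error_code& ec, std::size_t n)
        {
            if (ec) { /* fsm_.process_event(disconnected()); */ return; }

            // Feed the received bytes into the state machine as an event;
            // data_received and connection_fsm are placeholders for your
            // Boost MSM machine and its event types.
            // fsm_.process_event(data_received(buffer_.data(), n));

            start();                   // queue the next asynchronous read
        }

        boost::asio::ip::tcp::socket socket_;
        boost::array<char, 4096> buffer_;
        // connection_fsm fsm_;        // your Boost MSM state machine
    };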

But if your requirement is not scalability, if your thread usage is not going to increase in proportion to the number of connections, or if you do not intend to handle too many connections, then a reactor/proactor/FSM may not help: it does not offer much of an advantage in that case, and such designs can be difficult to debug and are not as straightforward as multi-threaded programs.

14.8.11

Exception guarantees as a design time contract

Reading through GOTW challenges 8 and 59 on exception safety, Tom Cargill's article on exception safety, Herb Sutter's exception-safe stack solution to it, and so on, the conclusion one comes to is that exceptions in C++ necessitate a level of extra caution not only during programming but also during design.

These GOTW challenges demonstrate the different hidden control flows you can stumble upon because of exceptions, and hence the need to watch every implicit function call with a degree of caution. For example, a simple assignment implicitly calls operator=, which might throw an exception. The order in which state variables are changed also matters, so that an exception does not leave the object in an inconsistent state.
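
A toy example of that ordering problem (Holder is just an illustration, not from any of the articles): the member assignment looks like plain data movement but can throw, so the non-throwing state change has to come after it.

    template <typename T>
    class Holder
    {
    public:
        Holder() : count_(0) {}

        void set(const T& value)
        {
            // Wrong order: if we did ++count_ first and value_ = value
            // then threw, count_ would claim a value we never stored.

            // Safer order: do the work that can throw first, then commit
            // the state change that cannot throw.
            value_ = value;        // may throw (calls T::operator=)
            ++count_;              // never throws
        }

    private:
        T   value_;
        int count_;
    };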

But the key thing to note is that the basic guarantee, the strong guarantee and the no-throw guarantee are levels of safety one has to decide on for an interface at design time, because they are a contractual agreement with the client code just like any other functional requirement. Not only that, exception safety can sometimes come at a cost in performance guarantees, so based on the situation it is necessary to decide which way to go.

In a generic component it becomes more challenging, because the exception guarantee of the class depends on the exception guarantees of the types it is templatized on, which are not known at the time the class is written. The STL, for example, provides only the basic guarantee in many places, because user code can wrap around it to provide stronger guarantees if needed. It also makes a few assumptions, such as that the destructor of the type we templatize on should not throw, and that if its assignment operator and copy constructor do not throw then vector::erase will provide a strong guarantee, and so on.

I also found the answer to a question I had had for a long time: why do std::stack and std::queue have a pop() function that doesn't return anything, with a separate top() or front() function to read the element? This, again, is to provide exception safety. If pop() had returned the popped object and the operator= of that object threw an exception, we would have lost that object forever.
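
A hypothetical sketch makes the point; bad_pop below is not a real standard library function, it is what a returning pop() would have to look like.

    #include <stack>
    #include <string>

    template <typename T>
    T bad_pop(std::stack<T>& s)    // what std::stack deliberately avoids
    {
        T value = s.top();         // this copy may throw: stack still intact
        s.pop();                   // the element is gone from the stack
        return value;              // the copy back to the caller may throw:
                                   // the element is now lost forever
    }

    // The standard interface splits the two steps, so the caller copies the
    // element out (top) while the stack still holds it, then removes it (pop):
    void safe_use(std::stack<std::string>& s)
    {
        std::string value = s.top();   // if this throws, the stack is unchanged
        s.pop();                       // removal only after the copy succeeded
        (void)value;                   // ... use value ...
    }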

All Boost APIs explicitly state their exception guarantees in the documentation.


13.8.11

Generic programming & OOAD

Templates and OOAD help you design for genericity in different ways. One cannot replace the other.

The template way is to provide templatized data structures and templatized algorithms, with iterators connecting the two. Based on the iterator category, the algorithm decides at compile time which implementation is most efficient for the data structure and operates on it. It may also decide that a particular algorithm is not suited to a data structure at all.

For example, std::sort accepts two random access iterators and sorts everything between them. If you pass iterators to a range in a linked list you will get a compiler error. The algorithm can also choose the most efficient implementation based on the iterator category. When copying one range to another using std::copy, if you pass pointers to plain data (the most powerful kind of random access iterator), the entire chunk between them is typically copied with a native memmove or some such function; if you pass a less powerful iterator, it iterates over the range and copies element by element. This can vary with the implementation of your STL library, but that's the general idea.
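
A small illustration of the compile-time choice, using std::sort and std::advance (whose complexity requirements the standard does spell out per iterator category):

    #include <algorithm>
    #include <iterator>
    #include <list>
    #include <vector>

    int main()
    {
        std::vector<int> v;
        std::list<int>   l;
        for (int i = 5; i > 0; --i) { v.push_back(i); l.push_back(i); }

        std::sort(v.begin(), v.end());     // fine: random access iterators
        // std::sort(l.begin(), l.end());  // compile error: list iterators
                                           // are only bidirectional
        l.sort();                          // the list provides its own sort

        std::vector<int>::iterator vi = v.begin();
        std::list<int>::iterator   li = l.begin();
        std::advance(vi, 3);   // constant time: plain iterator arithmetic
        std::advance(li, 3);   // linear time: three increments
        return 0;
    }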

If you look at it, the algorithms are generic: they don't know which data structure they are operating on. On the other hand, each data structure did not have to provide these algorithms as member functions. The commonality is lifted out into generic algorithms.

OOAD, on the other hand, is about using a generic interface in the user code, with the implementation detail pushed to run time. OOAD helps where you expect conformance to an interface, which cannot be enforced with the generic programming techniques above. If you want to ensure that every Thread object has a run() function, you write an abstract base class with a pure virtual function called run(), and the implementation detail can then vary with the kind of thread as late as run time.
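
A minimal sketch of that idea, with Thread, WorkerThread and TimerThread as made-up names rather than library types:

    #include <iostream>

    class Thread
    {
    public:
        virtual ~Thread() {}
        virtual void run() = 0;    // every derived class must implement this
    };

    class WorkerThread : public Thread
    {
    public:
        virtual void run() { std::cout << "processing work items\n"; }
    };

    class TimerThread : public Thread
    {
    public:
        virtual void run() { std::cout << "waking up periodically\n"; }
    };

    void start(Thread& t)          // written against the interface only
    {
        t.run();                   // actual behaviour resolved at run time
    }

    int main()
    {
        WorkerThread w;
        TimerThread  t;
        start(w);
        start(t);
        return 0;
    }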

As a general rule, iterators and algorithms are used when you have a collection of data to operate on and it can live in different kinds of data structures. OOAD is a more structural concept, concentrating on how a class hierarchy is designed, what functionality a class must provide, and so on. One of the drawbacks of OOAD is that sometimes you run into a situation where a derived class is a kind of a base class, but you do not want to provide the functionality the interface forces on you.