The test case I wish I had written

The test case I wish I had written

Posted on Posted in blog, technical

Yesterday, while testing our new release candidate, we found a bug where the orchestrator would become very slow. The conditions under which the bug occurred were complicated, hard to reproduce and generally unpleasant to work with.

This bug had us worried, because this kind of performance regression, this late in the release cycle, is not something we take lightly.

Our product is an orchestrator. A tool to coordinate actions over a distributed system. Distributed systems can be hard to work with, because different things happen in different places, so not all information is in one place. Also, not all parts move equally fast, there is a lot of indeterminisms you can’t avoid. Basically, everything that can make programming hard is in there. So this had us worried a bit more.

Issue

When diving into the problem, we found out that some things were happening that should not be happening at all! Whenever the orchestrator deploys something, it optimizes the deployment plan. It sorts deployment steps into subsets: things that are already done, things to do, and things that can’t be done.

It turned out that some things were marked as both `to do` and `already done`. This led to a whole bunch of delicate and complicated effects, leading to a serious performance regression (under very specific circumstances and with some race conditions involved).

The core of the issue was the code that optimized the deployment plan. We knew this code was very important when we made it, so it has a lot of tests to ensure it produces the right output. It even has its own little test framework.

But these tests forgot to test one very trivial thing: the sets of things to do and things already done have nothing in common!

Solution

By adding this assertion to the test cases, we could have prevented this whole last-minute scare.

assert set(pos).isdisjoint(set(neg))

So today I learned (once more): assert your invariants, even if (or particularly when) they are trivial.

(I don’t know if professor Steegmans, who taught me this around 2006, would be happy I still remember his teachings or disappointed it took me until today to fully appreciate his teachings.)

Link to the code