• Frugal Cafe
  • Posts
  • FC09 Self-Inflicted DOS Attack: Dictionary corruption

FC09 Self-Inflicted DOS Attack: Dictionary corruption

Endless looping in dictionary operations

HashSet/Dictionary are very useful data structures and widely used for their O(1) lookup times. But they’re not thread-safe, so if you allow multiple threads to write to them at the same time, their internal data structure would be corrupted, causing data loss, data inconsistency, exceptions, or even endless looping.

When a loop is formed in its internal data structure, any thread touching it could not return, so wasting one CPU core. More threads could join the party, resulting in the worst case of taking all the CPU cores on the machine, making the process and machine totally useless. This is what I call Self-Inflicted DOS attack. In a huge organization, you could find such issue almost every day, or at least every week.

Here is a simple repo to generate dictionary corruption:

Here we’re running the test case in a huge loop. For each test run, the code is running Parallel.ForEach on a stack with 128 elements. As the stack does not implement IList interface, Parallel.ForEach would queue large number of tasks into the thread pool. Each task here is updating the dictionary, and then catch exceptions, for logging.

Here is one exception caught:

This is in the final enumeration phase after the dictionary is corrupted.

Here is another exception:

This one is in dictionary resizing after the dictionary is corrupted already.

When you see no stop high CPU usage, capture an ETW trace. This is what you could see:

13 threads in endless looping in Dictionary.Insert implementation. Here is the source code annotated by hit counts (CPU samples):

The next field in the entries array is forming a loop. Newer version of .Net Core implementation would throw an exception in such case.

Before killing the process, you could take a dump to find the dictionary to verify the corruption:

The next field of the first entry is 0, thus looping back to itself. That is enough to cause endless looping, burning all your CPU cores.

The obvious solution to such problem is replacing Dictionary with ConcurrentDictionary. This is perfectly fine for things like singleton cache, or classes with limited number of instances. If instance count is too high, be mindful of increased memory usage.

Exactly the same thing would happen with HashSet. You can replace it with ConcurrentDictionary.