FC03 Most PerformanceCounter usages are wrong

A single PerformanceCounter read fetches the data for its entire category.

Feng Yuan
August 04, 2023

Most usages of PerformanceCounter are inefficient, because developers do not really understand its implementation and cost.

Here is an allocation stack found in a Lenovo process on a laptop:

21.6% of total allocations are large object heap (LOH) allocations. The biggest contributor is byte arrays at 14.6%, all from reading performance counters.

They come from three delegates registered in the AddinEntry class. Let’s check the source code (as decompiled by ILSpy):

This is the first delegate, which handles Process/% Processor Time. The next one handles Process/Working Set - Private. Both counters are from the same category.

The PerformanceCounter implementation is rather inefficient: every read operation has to read all performance counter instances in the same category. Most of the data is read, parsed, then discarded.

The proper implementation does not use the PerformanceCounter API at all. Instead, use a single preallocated PerformanceCounterCategory, read all its data once, then pick out the instance data you want.
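As a rough illustration of that approach, here is a minimal sketch. The class name ProcessCategoryReader, its members, and the choice to track a single instance are assumptions for illustration and are not taken from the decompiled code. It relies on PerformanceCounterCategory.ReadCategory, which returns every counter and instance of the category in one call, so both values come from the same read.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the decompiled code): reads the whole "Process"
// category once per call and picks out two counters for one instance.
public sealed class ProcessCategoryReader
{
    // Allocated once and reused for every read.
    private readonly PerformanceCounterCategory _category =
        new PerformanceCounterCategory("Process");

    private readonly string _instanceName;

    // Previous CPU sample; rate counters need two samples to produce a value.
    private CounterSample _previousCpu = CounterSample.Empty;

    public ProcessCategoryReader(string instanceName)
    {
        _instanceName = instanceName;
    }

    public bool TryRead(out float cpuPercent, out long privateWorkingSetBytes)
    {
        cpuPercent = 0f;
        privateWorkingSetBytes = 0L;

        // One read returns every counter and every instance in the category.
        InstanceDataCollectionCollection snapshot = _category.ReadCategory();

        if (!snapshot.Contains("% Processor Time") ||
            !snapshot.Contains("Working Set - Private"))
        {
            return false;
        }

        InstanceDataCollection cpu = snapshot["% Processor Time"];
        InstanceDataCollection memory = snapshot["Working Set - Private"];

        // The instance may have exited since the last read.
        if (!cpu.Contains(_instanceName) || !memory.Contains(_instanceName))
        {
            return false;
        }

        CounterSample current = cpu[_instanceName].Sample;
        // The first call has no earlier sample, so its result is not meaningful yet.
        cpuPercent = CounterSample.Calculate(_previousCpu, current);
        _previousCpu = current;

        // "Working Set - Private" is a raw counter, so RawValue is the byte count.
        privateWorkingSetBytes = memory[_instanceName].RawValue;
        return true;
    }
}
```

A caller would construct one reader per instance name (for example "_Total") and call TryRead on a timer. To track several processes, the same snapshot can serve all of them; only the previous CPU sample needs to be kept per instance. This API is Windows-only, and on modern .NET it lives in the System.Diagnostics.PerformanceCounter package.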

On a large machine there can be many processes, hence many performance counter instances and a lot of data (which is why the byte arrays end up in the LOH). Even a single read can be expensive, so you should also reduce the read frequency, for example to once every 15 seconds.
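One way to cap the read frequency is to cache the last result and refresh it only after a minimum interval has passed. The sketch below is an assumption about how that could look, not the original code: ThrottledReader and its parameters are hypothetical names, the clock is injectable so the logic can be exercised without waiting, and the class is not thread-safe as written.

```csharp
using System;

// Hypothetical helper: wraps an expensive read and refreshes the cached
// result at most once per minInterval. Not thread-safe; intended to be
// called from a single thread such as a timer callback.
public sealed class ThrottledReader<T>
{
    private readonly Func<T> _read;
    private readonly Func<DateTime> _utcNow;
    private readonly TimeSpan _minInterval;

    private T _cached;
    private DateTime _lastReadUtc;
    private bool _hasValue;

    public ThrottledReader(Func<T> read, TimeSpan minInterval)
        : this(read, minInterval, () => DateTime.UtcNow)
    {
    }

    public ThrottledReader(Func<T> read, TimeSpan minInterval, Func<DateTime> utcNow)
    {
        if (read == null) throw new ArgumentNullException("read");
        if (utcNow == null) throw new ArgumentNullException("utcNow");
        _read = read;
        _minInterval = minInterval;
        _utcNow = utcNow;
    }

    public T Get()
    {
        DateTime now = _utcNow();

        // Refresh on the first call, or once the interval has elapsed.
        if (!_hasValue || now - _lastReadUtc >= _minInterval)
        {
            _cached = _read();
            _lastReadUtc = now;
            _hasValue = true;
        }

        return _cached;
    }
}
```

With an interval of TimeSpan.FromSeconds(15), callers can ask for the value as often as they like while the expensive category read runs at most four times per minute.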

There is another issue here: there is no need to use ConcurrentDictionary, because the write operations all come from a single thread.

One more issue here:

The code uses a ConcurrentBag, and enumerating it generates a list of strings each time. Both ConcurrentBag and Linq.Enumerable should be avoided. Source code:

TargetExecutables here is a ConcurrentBag. The proper data structure is a ConcurrentDictionary with string keys and the StringComparer.OrdinalIgnoreCase comparer.
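A minimal sketch of that replacement follows. The wrapper class TargetExecutableSet and the byte placeholder value are assumptions for illustration; the original code is not reproduced here. The point is that a membership check becomes a single hash lookup, with no enumeration and no LINQ.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical wrapper: a thread-safe, case-insensitive set of executable
// names built on ConcurrentDictionary. The byte value is an unused placeholder.
public sealed class TargetExecutableSet
{
    private readonly ConcurrentDictionary<string, byte> _names =
        new ConcurrentDictionary<string, byte>(StringComparer.OrdinalIgnoreCase);

    // Returns false if the name (ignoring case) is already present.
    public bool Add(string executableName)
    {
        return _names.TryAdd(executableName, 0);
    }

    // Single hash lookup; does not enumerate or allocate a list.
    public bool Contains(string executableName)
    {
        return _names.ContainsKey(executableName);
    }
}
```

Because the comparer is supplied at construction, callers do not need to normalize case before calling Add or Contains.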