
FC60: FluentValidation, over engineered, not recommended

If you use it in high-volume data processing, it will be your bottleneck

Someone recommended FluentValidation on LinkedIn, so I spent some time looking into its performance. Here is a test case from its GitHub readme file:
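The readme-style model and validator look roughly like this (a sketch reproduced from memory, trimmed to the two rules the test below actually exercises, Surname and Forename; the real readme example has more properties and rules):

```csharp
using FluentValidation;

public class Customer
{
    public string Surname { get; set; }
    public string Forename { get; set; }
}

public class CustomerValidator : AbstractValidator<Customer>
{
    public CustomerValidator()
    {
        // Both rules fail for a default-constructed Customer,
        // since Surname and Forename are null.
        RuleFor(customer => customer.Surname).NotEmpty();
        RuleFor(customer => customer.Forename).NotEmpty();
    }
}
```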

Simple performance test:
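The test is a plain timing loop along these lines (a sketch; the iteration count and measurement details are illustrative, not the exact harness used for the numbers below):

```csharp
using System;
using System.Diagnostics;

public static class Program
{
    public static void Main()
    {
        var validator = new CustomerValidator();
        var customer = new Customer(); // default constructor: two rule failures

        const int iterations = 1_000_000;
        long bytesBefore = GC.GetAllocatedBytesForCurrentThread();
        var watch = Stopwatch.StartNew();

        for (int i = 0; i < iterations; i++)
        {
            var result = validator.Validate(customer);
        }

        watch.Stop();
        long allocated = GC.GetAllocatedBytesForCurrentThread() - bytesBefore;
        Console.WriteLine(
            $"{watch.Elapsed.TotalMilliseconds * 1000.0 / iterations:F2} us/call, " +
            $"{allocated / iterations} bytes/call");
    }
}
```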

Test result:

Validating a single customer record takes 2.2 microseconds, or 2.2 seconds per 1 million customers. Each validation call allocates 4,168 bytes. This is extremely expensive validation. If you use it in high-volume data processing, it will become your performance bottleneck.

Here is the allocation trace:

We see dictionaries, non-reused delegates, strings, Lazy instances, closure objects, regular-expression matching with replacement, a message formatter, StringBuilder, enumerators, etc. It looks over-engineered by a large factor.

Here is the most expensive part:

It’s even using an async implementation with a cancellation token.

Here is regular expression usage:

Here is the regular expression:

This is a compiled regular expression. Not many people are aware that a compiled regular expression is a hidden performance sinkhole under heavy multi-threaded usage. Running a compiled regular expression needs a runner object, which is a cached singleton and not thread-safe. To run it, the implementation rents the runner object, uses it for a while, then returns it. When another thread wants the same runner object while it is rented out, a new instance must be allocated, and that is very expensive. So a heavily used compiled regular expression can be a huge performance bottleneck. The right solution here is to avoid regular expressions and write your own simple parser instead. Since the message strings should be fixed, the parsing result can easily be reused.
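A simple parser of that kind could look like this (a minimal sketch; the `{PlaceholderName}` template syntax is an assumption about the message format, and the names here are illustrative):

```csharp
using System.Collections.Generic;
using System.Text;

public static class SimpleFormatter
{
    // Replace {name} placeholders by a single linear scan,
    // with no regular expression involved.
    public static string Format(string template, IReadOnlyDictionary<string, string> values)
    {
        var sb = new StringBuilder(template.Length + 16);
        int i = 0;
        while (i < template.Length)
        {
            int open = template.IndexOf('{', i);
            if (open < 0) { sb.Append(template, i, template.Length - i); break; }
            int close = template.IndexOf('}', open + 1);
            if (close < 0) { sb.Append(template, i, template.Length - i); break; }

            sb.Append(template, i, open - i);                    // literal text before '{'
            string name = template.Substring(open + 1, close - open - 1);
            if (values.TryGetValue(name, out var replacement))
                sb.Append(replacement);                          // known placeholder
            else
                sb.Append(template, open, close - open + 1);     // keep unknown as-is
            i = close + 1;
        }
        return sb.ToString();
    }
}
```

Because the templates are fixed, the placeholder positions could also be parsed once and cached, making each subsequent format call cheaper still.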

From what I’m seeing so far, there are lots of problems in its implementation because it’s over-engineered overall. To make it usable for high-volume data processing, one would need to peel it like an onion, reducing it to a simple, efficient implementation. The first thing I would remove is the async implementation; at the very least, it should be an optional code path.

If you add your own validation method, it will be orders of magnitude faster:
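For comparison, a hand-written check over the same two properties is trivial (a sketch; error reporting is simplified to a list that is only allocated when something actually fails):

```csharp
using System.Collections.Generic;

public static class CustomerChecker
{
    // Returns null when the customer is valid, so the success path
    // performs zero heap allocations.
    public static List<string> Validate(Customer customer)
    {
        List<string> errors = null;

        if (string.IsNullOrEmpty(customer.Surname))
            (errors ??= new List<string>()).Add("'Surname' must not be empty.");
        if (string.IsNullOrEmpty(customer.Forename))
            (errors ??= new List<string>()).Add("'Forename' must not be empty.");

        return errors;
    }
}
```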

The test case shown above tests the default constructor, with two validation failures, for Surname and Forename. If we assign values to them, here is the test result:

When nothing is wrong, each validation call costs 1.29 microseconds and 2,264 bytes of heap allocation. That is even more absurd: validation with no errors should have zero allocation and be almost free in CPU.

Here are the allocations:

The usage of Lazy is rather expensive and completely unnecessary.

Notice there is even a char enumerator allocation at 2.8%. Here is the source:

Lots of people are eager to talk about how cool pattern matching is in the new C#, but never discuss how it’s implemented. Apparently, it is implemented using interface and class casting here. The LINQ usage here matches the string case, so a string char enumerator is allocated.
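As an illustration of the mechanism (a hypothetical sketch, not FluentValidation’s actual code): matching through interface patterns compiles down to type tests and casts, and running LINQ over a string goes through `IEnumerable<char>`, which allocates a class enumerator object on every call.

```csharp
using System.Linq;

public static class EmptyCheck
{
    // Hypothetical sketch: each arm below is a runtime type test plus
    // a cast, and the string arm allocates an enumerator because LINQ
    // enumerates the string through IEnumerable<char>.
    public static bool IsEmpty(object value) => value switch
    {
        null => true,
        string s => !s.Any(c => !char.IsWhiteSpace(c)), // enumerator allocation here
        System.Collections.ICollection c => c.Count == 0,
        _ => false
    };
}
```

A direct call such as `string.IsNullOrWhiteSpace(s)` would do the same check with no allocation at all.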

Here is the Lazy allocation:

There is a reference to a method argument, so delegate/closure object allocations are needed. It should be easy to get rid of them.
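The pattern and the fix can be sketched like this (names here are illustrative, not FluentValidation’s): a `Lazy<T>` whose factory captures a method argument costs a closure object, a delegate, and the `Lazy<T>` itself on every call, while computing the value directly costs nothing extra.

```csharp
using System;

public static class MessageSource
{
    // Costly pattern: the lambda captures 'errorCode', so every call
    // allocates a closure object, a delegate, and a Lazy<string>.
    public static Lazy<string> GetMessageLazy(string errorCode) =>
        new Lazy<string>(() => Lookup(errorCode));

    // Cheap alternative: just compute (or cache) the value directly.
    public static string GetMessage(string errorCode) => Lookup(errorCode);

    private static string Lookup(string errorCode) =>
        errorCode + ": message text"; // stand-in for the real resource lookup
}
```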