FC01 The 2.5 Copies of Data Problem
Reduce/avoid LOH allocation in JSON serialization
While capturing a trace for an unrelated performance investigation, I noticed Lenovo processes in the list. Are they written in .NET Framework? I opened one up and found some interesting allocation stacks with lots of LOH (large object heap) allocations:
Allocation Stack
AddInHostForwardRequest is responsible for 29.5% of the allocations in the process. Even without reading its source code, we can see that it serializes JSON to a string, then converts the string to a byte array, most likely for sending out. Both the string and the byte array are allocated on the LOH.
Source code (as decompiled by ILSpy):
This is what I call the 2.5 copies of data problem. The first copy of the data is inside the StringBuilder used for serialization, StringBuilder.ToString generates the second copy, and the byte array is the half copy: .NET strings store two bytes per character (UTF-16), while mostly-ASCII data encoded as UTF-8 takes one byte per character, so the byte array is normally half the size.
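To make the arithmetic concrete, here is a minimal sketch that builds an ASCII payload the same way (StringBuilder, then ToString, then UTF-8 bytes) and reports the payload bytes held at each stage. The class name `CopyDemo` and the 100,000-character size are made up for illustration; the numbers count character data only and ignore object headers and spare StringBuilder capacity.

```csharp
using System;
using System.Text;

public static class CopyDemo
{
    // Returns the payload bytes held by the StringBuilder, the string,
    // and the UTF-8 byte array for an ASCII payload of charCount characters.
    public static (long builderBytes, long stringBytes, long utf8Bytes) Measure(int charCount)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < charCount; i++)
        {
            sb.Append((char)('a' + i % 26)); // ASCII only
        }

        // Copy 1: characters inside the StringBuilder (2 bytes each).
        long builderBytes = (long)sb.Length * sizeof(char);

        // Copy 2: ToString allocates a new string of the same size.
        string text = sb.ToString();
        long stringBytes = (long)text.Length * sizeof(char);

        // Half copy: ASCII characters take 1 byte each in UTF-8.
        byte[] utf8 = Encoding.UTF8.GetBytes(text);

        return (builderBytes, stringBytes, utf8.Length);
    }

    public static void Main()
    {
        var (b, s, u) = Measure(100000);
        Console.WriteLine("StringBuilder: " + b + " bytes");
        Console.WriteLine("string:        " + s + " bytes");
        Console.WriteLine("byte[]:        " + u + " bytes");
    }
}
```

For 100,000 ASCII characters this gives 200,000 + 200,000 + 100,000 bytes. Both the string and the byte array are above the 85,000-byte threshold at which .NET places an object on the LOH.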
To reduce the number of data copies and improve performance, the right thing to do is to serialize to a MemoryStream through a StreamWriter using UTF-8 encoding. In the best case, we need just half a copy of the data.
But we do not know the size of the data in advance, so the MemoryStream will most likely need to grow, which generates extra copies. There are a few possible solutions: the first is to reuse the MemoryStream, the second is to use a segmented memory stream. This is a solvable problem.
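As a rough illustration of the first option, the sketch below writes text through a StreamWriter into one shared MemoryStream and resets it between uses, so the buffer grown for an earlier payload is reused for later ones. `ReusableUtf8Buffer` and its `Write` method are hypothetical names, not part of the code under investigation, and the sketch assumes single-threaded use.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReusableUtf8Buffer
{
    // UTF-8 without a byte order mark, so the payload starts with the data itself.
    private static readonly UTF8Encoding Utf8NoBom = new UTF8Encoding(false);

    // Single shared stream; not thread-safe (assumption: one caller at a time).
    private static readonly MemoryStream Shared = new MemoryStream();

    // Runs writeBody against a TextWriter backed by the shared stream and
    // returns the written bytes as a view over the stream's internal buffer.
    public static ArraySegment<byte> Write(Action<TextWriter> writeBody)
    {
        // Reset length and position; the capacity (and buffer) is kept.
        Shared.SetLength(0);
        Shared.Position = 0;

        // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
        using (var writer = new StreamWriter(Shared, Utf8NoBom, 1024, leaveOpen: true))
        {
            writeBody(writer);
        } // Dispose flushes the writer into the stream.

        // GetBuffer exposes the internal array without copying it.
        return new ArraySegment<byte>(Shared.GetBuffer(), 0, (int)Shared.Length);
    }

    public static void Main()
    {
        ArraySegment<byte> first = Write(w => w.Write(new string('x', 200000)));
        Console.WriteLine("first:  " + first.Count + " bytes, capacity " + first.Array.Length);

        ArraySegment<byte> second = Write(w => w.Write("{\"a\":1}"));
        Console.WriteLine("second: " + second.Count + " bytes, capacity " + second.Array.Length);
        Console.WriteLine("same buffer: " + ReferenceEquals(first.Array, second.Array));
    }
}
```

The returned segment is only valid until the next call to `Write`, so the caller has to consume it (for example, send it) before reusing the stream. That is the trade-off for avoiding the extra copy that ToArray would make.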
Original serialization code:
public static string ToJson<T>(this T pThis)
{
    StringWriter stringWriter = new StringWriter();
    new JsonSerializer().Serialize(new JsonTextWriter(stringWriter), pThis);
    return stringWriter.ToString();
}
Here are the steps to fix the problem:
Change MakeRequest and MakeNativeRequest to accept Span<byte>, ArraySegment<byte>, or (byte[], length).
Reuse the MemoryStream; a singleton instance should be enough unless you're on a busy server.
Add a helper method that serializes an object to JSON but writes to the MemoryStream.
Add proper logic to rent the MemoryStream, reset it, call the serialization code, then MakeRequest, then release the MemoryStream. Make sure the MemoryStream is not closed in the process.
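The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual code: the real signatures of MakeRequest and MakeNativeRequest are not shown here, so a delegate taking an ArraySegment<byte> stands in for them, and `JsonRequestSender`, `Rent`, `Release`, `WriteJson`, and `Send` are hypothetical names. It assumes the Newtonsoft.Json package, which the original ToJson method uses.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using Newtonsoft.Json;

public static class JsonRequestSender
{
    private static readonly UTF8Encoding Utf8NoBom = new UTF8Encoding(false);

    // Step 2: a single cached stream. A second concurrent caller simply
    // gets a fresh stream instead of waiting.
    private static MemoryStream s_cached = new MemoryStream();

    private static MemoryStream Rent()
    {
        MemoryStream stream = Interlocked.Exchange(ref s_cached, null) ?? new MemoryStream();
        stream.SetLength(0); // reset content, keep the grown buffer
        stream.Position = 0;
        return stream;
    }

    private static void Release(MemoryStream stream)
    {
        // Put the stream back only if the cache slot is empty.
        Interlocked.CompareExchange(ref s_cached, stream, null);
    }

    // Step 3: serialize straight to UTF-8 bytes in the destination stream,
    // with no intermediate string.
    public static void WriteJson<T>(T value, Stream destination)
    {
        // leaveOpen: true so disposing the writers does not close the stream.
        using (var textWriter = new StreamWriter(destination, Utf8NoBom, 4096, leaveOpen: true))
        using (var jsonWriter = new JsonTextWriter(textWriter))
        {
            new JsonSerializer().Serialize(jsonWriter, value);
        }
    }

    // Step 4: rent, serialize, send, release.
    // makeRequest is a stand-in for MakeRequest (step 1: it accepts a
    // buffer plus length instead of an exactly sized byte[]).
    public static void Send<T>(T request, Action<ArraySegment<byte>> makeRequest)
    {
        MemoryStream stream = Rent();
        try
        {
            WriteJson(request, stream);
            makeRequest(new ArraySegment<byte>(stream.GetBuffer(), 0, (int)stream.Length));
        }
        finally
        {
            Release(stream);
        }
    }

    public static void Main()
    {
        Send(new { Id = 1, Name = "test" }, payload =>
            Console.WriteLine(Encoding.UTF8.GetString(payload.Array, payload.Offset, payload.Count)));
    }
}
```

The callback must finish with the bytes before it returns, because the buffer goes back to the cache right after. If the request is sent asynchronously, the release would have to move to the point where the send completes.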
Suggested fix: