FC52: Visual Studio's Handling of a 700 MB JSON File
With large UI, virtualization is the key
Looking for large performance data on my own machine, I captured a trace of Visual Studio building the RavenDB code base, loaded it into PerfView, then converted it to the speedscope JSON format. The conversion generated a 700 MB JSON file (not an efficient format), and the whole file seems to be on a single line (bad formatting).
Visual Studio offered to open it, so I waited a while for it to load while its memory usage kept climbing. I captured a Visual Studio dump: 25 GB, with a 20 GB managed heap. That should be fun.
Here is the biggest chunk of its memory usage:
71% of the memory is held by the Visual Studio text editor's internal data structures, which are built on top of WPF's SimpleTextLine. There are 59 million SimpleRun objects, holding 59 million integer arrays. The total object count here is 130 million.
This is an inefficient data structure design. Notice that most of the data is held in WPF data structures, which are meant for UI presentation. To represent a large data set efficiently, you can't rely too heavily on another library. WPF was never designed to handle large or complex UI. The original team who designed WPF came from the original IE team building web browsers. To build a large or complex WPF UI, virtualization is the key: convert your own data into its UI representation only when the elements are visible.
Each SimpleRun object is 88 bytes, which is quite large. Here is one of them:
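To make the idea of virtualization concrete, here is a minimal sketch in Python. It is not how Visual Studio or WPF work internally; `VisualRow`, `VirtualizedView`, and the fixed 80-column wrapping are hypothetical names and assumptions chosen for illustration. The compact backing data (one string) is always in memory, while the heavier per-row UI objects exist only for the rows currently in the viewport.

```python
from dataclasses import dataclass


@dataclass
class VisualRow:
    """Hypothetical stand-in for a heavyweight UI object such as a formatted text line."""
    index: int
    text: str


class VirtualizedView:
    """Keeps UI rows only for the visible window over a large text buffer."""

    def __init__(self, text, columns=80, viewport_rows=40):
        self.text = text                  # compact backing data
        self.columns = columns            # assumed fixed wrap width
        self.viewport_rows = viewport_rows
        self.realized = {}                # row index -> VisualRow

    @property
    def row_count(self):
        # Number of wrapped rows needed to show the whole buffer.
        return (len(self.text) + self.columns - 1) // self.columns

    def scroll_to(self, first_row):
        # Clamp the requested window to the document.
        first_row = max(0, min(first_row, self.row_count))
        last_row = min(first_row + self.viewport_rows, self.row_count)
        visible = range(first_row, last_row)

        # Release rows that scrolled out of view.
        self.realized = {i: r for i, r in self.realized.items() if i in visible}

        # Create UI rows only for what is visible now.
        for i in visible:
            if i not in self.realized:
                start = i * self.columns
                self.realized[i] = VisualRow(i, self.text[start:start + self.columns])
        return [self.realized[i] for i in visible]


# One long line, like the JSON file in this post (much smaller here).
view = VirtualizedView("x" * 1_000_000)
rows = view.scroll_to(5000)
print(view.row_count, len(rows), len(view.realized))
```

However long the buffer is, the number of realized rows never exceeds the viewport size, so the UI-side memory stays constant.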
Here is the integer array:
It seems to contain the same small integer repeated: 0x860. These integers seem to be character widths. Since a text editor almost always uses a monospace font, every character has the same width.
A data structure designed specifically for this purpose would be much smaller.
Here are the object statistics (generated from my modified version of PerfView):
The total object count is 130 million. There are 88 huge char arrays with an average size of 16 MB.
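As one illustration of how much smaller this could be, here is a minimal sketch that run-length encodes the widths. This is my own example, not the editor's actual layout; `run_length_encode` and `width_at` are hypothetical helpers. With a monospace font, a whole array of identical widths collapses into a single (value, count) pair.

```python
def run_length_encode(widths):
    """Collapse consecutive equal widths into [value, count] pairs."""
    runs = []
    for w in widths:
        if runs and runs[-1][0] == w:
            runs[-1][1] += 1
        else:
            runs.append([w, 1])
    return runs


def width_at(runs, index):
    """Look up the width of the character at `index` in the encoded form."""
    for value, count in runs:
        if index < count:
            return value
        index -= count
    raise IndexError("character index out of range")


# 1000 characters that all share the width seen in the dump.
widths = [0x860] * 1000
runs = run_length_encode(widths)
print(runs)  # [[2144, 1000]]
```

The lookup is linear in the number of runs, which is fine when there are only a handful of runs per line; a real editor would likely want an index on top of this.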
The input file size is 748,960,660 bytes and the managed heap size is 22,046,668,268 bytes. The file is UTF-8 encoded and should be all ASCII, so one byte is one character. That means the Visual Studio editor's data structures need 29.44 bytes per character. That is quite high overhead.
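The arithmetic behind these numbers can be checked with a few lines of Python. The file and heap sizes are the exact figures above; the SimpleRun count is the rounded 59 million from earlier in the post, so that total is only an approximation.

```python
input_bytes = 748_960_660       # size of the JSON file on disk
heap_bytes = 22_046_668_268     # managed heap size in the dump

# All-ASCII UTF-8 means one byte per character.
bytes_per_char = heap_bytes / input_bytes
print(f"{bytes_per_char:.2f} bytes per character")  # 29.44 bytes per character

# Approximate cost of the SimpleRun objects alone (count is rounded).
simple_run_count = 59_000_000
simple_run_size = 88
simple_run_bytes = simple_run_count * simple_run_size
print(f"{simple_run_bytes / 1e9:.2f} GB in SimpleRun objects")  # 5.19 GB in SimpleRun objects
```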