FC71: Replacing BinaryReader.ReadString

String interning from raw data

Now that we have a great string interner implementation, we can replace BinaryReader.ReadString, which has the following performance issues:

  1. It always allocates a new string unless the string is empty.

  2. It copies the data into a char array before creating the string.

  3. If the string is long enough (over 128 bytes), a StringBuilder is used to accumulate the char data.

  4. The StringBuilder is borrowed from StringBuilderCache, which throws away any builder with a capacity over 360 characters.
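The first issue is easy to observe. Here is a small check, written for illustration only (the class and method names are made up): it writes the same value twice, reads it back twice, and reports whether both reads returned the same object.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadStringAllocationDemo
{
    // Writes the same value twice, reads it back twice with the stock
    // BinaryReader, and reports whether both reads share one instance.
    public static bool ReadsShareInstance(string value)
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(value);
            writer.Write(value);
        }

        stream.Position = 0;
        using (var reader = new BinaryReader(stream, Encoding.UTF8))
        {
            string first = reader.ReadString();
            string second = reader.ReadString();
            return ReferenceEquals(first, second);
        }
    }
}
```

For a non-empty value such as "hello" this returns false: every read produces a fresh string. Only the empty string is shared, because ReadString returns string.Empty for a zero length.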

Replacing BinaryReader.ReadString is quite easy; you just need to add a new class deriving from BinaryReader:
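The original listing is not reproduced here, so what follows is only a minimal sketch of what such a class could look like. The name InterningBinaryReader and the two encoding flags are assumptions made for illustration; the point is to remember the encoding up front so that ReadString can later pick a fast path.

```csharp
using System.IO;
using System.Text;

// Hypothetical name; a sketch, not the original class.
public class InterningBinaryReader : BinaryReader
{
    // Code page 65001 is UTF-8, and 1200 is little-endian UTF-16,
    // which .NET calls "Unicode".
    private const int Utf8CodePage = 65001;
    private const int Utf16CodePage = 1200;

    public InterningBinaryReader(Stream input, Encoding encoding, bool leaveOpen = false)
        : base(input, encoding, leaveOpen)
    {
        IsUtf8 = encoding.CodePage == Utf8CodePage;
        IsUnicode = encoding.CodePage == Utf16CodePage;
    }

    // True when strings in the stream are UTF-8 encoded.
    public bool IsUtf8 { get; }

    // True when strings in the stream are UTF-16 (little-endian) encoded.
    public bool IsUnicode { get; }
}
```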

Now we can reimplement ReadString:
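As a rough sketch of the idea (not the original code), an override could look like this. The class name and the intern delegate are assumptions; the delegate stands in for the StringTable call and receives the raw bytes in the reader's encoding together with their length.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Text;

// Hypothetical sketch; "intern" stands in for the StringTable lookup.
public class InterningBinaryReader : BinaryReader
{
    private readonly Encoding _encoding;
    private readonly Func<byte[], int, string> _intern;
    private readonly bool _canIntern;

    public InterningBinaryReader(Stream input, Encoding encoding, Func<byte[], int, string> intern)
        : base(input, encoding, leaveOpen: false)
    {
        _encoding = encoding;
        _intern = intern;
        // 65001 = UTF-8, 1200 = UTF-16 little-endian ("Unicode").
        _canIntern = encoding.CodePage == 65001 || encoding.CodePage == 1200;
    }

    public override string ReadString()
    {
        // The length prefix is a 7-bit encoded byte count.
        int byteCount = Read7BitEncodedInt();
        if (byteCount < 0) throw new IOException("Invalid string length.");
        if (byteCount == 0) return string.Empty;

        // Rent the byte buffer; it may be larger than requested.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(byteCount);
        try
        {
            int read = 0;
            while (read < byteCount)
            {
                int n = Read(buffer, read, byteCount - read);
                if (n == 0) throw new EndOfStreamException();
                read += n;
            }

            // Intern straight from the bytes when possible; otherwise
            // fall back to a normal decode.
            return _canIntern
                ? _intern(buffer, byteCount)
                : _encoding.GetString(buffer, 0, byteCount);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

Because the rented array can be longer than the string, the byte count has to travel with the buffer everywhere it is passed.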

The byte buffer used for reading is rented from the array pool; no char buffer is needed. If the encoding is either UTF-8 or Unicode, we use StringTable to intern directly from the raw data. Here is the InternUtf8 implementation:
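The StringTable code itself is not reproduced here, so the sketch below uses a much simpler stand-in to show the technique: hash the raw bytes, compare them against strings already in the table, and only allocate on a miss. The class name, the dictionary of buckets, and the FNV-1a hash are all assumptions for illustration. What happens to non-ASCII input in the original is not stated; this sketch just decodes it normally without interning.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for StringTable; not thread-safe.
public sealed class AsciiStringTable
{
    private readonly Dictionary<int, List<string>> _buckets = new Dictionary<int, List<string>>();

    // Returns a shared string for ASCII-only data of the given length.
    public string InternUtf8(byte[] data, int length)
    {
        if (length == 0) return string.Empty;

        // FNV-1a hash over the raw bytes, checking for non-ASCII as we go.
        uint hash = 2166136261u;
        for (int i = 0; i < length; i++)
        {
            byte b = data[i];
            if (b >= 0x80)
            {
                // Assumption: non-ASCII data is decoded without interning.
                return Encoding.UTF8.GetString(data, 0, length);
            }
            unchecked { hash = (hash ^ b) * 16777619u; }
        }

        int key = unchecked((int)hash);
        if (!_buckets.TryGetValue(key, out List<string> candidates))
        {
            candidates = new List<string>(1);
            _buckets.Add(key, candidates);
        }

        // Compare bytes against existing strings; no temporary string needed.
        foreach (string candidate in candidates)
        {
            if (Matches(candidate, data, length)) return candidate;
        }

        // Miss: this is the only place a new string is allocated.
        string created = Encoding.ASCII.GetString(data, 0, length);
        candidates.Add(created);
        return created;
    }

    // For ASCII, each byte maps to exactly one char with the same value.
    private static bool Matches(string s, byte[] data, int length)
    {
        if (s.Length != length) return false;
        for (int i = 0; i < length; i++)
        {
            if (s[i] != (char)data[i]) return false;
        }
        return true;
    }
}
```

Because ASCII bytes and chars have the same values, the comparison works directly on the byte buffer, which is what makes the char buffer unnecessary.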

For simplicity, only ASCII strings are supported, but they should be the most common case in heavy data processing.