If you’re a seasoned .NET library author, there is a good chance that you’ve had to write a component that acts on arbitrary user-defined types. This includes serializers, structured loggers, mappers, deep cloners, validators, parsers, random value generators, equality comparers, and many more. Such components typically focus on the data exposed by user types (properties and fields for POCOs or elements for collection types). They are generic, but not in the sense of a typical generic method you might find in a library like, say, LINQ. Rather, they need to reflect on the transitive structure of the specified type and extract a workable schema that corresponds to its declared shape.
Historically, such components were only possible in .NET using runtime reflection. However, the recent advent of Native AOT as a significant alternative to the CLR has made that approach impractical, if not obsolete. Library authors looking to provide Native AOT support must author an equivalent source generator that replaces or complements runtime reflection with compile-time code generation. In practice this requires maintaining two separate implementations that
- Cover all facets of the type system and its emergent combinations,
- Remain mutually compatible for most use cases and
- Are evolved in tandem with new language features.
Anybody who has spent time in this space will know that this is a seriously challenging problem, with 100% correctness being exceedingly difficult to achieve, verify, or preserve. For example, authors need to think of concerns like:
- Identifying whether a type is an object, collection, enum, or something different.
- Resolving the properties or fields that form the data contract of an object.
- Identifying the appropriate construction strategy for a type, if available.
- Addressing inheritance concerns, including virtual properties and diamond ambiguity.
- Supporting special types like nullable structs, enums, tuples or multi-dimensional arrays.
- Handling recursive types like linked lists or trees.
- Handling accessibility modifiers or special keywords like required or init.
- Recognizing nullable reference type annotations.
- Supporting non-generic collection types, if appropriate.
- Identifying potentially unsupported types like ref structs, delegates or pointers.
This is far from an exhaustive list, but it serves to highlight the fact that such components are expensive to build and maintain. It also means that most libraries reinvent the same schema extraction logic, all with small quirks and divergences in their semantics.
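To make a few of these concerns concrete, here is a minimal reflection-based sketch (not taken from any particular library) that walks the public properties of a type and reports whether each one is required, init-only, or annotated as nullable. `ContractInspector` and `Sample` are hypothetical names introduced for illustration. The sketch assumes .NET 7 or later and deliberately ignores fields, inheritance, constructors, and collections.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

// Print one line per public instance property of the sample type.
foreach (string line in ContractInspector.Describe(typeof(Sample)))
{
    Console.WriteLine(line);
}

// Hypothetical helper, for illustration only.
public static class ContractInspector
{
    public static IEnumerable<string> Describe(Type type)
    {
        var nullability = new NullabilityInfoContext();
        foreach (PropertyInfo property in
                 type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Indexers are not treated as part of the data contract here.
            if (property.GetIndexParameters().Length > 0)
            {
                continue;
            }

            // The compiler marks 'required' members with RequiredMemberAttribute.
            bool isRequired = property.IsDefined(typeof(RequiredMemberAttribute), inherit: false);

            // 'init' accessors carry an IsExternalInit modifier on their return parameter.
            bool isInitOnly = property.SetMethod is { } setter &&
                setter.ReturnParameter.GetRequiredCustomModifiers()
                      .Contains(typeof(IsExternalInit));

            // Nullable reference type annotations are decoded by NullabilityInfoContext.
            bool isNullable =
                nullability.Create(property).ReadState == NullabilityState.Nullable;

            yield return $"{property.Name}: {property.PropertyType.Name} " +
                         $"required={isRequired} initOnly={isInitOnly} nullable={isNullable}";
        }
    }
}

// Hypothetical sample type exercising the three checks.
public sealed class Sample
{
    public required string Name { get; init; }
    public string? Nickname { get; set; }
    public int Age { get; set; }
}
```

Even this small fragment leans on three unrelated reflection mechanisms, and a source generator would have to reach the same conclusions from compile-time symbols instead.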
This article will try to show that all this boilerplate is not strictly necessary when building production quality, high performance libraries; it can in fact be delegated to generic programming frameworks that let library authors focus on implementing the core traversal algorithm that matters.
Practical Generic Programming
PolyType is a generic programming library for C# that is a direct spin-off from a similar library for F#. It exposes a new design pattern that lets users write functionally complete, high-performance libraries in just a few hundred lines of code. The built-in source generator ensures that any library built on top of PolyType gets Native AOT support out of the box.
The library is available to download on NuGet:
$ dotnet add package PolyType
which includes the core abstractions and the source generator. It can be used to generate shape metadata for user-defined types:
using PolyType;
[GenerateShape]
public partial record Person(string Name, int Age);
This augments Person with an explicit implementation of the IShapeable<Person> interface, which suffices to make Person usable with any library built on top of the PolyType core abstractions. You can test this out by installing the built-in example libraries:
$ dotnet add package PolyType.Examples
which includes a few format-specific serializers targeting IShapeable<T> types:
using PolyType.Examples.JsonSerializer;
using PolyType.Examples.CborSerializer;
using PolyType.Examples.XmlSerializer;
Person person = new("Pete", 70);
JsonSerializerTS.Serialize(person); // {"Name":"Pete","Age":70}
XmlSerializer.Serialize(person); // <value><Name>Pete</Name><Age>70</Age></value>
CborSerializer.EncodeToHex(person); // A2644E616D656450657465634167651846
Because the metadata for Person is supplied by the source generator, the above is fully compatible with Native AOT.
Here is an example that generates and pretty-prints an infinite stream of random values for a given type:
using PolyType.Examples.RandomGenerator;
using PolyType.Examples.PrettyPrinter;
foreach (Person p in RandomGenerator.GenerateValues<Person>())
{
    Console.WriteLine(PrettyPrinter.Print(p));
    await Task.Delay(2000);
}
The examples project includes many other libraries, including a deep cloner, a structural equality comparer, an IConfiguration binder, a schema generator, and a validation library. See the samples folder for more in-depth demos using these components.
Case Study: Writing a JSON serializer
The examples project implements a JSON serializer built on top of the Utf8JsonWriter and Utf8JsonReader primitives provided by System.Text.Json. It provides basic serialization functionality, including contract customization via attribute declaration while supporting more types compared to STJ.
At the time of writing, the full implementation is just under 1,400 lines of code, of which 1,000 are converter definitions and only 200 are PolyType-specific. Nevertheless, this serializer runs nearly twice as fast as System.Text.Json’s built-in JsonSerializer in both its reflection and default source-generated modes (only the STJ source-generated fast path serializes more quickly), as can be seen in the following benchmarks:
| Method | Mean | Ratio | Allocated | Alloc Ratio |
|---|---|---|---|---|
| Serialize_StjReflection | 491.9 ns | 1.00 | 312 B | 1.00 |
| Serialize_StjSourceGen | 467.0 ns | 0.95 | 312 B | 1.00 |
| Serialize_StjSourceGen_FastPath | 227.2 ns | 0.46 | – | 0.00 |
| Serialize_PolyTypeReflection | 277.9 ns | 0.57 | – | 0.00 |
| Serialize_PolyTypeSourceGen | 273.6 ns | 0.56 | – | 0.00 |

| Method | Mean | Ratio | Allocated | Alloc Ratio |
|---|---|---|---|---|
| Deserialize_StjReflection | 1,593.0 ns | 1.00 | 1024 B | 1.00 |
| Deserialize_StjSourceGen | 1,530.3 ns | 0.96 | 1000 B | 0.98 |
| Deserialize_PolyTypeReflection | 773.1 ns | 0.49 | 440 B | 0.43 |
| Deserialize_PolyTypeSourceGen | 746.7 ns | 0.47 | 440 B | 0.43 |
This is primarily attributable to the zero-allocation nature of object graph traversal, as well as the reduced indirection and branching afforded by the converter folding approach used by PolyType. (Disclosure: I’m also the current maintainer of System.Text.Json on the .NET team.)
Writing my first Generic Program
Having gone through some examples showing the capabilities of the library, we will now switch our focus to writing a generic program ourselves, targeting the core abstractions of PolyType. Let’s consider a simple method that counts all non-null strings present in an object graph. For example, given the input
Person graph = new("Alice", 42, ["Bob", null, "Eve"]);
record Person(string Name, int Age, string?[] Friends);
such a counting function should return 3. Following PolyType convention, this method would have the following signature:
public static int CountStrings<T>(T? graph)
where T : IShapeable<T>;
Let’s begin by looking at the signature of the IShapeable<T> interface:
public interface IShapeable<T>
{
    static abstract ITypeShape<T> GetShape();
}
It defines a static abstract factory for ITypeShape<T>, which is the core abstraction introduced by the library. In a nutshell, it denotes a reflection-like representation of a given .NET type that enables strongly typed traversal of its type graph. This is made possible through the use of generic visitors, a technique originally described in this paper.
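The generic visitor idea can be sketched independently of PolyType. In the simplified illustration below, a type-erased token hands its hidden type argument to a visitor through a generic method, so the visitor can build a strongly typed delegate without any runtime reflection. `ITypeToken`, `ITypeTokenVisitor`, `TypeToken<T>` and `EqualityBuilder` are hypothetical types invented for this sketch and are not part of the library.

```csharp
using System;
using System.Collections.Generic;

// The caller only holds a type-erased token...
ITypeToken token = new TypeToken<int>();

// ...yet receives a strongly typed delegate, built where T was statically known.
var equals = (Func<int, int, bool>)token.Accept(new EqualityBuilder())!;
Console.WriteLine(equals(1, 1)); // True
Console.WriteLine(equals(1, 2)); // False

// Hypothetical type-erased representation of some type T.
public interface ITypeToken
{
    object? Accept(ITypeTokenVisitor visitor);
}

// Hypothetical visitor: the generic method "unpacks" the hidden T.
public interface ITypeTokenVisitor
{
    object? Visit<T>(TypeToken<T> token);
}

public sealed class TypeToken<T> : ITypeToken
{
    // Double dispatch: the token knows T and passes it to the visitor.
    public object? Accept(ITypeTokenVisitor visitor) => visitor.Visit(this);
}

// A visitor that produces an equality delegate for whatever T it is given.
public sealed class EqualityBuilder : ITypeTokenVisitor
{
    public object? Visit<T>(TypeToken<T> token) =>
        new Func<T, T, bool>((x, y) => EqualityComparer<T>.Default.Equals(x, y));
}
```

The result travels back as `object?` and is cast at the call site, which is the same pattern the counter example in this section relies on.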
A library targeting PolyType, then, is a component mapping ITypeShape<T> instances to programs acting on inputs of type T. For the purposes of our counter example, we will use it to generate delegates of type Func<T?, int>. We begin by creating a shape visitor class:
private sealed class CounterBuilder : ITypeShapeVisitor
{
    public object? VisitObject<T>(
        IObjectTypeShape<T> objectShape,
        object? state = null)
    {
        if (typeof(T) == typeof(string))
        {
            // Type is string, return a delegate that counts it.
            return new Func<string?, int>(
                str => str is null ? 0 : 1);
        }
        // Type is not a string
        return new Func<T?, int>(_ => 0);
    }
}
We first implement VisitObject which, as the name suggests, handles object-like types. This is a catch-all kind that includes primitives, POCOs and structs but not collection types or enums. We start simple for now: string is our base case and gets a real counter, while every other type gets a delegate that always returns a count of zero. We can now use the visitor to define our public API:
public static int CountStrings<T>(T? graph)
    where T : IShapeable<T>
{
    ITypeShape<T> shape = T.GetShape();
    CounterBuilder visitor = new();
    var counter = (Func<T?, int>)shape.Accept(visitor)!;
    return counter(graph);
}
This gives our program basic structure, but we’re far from done yet. The current implementation will always return zero for non-string inputs because we’re not recursing through the type graph. To achieve this, we need to start traversing through properties. We can do this by updating the VisitObject method:
public object? VisitObject<T>(
    IObjectTypeShape<T> objectShape,
    object? state = null)
{
    if (typeof(T) == typeof(string))
    {
        // Type is string, return a delegate that counts it.
        return new Func<string?, int>(
            str => str is null ? 0 : 1);
    }
    if (!objectShape.HasProperties)
    {
        // Type is trivial, produce a delegate that returns 0.
        return new Func<T?, int>(_ => 0);
    }
    // Construct counters for each property getter.
    Func<T, int>[] propertyCounters = objectShape.GetProperties()
        .Where(property => property.HasGetter)
        .Select(property => (Func<T, int>)property.Accept(this)!)
        .ToArray();
    // Fold into a single delegate that counts all properties.
    return new Func<T?, int>(graph =>
    {
        if (graph is null)
        {
            return 0;
        }
        int count = 0;
        foreach (var propCounter in propertyCounters)
        {
            count += propCounter(graph);
        }
        return count;
    });
}
This updates the method so that counter delegates for each available property getter are recursively computed, then folds them all into a single delegate that aggregates the total count. We’re not done yet though, as we now need to implement the visitor for property shapes:
public object? VisitProperty<T, TProperty>(
    IPropertyShape<T, TProperty> propertyShape,
    object? state = null)
{
    // Extract a getter delegate for the property.
    Func<T, TProperty> getter = propertyShape.GetGetter();
    // Recursively compute the counter for the property type.
    var propertyTypeCounter = (Func<TProperty?, int>)
        propertyShape.PropertyType.Accept(this)!;
    // Fold into a single delegate that gets the
    // property value and then counts it.
    return new Func<T, int>(graph =>
        propertyTypeCounter(getter(graph)));
}
This should conclude support for POCOs. Finally, let us turn our attention to collection types. We implement the VisitEnumerable method like so:
public object? VisitEnumerable<T, TElement>(
    IEnumerableTypeShape<T, TElement> enumerableShape,
    object? state = null)
{
    // Extract a getter delegate for the enumerable.
    Func<T, IEnumerable<TElement>> getEnumerable =
        enumerableShape.GetGetEnumerable();
    // Recursively compute the counter for the element type.
    var elementCounter = (Func<TElement, int>)
        enumerableShape.ElementType.Accept(this)!;
    // Fold into a single delegate that counts all elements.
    return new Func<T?, int>(enumerable =>
    {
        if (enumerable is null)
        {
            return 0;
        }
        int count = 0;
        foreach (var element in getEnumerable(enumerable))
        {
            count += elementCounter(element);
        }
        return count;
    });
}
And we’re done! The above suffices to define a strongly typed, zero-allocation implementation of a generic string counter that works with most .NET types. The full implementation, which also adds support for dictionary types and enums, is 153 lines of code and can be found in this gist.
To recap, PolyType exposes a set of reflection-like abstractions that offer a simplified and strongly typed view of the data contract surfaced by .NET types. Under this model, types are split into five separate kinds:
- Objects which contain zero or more properties.
- Enumerables that define an element type.
- Dictionaries that define a key type and a value type.
- Nullable<T> types.
- Enum types.
For more information on the PolyType abstractions and programming model, please refer to the Core Abstractions document on the project website.
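As a rough illustration of this five-way split, a naive reflection-based classifier might look like the following. `ShapeKind` and `ShapeKinds.Classify` are hypothetical names, and the rules are simplifying assumptions rather than PolyType's actual resolution logic. The one detail taken from the walkthrough above is that string is treated as an object, since VisitObject is where the counter handles it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

Console.WriteLine(ShapeKinds.Classify(typeof(string)));                  // Object
Console.WriteLine(ShapeKinds.Classify(typeof(int[])));                   // Enumerable
Console.WriteLine(ShapeKinds.Classify(typeof(Dictionary<string, int>))); // Dictionary
Console.WriteLine(ShapeKinds.Classify(typeof(int?)));                    // Nullable
Console.WriteLine(ShapeKinds.Classify(typeof(DayOfWeek)));               // Enum

// Hypothetical enum mirroring the five kinds listed above.
public enum ShapeKind { Object, Enumerable, Dictionary, Nullable, Enum }

public static class ShapeKinds
{
    public static ShapeKind Classify(Type type)
    {
        if (type.IsEnum) return ShapeKind.Enum;
        if (Nullable.GetUnderlyingType(type) is not null) return ShapeKind.Nullable;

        // string implements IEnumerable<char> but is handled as an object.
        if (type == typeof(string)) return ShapeKind.Object;

        // Dictionaries are checked first because they are also enumerable.
        if (Implements(type, typeof(IDictionary<,>)) ||
            Implements(type, typeof(IReadOnlyDictionary<,>)))
        {
            return ShapeKind.Dictionary;
        }

        if (Implements(type, typeof(IEnumerable<>))) return ShapeKind.Enumerable;

        // Everything else, including primitives and POCOs, is an object.
        return ShapeKind.Object;
    }

    // True if the type is, or implements, a closed form of the open generic interface.
    private static bool Implements(Type type, Type openGeneric) =>
        type.GetInterfaces().Append(type)
            .Any(t => t.IsGenericType && t.GetGenericTypeDefinition() == openGeneric);
}
```

A real classifier has many more cases to consider, such as non-generic collections and unsupported types like ref structs or pointers, which is part of why sharing this logic is attractive.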
I’d still prefer to write my own generator
Despite the benefits offered by the PolyType abstractions, they will not always be appropriate for your library. This could happen because:
- Your library needs to generate additional source code that goes beyond type graph traversal.
- Code built on the abstractions does not perform as fast as fully inlined generated methods.
- There are concerns around the static footprint introduced by the generated metadata.
For such use cases, the project additionally ships the PolyType.Roslyn NuGet package which exposes compile-time equivalents of the core abstractions. It can be used to map Roslyn type symbols to a simplified model of their data contract suitable for source generators performing data access. The built-in source generator itself relies on this library to extract its generated shape models.
Final Remarks
PolyType is a project that aspires to lower the barrier to entry for writing high-performance, feature-complete and AOT-compatible .NET libraries. Even though it is still in pre-release, the current design is stable enough that a 1.0 release seems feasible within the next few months. If you’re a library author, I would encourage you to try it out against your components and share any feedback on your experience.
Beyond the narrow scope of this project though, my hope is that this and similar ideas (👋 Serde.NET) will help spark a wider conversation on the future of writing .NET libraries.
