A thorough analysis of the improvements to ref and struct in C# 11

Preface

A heavyweight feature is coming in C# 11 that will delight developers who care about performance. It centers on a series of improvements to `ref` and `struct`, two important low-level performance facilities.

However, this set of improvements is large and may not be finished in time for .NET 7 (C# 11), so some of it may be postponed to C# 12. Still, there is good reason to hope that it will arrive complete in C# 11.

This article covers only this feature. C# 11 brings many other improvements as well, too many to cover in one article, so let's leave those until .NET 7 is officially released.

Background

C# 7.2 introduced `ref struct` to represent stack-only objects that cannot be boxed. At the time it was very limited: it could not even be used in generic constraints or as a field of a `struct`. In C# 11, driven by the `ref` fields feature, which requires letting a type hold a reference to another value type, this area has finally made great progress.

These facilities are designed to let developers write high-performance code in safe code, without resorting to unsafe pointers. Next I will introduce the improvements coming in this area in C# 11, or possibly C# 12.

ref field

Previously, a C# type could not hold a reference to another value type, but C# 11 makes this possible. Starting with C# 11, a `ref struct` is allowed to declare `ref` fields.

readonly ref struct Span<T>
{
    private readonly ref T _field;
    private readonly int _length;
    public Span(ref T value)
    {
        _field = ref value;
        _length = 1;
    }
}

Intuitively, this feature lets us write the code above, which constructs a `Span<T>` that holds a reference to another object of type `T`.

Of course, a `ref struct` can be initialized with `default`:

Span<int> span = default;

In that case `_field` is a null reference. We can, however, check for this with a method:

if ((ref _field))
{
    throw new NullReferenceException(...);
}
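One existing API that can perform such a check is `Unsafe.IsNullRef` in `System.Runtime.CompilerServices`. Assuming that is the method intended here, a minimal sketch might look like the following. `Holder<T>` and `NullRefDemo` are hypothetical names introduced only for illustration (the type is not called `Span<T>` so that it does not clash with `System.Span<T>`), and the sketch assumes a C# 11 compiler targeting .NET 7 or later.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical single-reference holder, used only for this illustration.
public readonly ref struct Holder<T>
{
    private readonly ref T _field;

    public Holder(ref T value)
    {
        _field = ref value;
    }

    // True when the struct was created with `default`, so _field refers to nothing.
    public bool IsEmpty => Unsafe.IsNullRef(ref _field);
}

public static class NullRefDemo
{
    public static bool DefaultIsEmpty()
    {
        Holder<int> holder = default;
        return holder.IsEmpty;
    }

    public static bool InitializedIsEmpty()
    {
        int value = 42;
        var holder = new Holder<int>(ref value);
        return holder.IsEmpty;
    }

    public static void Main()
    {
        Console.WriteLine(DefaultIsEmpty());     // True
        Console.WriteLine(InitializedIsEmpty()); // False
    }
}
```

Checking before dereferencing matters because reading through a null `ref` field fails at run time in the same way as dereferencing a null pointer.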

In addition, the mutability of `ref` fields is also important, so there are three variants:

`readonly ref`: a read-only reference to a value; the reference itself cannot be reassigned outside a constructor or `init` method
`ref readonly`: a reference to a read-only value; the value it refers to cannot be modified through the reference
`readonly ref readonly`: a read-only reference to a read-only value, combining the two above

For example:

ref struct Foo
{
    ref readonly int f1;
    readonly ref int f2;
    readonly ref readonly int f3;

    void Bar(int[] array)
    {
        f1 = ref array[0];  // no problem
        f1 = array[0];      // Error, because the value referenced by f1 cannot be modified
        f2 = ref array[0];  // Error, because f2 itself cannot be modified
        f2 = array[0];      // no problem
        f3 = ref array[0];  // Error, because f3 itself cannot be modified
        f3 = array[0];      // Error, because the value referenced by f3 cannot be modified
    }
}

Lifetime

All of this looks great, but is it really problem-free?

Suppose we use the type above in the following code:

Span<int> Foo()
{
    int v = 42;
    return new Span<int>(ref v);
}

`v` is a local variable whose lifetime ends when the function returns, so the reference to `v` held by the returned `Span<int>` becomes invalid. Incidentally, before C# 11 code of this shape was completely legal: C# did not support `ref` fields, so the returned value could not capture the reference and no escape problem was possible. With `ref` fields added in C# 11, however, an object on the stack may escape by reference through a `ref` field, and the code becomes unsafe.

Suppose we have a `CreateSpan` method that creates a `Span` from a reference:

Span<int> CreateSpan(ref int v)
{
     // ...
}

This gives rise to a series of patterns that were fine in earlier versions of C# (because the lifetime of a `ref` was limited to the current method) but that in C# 11, where a `ref` field may capture the reference, become unsafe code written in safe syntax:

Span<int> Foo(int v)
{
    // 1
    return CreateSpan(ref v);
    // 2
    int local = 42;
    return CreateSpan(ref local);
    // 3
    Span<int> span = stackalloc int[42];
    return CreateSpan(ref span[0]);
}

Therefore, C# 11 has to introduce a breaking change and reject the code above at compile time. But that alone does not completely solve the problem.

To solve the escape problem, C# 11 defines rules for the ref-safe-to-escape scope, that is, the scope to which a reference may safely escape. For a field `f` of an expression `e`:

1. If `f` is a `ref` field and `e` is `this`, then `f` is ref-safe-to-escape from the enclosing method
2. Otherwise, if `f` is a `ref` field, the ref-safe-to-escape scope of `f` is the same as the safe-to-escape scope of `e`
3. Otherwise, if `e` is of a reference type, the ref-safe-to-escape scope of `f` is the calling method
4. Otherwise, the ref-safe-to-escape scope of `f` is the same as that of `e`

Because methods in C# can return references, the rules above mean that a method in a `ref struct` cannot return a reference to a non-`ref` field:

ref struct Foo
{
    private ref int _f1;
    private int _f2;

    public ref int P1 => ref _f1; // no problem
    public ref int P2 => ref _f2; // Error, because it violates rule 4
}

Besides the ref-safe-to-escape rules, there are also rules for `ref` assignment:

1. For `x.e1 = ref e2`, where `x` is safe-to-escape to the calling method, `e2` must be ref-safe-to-escape to the calling method
2. For `e1 = ref e2`, where `e1` is a local variable, the ref-safe-to-escape scope of `e2` must be at least as large as that of `e1`

So, according to the above rules, the following code is fine:

readonly ref struct Span<T>
{
    readonly ref T _field;
    readonly int _length;

    public Span(ref T value)
    {
        // No problem: x is this, and the safe-to-escape scope of this and the
        // ref-safe-to-escape scope of value are both the calling method, which satisfies rule 1
        _field = ref value;
        _length = 1;
    }
}

Naturally, then, lifetimes need to be annotated on fields and parameters to help the compiler determine how far an object may escape.

When writing code we do not need to memorize all of the rules above, because lifetime annotations make everything explicit and intuitive.

scoped

In C# 11, the `scoped` keyword is used to restrict the escape scope:

Local variable s	Ref-safe-to-escape scope	Safe-to-escape scope
`Span<int> s`	Current method	Calling method
`scoped Span<int> s`	Current method	Current method
`ref Span<int> s`	Calling method	Calling method
`scoped ref Span<int> s`	Current method	Calling method
`ref scoped Span<int> s`	Current method	Current method
`scoped ref scoped Span<int> s`	Current method	Current method

Of these, `scoped ref scoped` is redundant because `ref scoped` already implies it. All we need to remember is that `scoped` limits the escape scope to the current method. Simple, isn't it?

This lets us annotate the escape scope (lifetime) of parameters:

Span<int> CreateSpan(scoped ref int v)
{
    // ...
}

With that, the earlier code becomes valid, because each argument is passed as `scoped ref`:

Span<int> Foo(int v)
{
    // 1
    return CreateSpan(ref v);

    // 2
    int local = 42;
    return CreateSpan(ref local);
    // 3
    Span<int> span = stackalloc int[42];
    return CreateSpan(ref span[0]);
}
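As a compilable illustration of the pattern above, here is a minimal sketch that assumes a C# 11 compiler. The body of `CreateSpan` (copying the value into a new array) and the class name `ScopedDemo` are assumptions made for this example; the article leaves the body unspecified.

```csharp
using System;

public static class ScopedDemo
{
    // `scoped` promises that v is not captured by the returned span.
    // Hypothetical body: the value is copied into a fresh heap array.
    public static Span<int> CreateSpan(scoped ref int v)
    {
        return new int[] { v };
    }

    public static Span<int> Foo(int v)
    {
        int local = v + 1;
        // Allowed: the argument is scoped, so the result cannot refer to local.
        return CreateSpan(ref local);
    }

    public static void Main()
    {
        Console.WriteLine(Foo(41)[0]); // prints 42
    }
}
```

Removing `scoped` from the parameter should make the compiler reject the `return` in `Foo`, since it would then have to assume that the result may refer to `local`.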

`scoped` can also be used on local variables:

Span<int> Foo()
{
    // Error, because span1 cannot escape the current method
    scoped Span<int> span1 = default;
    return span1;

    // No problem: the initializer's safe-to-escape scope is the calling method,
    // so span2 can escape to the calling method
    Span<int> span2 = default;
    return span2;

    // span3 and span4 are equivalent: the initializer's safe-to-escape scope is
    // the current method, so adding scoped makes no difference
    Span<int> span3 = stackalloc int[42];
    scoped Span<int> span4 = stackalloc int[42];
}

In addition, `this` in a `struct` also gets the escape scope of `scoped ref`: its ref-safe-to-escape scope is the current method and its safe-to-escape scope is the calling method.

What remains is the interaction with `out` and `in` parameters. In C# 11, `out` parameters default to `scoped ref`, while `in` parameters keep the default `ref` behavior:

ref int Foo(out int r)
{
    r = 42;
    return ref r; // Error, because the ref-safe-to-escape scope of r is the current method
}

This is very useful, for example, in the following common situation:

Span<byte> Read(Span<byte> buffer, out int read)
{
    // ..
}

Span<byte> Use()
{
    var buffer = new byte[256];
    // If the ref-safe-to-escape scope of out were left unchanged, this would be an error,
    // because the compiler would have to assume that read can be returned as a ref field
    // With the ref-safe-to-escape scope of out changed, there is no problem,
    // because the compiler no longer needs to consider that read can be returned as a ref field
    int read;
    return Read(buffer, out read);
}

Here are some more examples:

Span<int> CreateWithoutCapture(scoped ref int value)
{
    // Error, because the ref-safe-to-escape scope of value is the current method
    return new Span<int>(ref value);
}

Span<int> CreateAndCapture(ref int value)
{
    // No problem: the safe-to-escape scope of the result is limited to the
    // ref-safe-to-escape scope of value, which is the calling method
    return new Span<int>(ref value);
}

Span<int> ComplexScopedRefExample(scoped ref Span<int> span)
{
    // No problem, because the safe-to-escape scope of span is the calling method
    return span;

    // No problem: refLocal's ref-safe-to-escape scope is the current method and
    // its safe-to-escape scope is the calling method.
    // In the call to ComplexScopedRefExample it is passed to a scoped ref parameter,
    // so the compiler does not need to consider the ref-safe-to-escape scope when
    // computing the lifetime, only the safe-to-escape scope.
    // Therefore the safe-to-escape scope of the returned value is the calling method
    Span<int> local = default;
    ref Span<int> refLocal = ref local;
    return ComplexScopedRefExample(ref refLocal);

    // Error: stackLocal's ref-safe-to-escape scope and safe-to-escape scope are both
    // the current method, so the safe-to-escape scope of the returned value is the current method
    Span<int> stackLocal = stackalloc int[42];
    return ComplexScopedRefExample(ref stackLocal);
}

unscoped

In the above design, there is still a problem that has not been solved:

struct S
{
    int _field;

    // Error, because the ref-safe-to-escape scope of this is the current method
    public ref int Prop => ref _field;
}

Therefore `unscoped` is introduced, which extends the escape scope to the calling method, so the code above can be rewritten as:

struct S
{
    private int _field;
    // No problem, the ref-safe-to-escape scope has been extended to the calling method
    public unscoped ref int Prop => ref _field;
}

`unscoped` can also be placed directly on the `struct`:

unscoped struct S
{
    private int _field;
    public unscoped ref int Prop => ref _field;
}

Likewise, nested `struct`s work as well:

unscoped struct Child
{
    int _value;
    public ref int Value => ref _value;
}

unscoped struct Container
{
    Child _child;
    public ref int Value => ref _child.Value;
}

Also, if you need to restore the previous escape behavior of `out`, you can specify `unscoped` on the `out` parameter:

ref int Foo(unscoped out int r)
{
    r = 42;
    return ref r;
}

However, the design of the `unscoped` keyword is still at a preliminary stage and will not be provided in C# 11.
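For comparison, the released .NET 7 SDK exposes the same idea as an attribute, `System.Diagnostics.CodeAnalysis.UnscopedRefAttribute`, rather than as a keyword. The sketch below assumes a C# 11 compiler targeting .NET 7 or later; `UnscopedDemo` and its helper methods are hypothetical names added for illustration.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

public struct S
{
    private int _field;

    // The attribute plays the role of the proposed `unscoped` modifier.
    [UnscopedRef]
    public ref int Prop => ref _field;
}

public static class UnscopedDemo
{
    // Lets the out parameter be returned by reference again.
    public static ref int Foo([UnscopedRef] out int r)
    {
        r = 42;
        return ref r;
    }

    public static int ViaProperty()
    {
        S s = default;
        s.Prop = 7; // writes through the returned reference
        return s.Prop;
    }

    public static int ViaOut()
    {
        int x;
        ref int alias = ref Foo(out x);
        alias++; // alias refers to x
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(ViaProperty()); // 7
        Console.WriteLine(ViaOut());      // 43
    }
}
```

The returned references are still tied to the lifetime of the struct or variable they point into, so they can be used locally but cannot outlive `s` or `x`.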

ref struct constraint

As planned for C# 11, `ref struct` can also be used as a generic constraint, so you can write a method like this:

void Foo<T>(T v) where T : ref struct
{
    // ...
}

As a result, the capabilities of `Span<T>` are extended as well: a `Span<Span<T>>` becomes possible, which, used with `byte` or `char` for example, enables high-performance string processing.

Reflection

With all of the above in place, reflection naturally needs to support it too, so the reflection API also gains `ref struct`-related support.

Practical use cases

With the infrastructure above, we can use safe code to build some high-performance building blocks.

Fixed-length list on the stack

struct FrugalList<T>
{
    private T _item0;
    private T _item1;
    private T _item2;

    public int Count => 3;

    public unscoped ref T this[int index]
    {
        get
        {
            // A switch statement is used because switch expression arms cannot be ref expressions
            switch (index)
            {
                case 0: return ref _item0;
                case 1: return ref _item1;
                case 2: return ref _item2;
                default: throw new IndexOutOfRangeException("Out of range.");
            }
        }
    }
}

Linked list on the stack

ref struct StackLinkedListNode<T>
{
    private T _value;
    private ref StackLinkedListNode<T> _next;

    public T Value => _value;
    public bool HasNext => !(ref _next);
    public ref StackLinkedListNode<T> Next => HasNext ? ref _next : throw new InvalidOperationException("No next node.");
    public StackLinkedListNode(T value)
    {
        this = default;
        _value = value;
    }
    public StackLinkedListNode(T value, ref StackLinkedListNode<T> next)
    {
        _value = value;
        _next = ref next;
    }
}

Beyond these two examples, other things such as parsers and serializers, for example `Utf8JsonReader` and `Utf8JsonWriter`, can make use of these facilities as well.

Future plans

Advanced lifetimes

Although the lifetime design above covers most uses, it is still not flexible enough. It may therefore be extended in the future with advanced lifetime annotations. For example:

void M(scoped<'a> ref MyStruct s, scoped<'b> Span<int> span) where 'b >= 'a
{
     = span;
}

The method above declares two separate lifetimes, `'a` and `'b`, for the parameters `s` and `span`, and constrains `'b` to last at least as long as `'a`, so within this method `span` can safely be assigned into `s`.

Although this will not be included in C# 11, it may be added to C# later if developer demand for it grows.

Summary

That covers the improvements to `ref` and `struct` in C# 11 (or later). With this infrastructure, developers will be able to write high-performance code with no heap allocation overhead easily and in a safe way. Although these improvements directly benefit only the small number of developers who care deeply about performance, they will raise the overall quality and performance of the base library code built afterwards.

If you are worried that this will increase the complexity of the language, there is no need: most people will never use these features, and they affect only a small number of developers. Most people can keep writing the same code as before and simply enjoy what base library authors build with these facilities.
