Detailed explanation of the implementation principles and use of generics in Go language

Preface

Original text: A gentle introduction to generics in Go by Dominik Braun

Wan Junfeng Kevin: After reading it, I thought the article was very simple and easy to understand, so I asked the author's consent and translated it and shared it with you.

This article is an easy-to-understand introduction to the basic ideas of generics and their implementation in Go, and is also a simple summary of various performance discussions around generics. First, let’s take a look at the core problems solved by generics.

question

Suppose we want to implement a simpletreeData structure. Each node holds a value. Prior to Go 1.18, a typical method of implementing this structure is as follows.

type Node struct {
    value interface{}
}

This works well in most cases, but it also has some downsides.

first,interface{}It can be anything. If we want to limitvaluePossible types, such as integers and floating-point numbers, we can only check this limit at runtime.

func (n Node) IsValid() bool {
    switch .(type) {
        case int, float32, float64:
            return true
        default:
            return false
    }
}

This is not possible to limit types at compile time. Type judgments like the above are very common in many Go libraries. Here are examples from the go-zero project.

Second, processing values in Node is very cumbersome and error-prone. Doing anything with a value involves some type of assertion, even if you can safely assume that the value holds oneintvalue.

number, ok := .(int)
if !ok {
    // ...
}

double := number * 2

These are just usedinterface{}Some of the inconveniences that it does not provide type safety and may cause difficult to recover runtime errors.

Solution

We do not intend to accept any data type or specific type, but define aTthe placeholder type as the type of value. Please note that this code will not be compiled yet.

type Node[T] struct {
    value T
}

First, you need to declare the generic typeT, this is used in square brackets after the structure or function name.

TCan be any type only if instantiating aNodehour,TIt will be deduced to this type.

n := Node[int]{
    value: 5,
}

GenericsNodeInstantiated asNode[int](Integer Node), soTIt's oneint。

Type constraints

In the above implementation,TThe declaration lacks one necessary information: type constraints.

Type constraints for further limitations can be used asTpossible types of . Go itself provides some predefined type constraints, but custom type constraints can also be used.

type Node[T any] struct {
    value T
}

Any type (any) constraints allowTActually any type. If the node values need to be compared, there is acomparableType constraint, types that satisfy this predefined constraint can be used==Make a comparison.

type Node[T comparable] struct {
    value T
}

Any type can be used as a type constraint. Go 1.18 introduces a newinterfaceSyntax, can be embedded in other data types.

type Numeric interface {
    int | float32 | float64
}

This means that an interface can define not only a set of methods, but also a set of types. useNumericThe interface is a type constraint, meaning that the value can be an integer or a floating point number.

type Node[T Numeric] struct {
    value T
}

Retrieve type safety

Relative to useinterface{}, the huge advantage of generic type parameters is thatTThe final type of the ,being derived at compile time. forTDefine a type constraint that completely eliminates runtime checks. If used asTIf the type of the code does not satisfy the type constraint, the code will not be compiled and passed.

When writing generic code, you can already knowTThe final type is the same as the code.

func (n Node[T]) Value() T {
    return 
}

The above function returns, its type isT. Therefore, the return value isT,ifTis an integer, then the return type is knownint. Therefore, the return value can be used directly as an integer without any type assertion.

n := Node[int]{
    value: 5,
}

double := () * 2

Recovering type safety at compile time makes Go code more reliable and less prone to errors.

Generic usage scenarios

existIan Lance TaylorThe typical usage scenarios of generics are listed in When To Use Generics, which boil down to three main situations:

Use built-in container types, such asslices、mapsandchannels
Implement common data structures, such aslinked listortree
Write a function whose implementation is the same for many types, such as a sorting function

Generally speaking, when you don't want to make assumptions about the content of the value you are operating on, you can consider using generics. In our exampleNodeNot too concerned about the value it holds.

Generics are not a good choice when different types have different implementations. Also, don'tRead(r )This interface function signature is changed toRead[T ](r T)Such a universal signature.

performance

To understand the performance of generics and their implementation in Go, you first need to understand the two most common ways to implement generics in general.

This is a brief introduction to the in-depth study of various performances and the discussions surrounding them. You are most likely not very concerned about the performance of generics in Go.

Virtual method table

One way to implement generics in a compiler is to useVirtual Method Table. Generic functions are modified to accept only pointers as arguments. These values are then allocated to the heap, and pointers to these values are passed to the generic function. This is done because the pointer always looks the same, regardless of the type it points to.

If these values are objects and generic functions need to call methods of these objects, it can't do this anymore. The function has only one pointer to the object and does not know where their method is. Therefore, it requires a table that can query the memory address of the method:Virtual Method Table. This so-called dynamic scheduling has been used by interfaces in languages such as Go and Java.

Virtual Method TableIt can be used not only to implement generics, but also to implement other types of polymorphisms. However, deriving these pointers and calling virtual functions is slower than directly calling functions, and usesVirtual Method TableWill prevent the compiler from optimizing.

Singletonization

An easier way is to singletonize (Monomorphization), the compiler generates a copy of the generic function for each called data type.

func max[T Numeric](a, b T) T {
    // ...
}

larger := max(3, 5)

Since the max function shown above is called with two integers, the compiler will beintGenerate amaxcopy of .

func maxInt(a, b int) int {
    // ...
}

larger := maxInt(3, 5)

The biggest advantage is thatMonomorphizationThe runtime performance is significantly better than that of useVirtual Method Table. Direct method calls are not only more efficient, but also can apply to the entire compiler optimization chain. However, the cost of this is compile time, and it is time-consuming to generate copies of generic functions for all related types.

Go implementation

Which of these two methods is most suitable for Go? Quick compilation is important, but runtime performance is also important. To meet these requirements, the Go team decided to mix two approaches when implementing generics.

GoMonomorphization, but attempts to reduce the number of copies of functions that need to be generated. Instead of creating a copy for each type, it generates a copy for each layout in memory:int、float64、Nodeand other so-called"Value Type"It looks different in memory, so a generic function will copy a copy for all these types.

Contrary to value types, pointers and interfaces always have the same layout in memory. The compiler will generate a copy of the generic function for the call to the pointer and interface. Just likeVirtual Method TableLikewise, generic functions receive pointers, so a table is needed to dynamically find method addresses. The performance characteristics of dictionaries and virtual method tables in Go implementations are the same.

in conclusion

The advantage of this mixed method is that you get it in the call using the value typeMonomorphizationperformance advantages, but only pay for calls using pointers or interfacesVirtual Method Tablecost.

What is often overlooked in performance discussions is that all these benefits and costs only involve the calls of functions. Normally, most of the execution time is used inside the function. The performance overhead of calling methods may not become a performance bottleneck. Even so, we must consider optimizing the function implementation first and then considering the call overhead.

This is the article about detailed explanation of the implementation principles and usage of generics in Go. For more related content on Go, please search for my previous articles or continue browsing the related articles below. I hope you will support me in the future!