One important reason we chose the Go language is its very high performance. However, its reflection performance has long been criticized. In this article, let's take a look at the performance problems of Go reflection.
Performance testing of go
Before we start, it is worth understanding how performance tests work in Go. Writing a benchmark in Go is very simple: give the test function a `Benchmark` prefix, then use `b.N` as the loop bound in the function body, and you get the time spent per iteration. As in the following example:
```go
func BenchmarkNew(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		New()
	}
}
```
We can run this benchmark with the command `go test -bench=. reflect_test.go`, or, if you use GoLand, just click the run button.
Explanation:

- In `*_test.go` files, functions with the `Benchmark*` prefix are benchmark functions; their parameter is of type `*testing.B`.
- `b.ReportAllocs()`: reports the number of memory allocations. This is a very important metric, because memory allocation is a time-consuming operation compared to simple CPU calculations. In performance testing we need to pay attention to both the number of allocations and the size of each allocation.
- `b.N` is the loop count; each iteration executes the `New()` function, and the time taken per iteration is recorded.
Many optimizations in Go aim to reduce memory allocation, and in many cases this alone improves performance.
Output:
BenchmarkNew-20 1000000000 0.1286 ns/op 0 B/op 0 allocs/op
Output description:

- `BenchmarkNew-20`: `BenchmarkNew` is the benchmark function name; `-20` is the number of CPU cores used.
- `1000000000`: the number of iterations.
- `0.1286 ns/op`: the time spent per iteration, in nanoseconds. Each iteration took 0.1286 nanoseconds.
- `0 B/op`: the number of bytes allocated per iteration. Here, no memory was allocated.
- `0 allocs/op`: the number of memory allocations per iteration. Again, none.
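For contrast, here is a minimal benchmark whose body allocates on every iteration, so the `B/op` and `allocs/op` columns come out non-zero. This is a hypothetical example, not code from the article; it also shows `testing.Benchmark`, which lets you run a benchmark function outside `go test`:

```go
package main

import (
	"fmt"
	"testing"
)

// BenchmarkSprintf allocates at least the result string on every
// iteration, so the report will show non-zero B/op and allocs/op.
func BenchmarkSprintf(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprintf("id-%d", i)
	}
}

func main() {
	// testing.Benchmark runs a benchmark outside of `go test`.
	res := testing.Benchmark(BenchmarkSprintf)
	fmt.Printf("%d allocs/op\n", res.AllocsPerOp())
}
```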
Why reflection is slow
The flexibility of dynamic languages comes at the expense of performance, and Go is no exception. Go's `interface{}` provides a degree of flexibility, but handling `interface{}` values incurs some performance loss.
We all know that Go is a statically typed language, which means all types are known at compile time, not only at runtime. But Go has an `interface{}` type that can represent any type, which means the concrete type is only known at runtime. Essentially, `interface{}` is still a static type, but its dynamic type and value vary. An `interface{}` value stores two pointers: one pointing to the type information and one pointing to the value. For details, please refer to "go interface design and implementation".
The flexibility brought by Go's interface{}
The `interface{}` type gives Go some characteristics of dynamic languages: if we define a function whose parameter is `interface{}`, we can pass a value of any type to it. For example, the following function adds any two integers and returns an `int64`:
```go
func convert(i interface{}) int64 {
	typ := reflect.TypeOf(i)
	switch typ.Kind() {
	case reflect.Int:
		return int64(i.(int))
	case reflect.Int8:
		return int64(i.(int8))
	case reflect.Int16:
		return int64(i.(int16))
	case reflect.Int32:
		return int64(i.(int32))
	case reflect.Int64:
		return i.(int64)
	default:
		panic("not support")
	}
}

func add(a, b interface{}) int64 {
	return convert(a) + convert(b)
}
```
Explanation:

- `convert()`: converts an `interface{}` value to `int64`, panicking for non-integer types. (Not very rigorous, of course: the `uint*` types are not covered.)
- `add()`: adds any two integers and returns an `int64`.
In contrast, if the type is known, we do not need any type judgment at all; we just add directly:
```go
func add1(a, b int64) int64 {
	return a + b
}
```
We can compare it with the following benchmark:
```go
func BenchmarkAdd(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		add(1, 2)
	}
}

func BenchmarkAdd1(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		add1(1, 2)
	}
}
```
result:
BenchmarkAdd-12 179697526 6.667 ns/op 0 B/op 0 allocs/op
BenchmarkAdd1-12 1000000000 0.2353 ns/op 0 B/op 0 allocs/op
We can see a very obvious performance gap: `add()` is much slower than `add1()`, and this involves only some simple type judgments and type conversions.
The cost of Go's flexibility (why it is slow)
Through this example we know that although Go's `interface{}` provides us with some flexibility, using this dynamic feature comes at a cost, such as:
- We only know the type at runtime, so type judgment must happen at runtime (that is, through reflection). This judgment has overhead: a value that was originally a known type may now have to be matched against more than 20 kinds to determine what it is. After its kind is determined, it usually still has to be converted to a concrete type, which is another cost.
- We may also need to look up fields and methods (`Field`, `FieldByName`, `Method`, `MethodByName`). These operations happen at runtime, so there is a certain performance loss. Moreover, lookup performance depends on the number of fields and methods: the more there are, the slower the search. Lookup by index (`Field`, `Method`) is much faster than lookup by name (`FieldByName`, `MethodByName`); the latter also performs memory allocation.
- When we do these operations through reflection, there are many extra steps. For example, two simple `int` values could have been added directly, but through reflection we must first create a reflection object from the `interface{}`, then judge the type, then convert it, and only then add.
In general, although Go's `interface{}` type provides us with flexibility and lets developers implement some dynamic-language features in Go, this flexibility comes at the expense of performance. It makes simple operations complicated: on the one hand, the generated instructions can be dozens of times more numerous; on the other hand, memory allocation may occur along the way (for example, in `FieldByName`).
Slow is relative
From the example above, Go's reflection can look unbearably slow, and people have proposed solutions such as avoiding runtime reflection through code generation, thereby improving performance. easyjson, for example, takes this approach.
But such solutions make the code more complicated, so we need to weigh the trade-offs before deciding. Why? Because although reflection is slow, if our application makes network calls, any network call usually takes no less than 1ms, and that 1ms is enough time for a great many reflection operations. What does this tell us? If we are not building middleware or some high-performance service, but an ordinary web application, we should first ask whether reflection is actually the performance bottleneck. If it is, code generation is worth considering. If it is not, do we really need to sacrifice the maintainability and readability of the code to speed up reflection? Wouldn't optimizing a few slow queries bring a higher return?
Go reflection performance optimization

If you can, the best optimization is to not use reflection at all.
Avoid reflection operations during serialization and deserialization through code generation
Let's take `easyjson` as an example and see how it works. Suppose we have the following struct that we need to serialize to and deserialize from JSON:
```go
type Person struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}
```
With `easyjson`, we need to generate code for the struct using the `easyjson` command-line tool:
```shell
easyjson -all
```
This generates a `person_easyjson.go` file in the current directory containing `MarshalJSON` and `UnmarshalJSON` methods, which are the serialization and deserialization methods we need. Unlike the standard library's `json.Marshal` and `json.Unmarshal`, these two methods do not use reflection, so their performance is much better than the standard library's.
```go
func easyjsonDb0593a3EncodeGithubComGinGonicGinCEasy(out *jwriter.Writer, in Person) {
	out.RawByte('{')
	first := true
	_ = first
	{
		const prefix string = ",\"name\":"
		out.RawString(prefix[1:])
		out.String(string(in.Name))
	}
	{
		const prefix string = ",\"age\":"
		out.RawString(prefix)
		out.Int(int(in.Age))
	}
	out.RawByte('}')
}

// MarshalJSON supports the json.Marshaler interface
func (v Person) MarshalJSON() ([]byte, error) {
	w := jwriter.Writer{}
	easyjsonDb0593a3EncodeGithubComGinGonicGinCEasy(&w, v)
	return w.Buffer.BuildBytes(), w.Error
}
```
As we can see, serializing `Person` now takes only a few lines of code to invoke. But the approach also has an obvious disadvantage: a lot of generated code.
Performance Gap:
goos: darwin
goarch: amd64
pkg: /gin-gonic/gin/c/easy
cpu: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
BenchmarkJson
BenchmarkJson-12 3680560 305.9 ns/op 152 B/op 2 allocs/op
BenchmarkEasyJson
BenchmarkEasyJson-12 16834758 71.37 ns/op 128 B/op 1 allocs/op
We can see that the serialization performance of the code generated by `easyjson` is much better than that of the standard library, more than 4 times faster.
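To see why the generated code is faster, here is a hand-written stand-in for easyjson's output (a simplified sketch; real easyjson code uses its `jwriter` package). The field names are baked in at compile time, so no reflective field lookup happens at runtime:

```go
package main

import (
	"fmt"
	"strconv"
)

type Person struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// MarshalJSON builds the JSON by hand, appending bytes directly:
// no reflection, and a single allocation for the buffer.
func (p Person) MarshalJSON() ([]byte, error) {
	buf := make([]byte, 0, 32)
	buf = append(buf, `{"name":`...)
	buf = strconv.AppendQuote(buf, p.Name)
	buf = append(buf, `,"age":`...)
	buf = strconv.AppendInt(buf, int64(p.Age), 10)
	buf = append(buf, '}')
	return buf, nil
}

func main() {
	p := Person{Name: "foo", Age: 10}
	out, _ := p.MarshalJSON()
	fmt.Println(string(out)) // {"name":"foo","age":10}
}
```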
Reflection result cache
This method is suitable for scenarios where you need to find structure fields or methods based on your name.
Suppose we have a struct `Person` with 5 methods, `M1`, `M2`, `M3`, `M4`, `M5`, and we need to look up one of these methods by name. We can use the `reflect` package:
```go
p := &Person{}
v := reflect.ValueOf(p)
v.MethodByName("M4")
```
This is the approach that comes to mind most easily, but how does it perform? Benchmarking shows that its performance is very poor:
```go
func BenchmarkMethodByName(b *testing.B) {
	p := &Person{}
	v := reflect.ValueOf(p)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		v.MethodByName("M4")
	}
}
```
result:
BenchmarkMethodByName-12 5051679 237.1 ns/op 120 B/op 3 allocs/op
In contrast, if we fetch the method by index, the performance is much better:
```go
func BenchmarkMethod(b *testing.B) {
	p := &Person{}
	v := reflect.ValueOf(p)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		v.Method(3)
	}
}
```
result:
BenchmarkMethod-12 200091475 5.958 ns/op 0 B/op 0 allocs/op
We can see that the two differ in performance by dozens of times. So can we replace `MethodByName` with `Method` to obtain better performance? The answer is yes: we can cache the result of `MethodByName` (that is, the index corresponding to the method name) and then fetch the method directly by that index in later reflection calls. Note that we still use `MethodByName` on the type once, to discover the index:
```go
// indexCache caches the method index for each method name.
var indexCache = make(map[string]int)

func methodIndex(p interface{}, method string) int {
	if _, ok := indexCache[method]; !ok {
		m, ok := reflect.TypeOf(p).MethodByName(method)
		if !ok {
			panic("method not found!")
		}
		indexCache[method] = m.Index
	}
	return indexCache[method]
}
```
Performance Test:
```go
func BenchmarkMethodByNameCache(b *testing.B) {
	p := &Person{}
	v := reflect.ValueOf(p)
	b.ReportAllocs()
	var idx int
	for i := 0; i < b.N; i++ {
		idx = methodIndex(p, "M4")
		v.Method(idx)
	}
}
```
result:
```
// Nearly 20 times faster than the original MethodByName
BenchmarkMethodByNameCache-12    86208202    13.65 ns/op      0 B/op    0 allocs/op
BenchmarkMethodByName-12          5082429    235.9 ns/op    120 B/op    3 allocs/op
```
The `Field`/`FieldByName` methods are similar to this example and can be optimized the same way. This may be an even more common operation: deserialization may need to look up a field by its name and then assign to it.
Use type assertions instead of reflection
In actual use, if you only need simple type judgments, such as determining whether a value implements a certain interface, you can use a type assertion:
```go
type Talk interface {
	Say()
}

type person struct{}

func (p person) Say() {}

func BenchmarkReflectCall(b *testing.B) {
	p := person{}
	v := reflect.ValueOf(&p)
	for i := 0; i < b.N; i++ {
		idx := methodIndex(&p, "Say")
		v.Method(idx).Call(nil)
	}
}

func BenchmarkAssert(b *testing.B) {
	p := person{}
	for i := 0; i < b.N; i++ {
		var inter interface{} = p
		if v, ok := inter.(Talk); ok {
			v.Say()
		}
	}
}
```
result:
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
BenchmarkReflectCall-12 6906339 173.1 ns/op
BenchmarkAssert-12 171741784 6.922 ns/op
In this example, even with the cached version of reflection, the performance is nearly 25 times worse than the type assertion.

So before using reflection, we should consider whether the goal can be achieved with a type assertion; if so, there is no need to use reflection.
Summary
Go provides performance testing tools: we can run benchmarks with the command `go test -bench=.`. After running it, the `Benchmark*` functions in the folder's test files are executed.
In benchmark results, besides the average execution time, there are also the number of memory allocations and the number of bytes allocated, which are metrics we need to pay attention to. They can be reported by calling `b.ReportAllocs()`. The fewer the allocations and the fewer bytes allocated, the better the performance.
Although reflection is slow, it also brings some flexibility. Its slowness is mainly caused by the following reasons:
- Type judgment is required at runtime. Compared with a known type, the runtime may need to choose among more than 20 kinds.
- After the type judgment, the `interface{}` value often needs to be converted to a concrete type, and this conversion also takes time.
- The search for methods and fields also takes a certain amount of time. Especially
FieldByName
,MethodByName
In this method, they need to traverse all fields and methods and then compare them. The comparison process also takes a certain amount of time. Moreover, this process also requires allocating memory, which will further reduce performance.
Whether something is slow is relative. If our application spends most of its time waiting for IO, the performance of reflection will most likely not become a bottleneck. Optimizing elsewhere may bring greater benefits, and we can also choose reflection methods with lower time and space cost without hurting the maintainability of the code, such as using `Field` instead of `FieldByName`.
If possible, avoid using reflection at all.
Some performance optimization methods for reflection are as follows (not completely, and optimization needs to be done according to actual conditions):
- Use the method of generating code to generate specific serialization and deserialization methods, so that the overhead of reflection can be avoided.
- Cache the result of the first reflection lookup, so that later reflective calls can use the cached result directly and avoid the repeated overhead (trading space for time).
- If you just need to make a simple type judgment, you can first consider whether type assertion can achieve the effect we want, which is much less expensive than reflection.
Reflection is a huge topic. Here we just briefly introduce a small number of reflection performance issues and discuss some feasible optimization solutions. However, the scenarios used by everyone are different, so optimization needs to be done based on actual conditions.
This concludes this article on understanding why Go reflection is slow. For more related `reflect` content, please search my previous articles or continue browsing the related articles below. I hope you will continue to support me!