Inlining is a common compiler optimization. Put simply, the compiler expands the body of a function at the place where it is called, which removes the overhead of the call itself (setting up the stack frame, copying parameters, and so on).
So what does it actually look like when a function or method is inlined?
Observing inlining
Suppose we have the following code:
// ValidateName verifies that the given username is legal.
//
//go:noinline
func ValidateName(name string) bool {
	// AX: string pointer, BX: string length
	if len(name) < 1 {
		return false
	} else if len(name) > 12 {
		return false
	}
	return true
}

//go:noinline
func (s *Server) CreateUser(name string, password string) error {
	if !ValidateName(name) {
		return errors.New("invalid name")
	}
	// ...
	return nil
}

type Server struct{}
For ease of understanding, I added a //go:noinline comment to both the function and the method. The Go compiler does not inline a function or method when it encounters this comment. Let's first look at the assembly instructions generated for this code when inlining is prohibited:
// ...
// ValidateName function
// At this point:
//   AX register: pointer to the data of the name string
//   BX register: length of the name string
TEXT /bootun/example/(SB) /bootun/example/user/
  :9    0x4602c0    MOVQ AX, 0x8(SP)    // Save the pointer of the name string to the stack (not used later)
  :10   0x4602c5    TESTQ BX, BX        // BX & BX, used to detect whether BX is 0, equivalent to: CMPQ 0, BX
  :10   0x4602c8    JE 0x4602d9         // If it is 0, jump to 0x4602d9
  :12   0x4602ca    CMPQ $0xc, BX       // Compare the constant 12 with the length of name
  :12   0x4602ce    JLE 0x4602d3        // If the length is less than or equal to 12, jump to 0x4602d3
  :13   0x4602d0    XORL AX, AX         // return false
  :13   0x4602d2    RET
  :15   0x4602d3    MOVL $0x1, AX       // return true
  :15   0x4602d8    RET
  :11   0x4602d9    XORL AX, AX         // return false
  :11   0x4602db    RET

// CreateUser method
TEXT /bootun/example/user.(*Server).CreateUser(SB) //bootun/example/user/
  // Some preparation before the function call is omitted (register assignment and other operations)
  :20   0x460300    CALL (SB)
  :20   0x460305    TESTL AL, AL
  :20   0x460307    JE 0x460317
  :24   0x460309    XORL AX, AX
  :24   0x46030b    XORL BX, BX
  :24   0x46030d    MOVQ 0x10(SP), BP
  :24   0x460312    ADDQ $0x18, SP
  :24   0x460316    RET
  :62   0x460317    LEAQ 0x9302(IP), AX
  :62   0x46031e    NOPW
  :62   0x460320    CALL (SB)
  // ...
Only the most important parts of the assembly are excerpted above: the ValidateName function and the CreateUser method.
It doesn't matter if you can't read the assembly. Just note that the CreateUser method contains a CALL instruction on the line marked :20, which shows that ValidateName is called inside the CreateUser method, exactly as in our source code.
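As a side note on the //go:noinline comment used above, here is a minimal sketch of how such a directive is usually written. The functions double and triple are hypothetical names chosen only for this illustration; the point is the placement and spelling of the directive, not the functions themselves.

```go
package main

import "fmt"

// double is small enough that the compiler would normally consider it for
// inlining. The directive below is written as "//go:noinline" with no space
// after the slashes, in the comment block right before the declaration.
//
//go:noinline
func double(x int) int {
	return x * 2
}

// triple (hypothetical) carries no directive, so the compiler is free to
// inline it if it decides the function is cheap enough.
func triple(x int) int {
	return x * 3
}

func main() {
	// Both calls compute the same kind of result; only the generated
	// machine code differs (a real CALL for double, possibly none for triple).
	fmt.Println(double(21), triple(14))
}
```

The directive changes only how the compiler treats the function; the behavior of the program stays the same.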
Now let's remove the //go:noinline comment from the ValidateName function in the source code, compile again, and check the generated assembly instructions:
If you want to follow along with the code in this article, do not delete the //go:noinline comment on the CreateUser method. CreateUser in this example is so short that the compiler would inline it as well, which would make the experiment harder to observe.
// CreateUser method
// At this point:
//   AX register: the method receiver, i.e. the *Server
//   BX register: pointer to the data of the name string
//   CX register: length of the name string
TEXT /bootun/example/user.(*Server).CreateUser(SB) //bootun/example/user/
  // ...
  :18   0x4602d4    MOVQ BX, 0x28(SP)      // Save the pointer of the name string to the stack
  :19   0x4602d9    TESTQ CX, CX           // Check whether the length of name is 0
  :9    0x4602dc    JE 0x4602e6            // If it is 0, jump to 0x4602e6
  :9    0x4602de    NOPW
  :11   0x4602e0    CMPQ $0xc, CX          // Compare the constant 12 with the length of the string
  :11   0x4602e4    JLE 0x460318           // If the length is less than or equal to 12, jump to 0x460318 and continue (name is legal)
  :62   0x4602e6    LEAQ 0x9333(IP), AX    // Construct the error to return
  :62   0x4602ed    CALL (SB)
  :62   0x4602f2    MOVQ $0xc, 0x8(AX)
  // ...
  :23   0x460318    XORL AX, AX            // AX = 0
  :23   0x46031a    XORL BX, BX            // BX = 0
  :23   0x46031c    MOVQ 0x10(SP), BP      // Restore the BP register
  :23   0x460321    ADDQ $0x18, SP         // Advance the stack pointer to release the stack frame
  :23   0x460325    RET                    // return
  // ...
Looking at the code this time, we find that the logic of the ValidateName function has been expanded directly inside the CreateUser method. We can no longer find any ValidateName-related symbols in the generated assembly either. The code is now equivalent to:
func (s *Server) CreateUser(name string, password string) error {
	if len(name) < 1 {
		return errors.New("invalid name")
	} else if len(name) > 12 {
		return errors.New("invalid name")
	}
	return nil
}
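The equivalence claimed above can be checked by hand. The following is a minimal sketch, assuming simplified stand-ins for the article's code: validateName, createUserCall, createUserInlined, and sameResult are hypothetical names, and the Server receiver and password parameter are dropped for brevity. It compares the call-based version with the hand-expanded version on a few inputs.

```go
package main

import (
	"errors"
	"fmt"
)

// validateName mirrors the article's ValidateName: 1 to 12 bytes is legal.
func validateName(name string) bool {
	if len(name) < 1 {
		return false
	} else if len(name) > 12 {
		return false
	}
	return true
}

// createUserCall mirrors CreateUser as written in the source: it calls
// validateName.
func createUserCall(name string) error {
	if !validateName(name) {
		return errors.New("invalid name")
	}
	return nil
}

// createUserInlined is the hand-expanded form, with the body of
// validateName pasted in at the call site.
func createUserInlined(name string) error {
	if len(name) < 1 {
		return errors.New("invalid name")
	} else if len(name) > 12 {
		return errors.New("invalid name")
	}
	return nil
}

// sameResult reports whether both versions accept or reject name alike.
func sameResult(name string) bool {
	return (createUserCall(name) == nil) == (createUserInlined(name) == nil)
}

func main() {
	// Empty, typical, exactly 12 bytes, and 13 bytes.
	for _, name := range []string{"", "bootun", "exactly12chr", "thirteenchars"} {
		fmt.Printf("%q: agree=%v\n", name, sameResult(name))
	}
}
```

Inlining is meant to preserve behavior like this; what changes is that the call and its overhead disappear from the generated code.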
What kind of function will be inlined?
The inlining code lives in cmd/compile/internal/inline/, which is part of the compiler. There is a comment at the top of the file that summarizes the controls and rules of inlining well:
// The flag controls the aggressiveness. Note that main() swaps level 0 and 1,
// making 1 the default and -l disable. Additional levels (beyond -l) may be buggy and
// are not supported.
//      0: disabled
//      1: 80-nodes leaf functions, oneliners, panic, lazy typechecking (default)
//      2: (unassigned)
//      3: (unassigned)
//      4: allow non-leaf functions
//
// At some point this may get another default and become switch-offable with -N.
//
// The -d typcheckinl flag enables early typechecking of all imported bodies,
// which is useful to flush out bugs.
//
// The flag enables diagnostic output. a single -m is useful for verifying
Let’s summarize the core part of the above passage:
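- At the default level, leaf functions of up to 80 nodes, one-liners, and panic calls are candidates for inlining (with lazy typechecking)
- Use -l to stop the compiler from inlining (it is often combined with -N, which disables optimizations)
- Use -m to enable diagnostic output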
That is to say, as long as our functions and methods are small enough, they may be inlined. For this reason, many people compose several small functions instead of writing one large block of code in order to improve performance. For example, the mutex we often use (Mutex in the standard library's sync package) takes advantage of this. The Lock method we usually call is only a few lines long:
func (m *Mutex) Lock() {
	// Fast path: grab unlocked mutex.
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		if race.Enabled {
			race.Acquire(unsafe.Pointer(m))
		}
		return
	}
	// Slow path (outlined so that the fast path can be inlined)
	m.lockSlow()
}
Note the comment on the third-to-last line: "outlined so that the fast path can be inlined". Thanks to this, the fast path of Lock can be inlined into our program without an extra function call, which improves the performance of the code.
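The same fast-path/slow-path split can be applied to our own code. Below is a minimal sketch, not taken from the standard library: lazyConfig, Get, and getSlow are hypothetical names, and the "expensive computation" is a placeholder. Whether Get is actually inlined depends on the compiler version and its cost model, so treat this as an illustration of the pattern rather than a guarantee.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lazyConfig is a hypothetical type used only for illustration. It computes
// its value once, on first use.
type lazyConfig struct {
	ready uint32     // set to 1 once value has been computed
	mu    sync.Mutex // serializes the slow path
	value int
}

// Get is the fast path: one atomic load and a return. Keeping it this small
// is what makes it a plausible inlining candidate.
func (c *lazyConfig) Get() int {
	if atomic.LoadUint32(&c.ready) == 1 {
		return c.value
	}
	// Slow path, outlined into its own method so it does not bloat Get.
	return c.getSlow()
}

// getSlow holds the rarely executed initialization work.
func (c *lazyConfig) getSlow() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.ready == 0 {
		c.value = 42 // stand-in for an expensive computation
		atomic.StoreUint32(&c.ready, 1)
	}
	return c.value
}

func main() {
	var c lazyConfig
	// The first call takes the slow path, the second one the fast path.
	fmt.Println(c.Get(), c.Get())
}
```

The design choice is the same as in Lock: the common case stays tiny, and everything heavy (locking, defer, initialization) is moved out of the function that callers hit on every call.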
The inlining logic starts from one entry function in the compiler. If you want a deeper understanding, you can go and read it.
How much performance improvement can inline bring to my program?
We have talked a lot about inlining, and even the standard library deliberately relies on it to improve the performance of Go programs. So how much of a performance improvement can inlining actually bring us?
Let's expand the example mentioned at the beginning of the article:
package user

import (
	"errors"
)

func ValidateName(name string) bool {
	if len(name) < 1 {
		return false
	} else if len(name) > 12 {
		return false
	}
	return true
}

//go:noinline
func ValidateNameNoInline(name string) bool {
	if len(name) < 1 {
		return false
	} else if len(name) > 12 {
		return false
	}
	return true
}

func (s *Server) CreateUser(name string, password string) error {
	if !ValidateName(name) {
		return errors.New("invalid name")
	}
	return nil
}

// CreateUserNoInline uses the version of ValidateName that prohibits inlining.
func (s *Server) CreateUserNoInline(name string, password string) error {
	if !ValidateNameNoInline(name) {
		return errors.New("invalid name")
	}
	return nil
}

type Server struct{}
We copied the ValidateName function, marked the copy with //go:noinline to keep the compiler from inlining it, and renamed it to ValidateNameNoInline. We also copied the CreateUser method. The new method uses ValidateNameNoInline internally to verify the name parameter. Apart from that, it is identical to the original method.
Let's write two Benchmark tests:
package user

import "testing"

// BenchmarkCreateUser measures the version whose validation call can be inlined.
func BenchmarkCreateUser(b *testing.B) {
	srv := Server{}
	for i := 0; i < b.N; i++ {
		if err := srv.CreateUser("bootun", "123456"); err != nil {
			b.Fatalf("err: %v", err)
		}
	}
}

// BenchmarkValidateNameNoInline measures the version whose validation call is not inlined.
func BenchmarkValidateNameNoInline(b *testing.B) {
	srv := Server{}
	for i := 0; i < b.N; i++ {
		if err := srv.CreateUserNoInline("bootun", "123456"); err != nil {
			b.Fatalf("err: %v", err)
		}
	}
}
The test results are as follows:
# Benchmark results for the inlined version (BenchmarkCreateUser)
goos: windows
goarch: amd64
pkg: /bootun/example/user
cpu: AMD Ryzen 7 6800H with Radeon Graphics
BenchmarkCreateUser
BenchmarkCreateUser-16 1000000000 0.2279 ns/op
PASS
# Benchmark results for the non-inlined version (BenchmarkValidateNameNoInline)
goos: windows
goarch: amd64
pkg: /bootun/example/user
cpu: AMD Ryzen 7 6800H with Radeon Graphics
BenchmarkValidateNameNoInline
BenchmarkValidateNameNoInline-16 733243102 1.635 ns/op
PASS
With inlining prohibited, each operation takes about 1.6 nanoseconds, while with inlining it takes only about 0.23 nanoseconds (the numbers vary from machine to machine), roughly a sevenfold difference. Proportionally, the benefit of inline optimization is considerable.
Do I need to do anything to enable inline optimization?
No. In the Go compiler, inline optimization is enabled by default. If your function meets the inlining rules described in this article (for example, it is very small) and inlining is not explicitly disabled, the compiler may inline it.
In some scenarios we may not want functions to be inlined (for example, when debugging with dlv, or when inspecting the assembly code generated for a program). In that case, you can use go build -gcflags='-N -l' to disable optimizations and inlining.
Code optimized by the compiler with its default settings can be hard to read and understand, which makes it inconvenient to debug and study.
-gcflags passes command-line flags to gc, the Go compiler. go build does a lot behind the scenes and runs more than just the gc program. Use go build -x to view the detailed steps of the build process.
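To see which of your own functions the compiler decides to inline, the -m diagnostic flag mentioned earlier can be passed through -gcflags in the same way. The following is a small sketch to experiment with; add and addNoInline are hypothetical names, and the exact wording of the diagnostics may differ between Go versions.

```go
// main.go
package main

import "fmt"

// add is a tiny leaf function, a typical inlining candidate.
func add(a, b int) int {
	return a + b
}

// addNoInline has the same body but opts out of inlining.
//
//go:noinline
func addNoInline(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(1, 2), addNoInline(1, 2))
}

// Build with inlining diagnostics enabled:
//
//	go build -gcflags=-m .
//
// The compiler prints its decisions while building; look for messages such
// as "can inline add" and "inlining call to add".
//
// Build with optimizations and inlining disabled, as described above:
//
//	go build -gcflags='-N -l' .
```

Comparing the diagnostics for add and addNoInline is a quick way to confirm that a //go:noinline comment is being honored.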