SoFunction
Updated on 2025-03-05

Golang pprof monitoring memory block mutex statistics principle analysis

Introduction

In the previous article, golang pprof monitoring series (2) —— memory, block, mutex use, I explained how these three performance indicators are exposed in a program and what each of them monitors. I also mentioned that memory, block, and mutex are grouped together because their statistics principles are very similar. Today let's see how they are actually counted.

Let's start with the conclusion: all three indicators are recorded in the runtime through a structure called a bucket. Each bucket holds a pointer to the next bucket, which forms a linked list of buckets. Every time memory is allocated, or every time a blocking event occurs, the runtime decides whether to create a new bucket to record the event.

Let’s take a look at what information is in the bucket.

Introduction to bucket structure

// src/runtime/mprof.go:48
type bucket struct {
	next    *bucket
	allnext *bucket
	typ     bucketType // memBucket or blockBucket (includes mutexProfile)
	hash    uintptr
	size    uintptr
	nstk    uintptr
}

Let's go through the bucket structure field by field. First there are two pointers: a next pointer and an allnext pointer. The allnext pointer links all buckets of the same indicator type into a list: whenever a new bucket is created to record allocation or blocking information, its allnext is set to the current head of that list, and the head variable is updated to point to the new bucket.

The heads of these linked lists are stored in global variables, as shown in the following code:

// src/runtime/mprof.go:140
var (
	mbuckets  *bucket // memory profile buckets
	bbuckets  *bucket // blocking profile buckets
	xbuckets  *bucket // mutex profile buckets
	buckhash  *[179999]*bucket
)

Each indicator type has its own list head variable: mbuckets heads the list of memory buckets, bbuckets the list of block buckets, and xbuckets the list of mutex buckets.

There is also a buckhash table here. No matter which indicator type, every bucket that gets created is also stored in buckhash. The way buckhash resolves hash collisions is to chain the colliding buckets together through pointers into a linked list, and that chain is exactly the next pointer mentioned above.
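The chaining scheme can be sketched with a tiny model. To be clear, this is not the runtime's code: `tableSize`, `insert`, and the trimmed-down bucket struct here are simplifications for illustration (the real table has 179999 slots and hashes the captured stack).

```go
package main

import "fmt"

// A simplified model of the runtime's buckhash: a fixed-size table where
// colliding buckets are chained through a next pointer (separate chaining).
const tableSize = 7 // stand-in; the real runtime table has 179999 slots

type bucket struct {
	next *bucket // chains buckets whose hashes land in the same slot
	hash uintptr
	size uintptr
}

var buckhash [tableSize]*bucket

// insert returns the existing bucket for (hash, size) if one is already
// chained in its slot, otherwise prepends a new bucket to the chain,
// mirroring how the runtime's stkbucket reuses or creates buckets.
func insert(hash, size uintptr) *bucket {
	i := hash % tableSize
	for b := buckhash[i]; b != nil; b = b.next {
		if b.hash == hash && b.size == size {
			return b // same bucket already exists: reuse it
		}
	}
	b := &bucket{next: buckhash[i], hash: hash, size: size}
	buckhash[i] = b
	return b
}

func main() {
	a := insert(3, 64)
	b := insert(10, 128) // 10 % 7 == 3: collides with a, chained via next
	c := insert(3, 64)   // identical site: reuses a instead of allocating
	fmt.Println(a == c, b.next == a) // true true
}
```

The real lookup additionally compares the bucket type and the full stack, as shown later in the stkbucket excerpt.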

Now that the next and allnext pointers of bucket have been explained, let's look at its other fields.

// src/runtime/mprof.go:48
type bucket struct {
	next    *bucket
	allnext *bucket
	typ     bucketType // memBucket or blockBucket (includes mutexProfile)
	hash    uintptr
	size    uintptr
	nstk    uintptr
}

typ is self-explanatory: it records which indicator type the bucket belongs to.

hash is the hash value computed from the stack; it determines the bucket's slot in the buckhash array.

size records the size of the allocation. Only memory buckets carry this value; for the other bucket types it is 0.

nstk records the length of the stack array captured when the event was recorded. Remember the stack traces you saw on the web page in the previous article, golang pprof monitoring series (2) —— memory, block, mutex use?

heap profile: 7: 5536 [110: 2178080] @ heap/1048576
2: 2304 [2: 2304] @ 0x100d7e0ec 0x100d7ea78 0x100d7f260 0x100d7f78c 0x100d811cc 0x100d817d4 0x100d7d6dc 0x100d7d5e4 0x100daba20
#	0x100d7e0eb	+0x8b		/Users/lanpangzi/goproject/src/go/src/runtime/:1881
#	0x100d7ea77	+0x37		/Users/lanpangzi/goproject/src/go/src/runtime/:2207
#	0x100d7f25f	+0x11f		/Users/lanpangzi/goproject/src/go/src/runtime/:2491
#	0x100d7f78b	+0xab		/Users/lanpangzi/goproject/src/go/src/runtime/:2590
#	0x100d811cb	+0x7b	/Users/lanpangzi/goproject/src/go/src/runtime/:3222
#	0x100d817d3	+0x2d3		/Users/lanpangzi/goproject/src/go/src/runtime/:3383
#	0x100d7d6db	runtime.mstart1+0xcb		/Users/lanpangzi/goproject/src/go/src/runtime/:1419
#	0x100d7d5e3	runtime.mstart0+0x73		/Users/lanpangzi/goproject/src/go/src/runtime/:1367
#	0x100daba1f	+0xf		/Users/lanpangzi/goproject/src/go/src/runtime/asm_arm64.s:117

nstk is the length of the recorded stack array. At this point you may be confused: this only records the stack's size, so where is the stack content itself? And where is the record of the allocation details?

To answer this question, you have to figure out how memory is allocated when creating a bucket structure.

First of all, understand that when a struct is allocated, it occupies one contiguous piece of memory. In the bucket structure just introduced, for example, all the fields sit in one contiguous block. Of course, the address a pointer field points to may not be contiguous with the struct's memory, but the pointer itself is stored inside that contiguous block.

Next, let's see how runtime creates a bucket.

// src/runtime/mprof.go:162
func newBucket(typ bucketType, nstk int) *bucket {
	size := unsafe.Sizeof(bucket{}) + uintptr(nstk)*unsafe.Sizeof(uintptr(0))
	switch typ {
	default:
		throw("invalid profile bucket type")
	case memProfile:
		size += unsafe.Sizeof(memRecord{})
	case blockProfile, mutexProfile:
		size += unsafe.Sizeof(blockRecord{})
	}
	b := (*bucket)(persistentalloc(size, 0, &memstats.buckhash_sys))
	bucketmem += size
	b.typ = typ
	b.nstk = uintptr(nstk)
	return b
}

The code above is the source for creating a bucket. persistentalloc is a runtime-internal function for allocating memory; its underlying mechanism is still mmap, and I won't expand on it here. Just know that it allocates a piece of memory, and size is the number of bytes to allocate.

persistentalloc returns an unsafe.Pointer, which the Go compiler allows to represent a pointer to any type, so it can be force-cast to *bucket. The key, then, is how the size of the memory allocated for a bucket is calculated.

First, we take the memory length of the bucket struct itself, then add nstk lengths of the uintptr type. Remember the role of nstk just mentioned? nstk is the length of the stack array, and each element of that array is a uintptr holding a program counter that identifies a position in the stack trace.

Next, the type of bucket being created is examined. If it is the memProfile memory type, unsafe.Sizeof is used to add the space occupied by a memRecord structure to size; if it is blockProfile or mutexProfile, the space occupied by a blockRecord structure is added instead. memRecord and blockRecord carry the detailed information about this memory allocation or this blocking event.

// src/runtime/mprof.go:59
type memRecord struct {
	active memRecordCycle
	future [3]memRecordCycle
}
// src/runtime/mprof.go:120
type memRecordCycle struct {
	allocs, frees           uintptr
	alloc_bytes, free_bytes uintptr
}

The detailed allocation information is ultimately carried by memRecordCycle, which holds the allocated bytes and the number of allocated objects. So what do active and future in memRecord mean, and why not use a single memRecordCycle to represent the allocation details directly? I'll leave that question open for now and answer it below. For the moment, just know that when a memory bucket is allocated, space is also allocated alongside it to record the allocation details.

Then look at blockRecord.

// src/runtime/mprof.go:135
type blockRecord struct {
	count  float64
	cycles int64
}

blockRecord is more concise: count is the number of blocking events, and cycles is the duration of the blocking measured in CPU cycles. For an explanation of cycles, see my previous article, golang pprof monitoring series (2) —— memory, block, mutex use. In short, cycles are the CPU's way of recording duration: you can think of them as a span of time whose unit is not seconds but one CPU clock cycle.

As you can see, the space calculated for a bucket includes not only the bucket struct itself, but also the stack array and the space occupied by the memRecord or blockRecord structure.

You may wonder: if a bucket is allocated this way, how do we get the memRecord or blockRecord back out of it? The answer is to compute the position of memRecord inside the bucket's memory block and cast the resulting pointer.

Take memRecord as an example.

// src/runtime/mprof.go:187
func (b *bucket) mp() *memRecord {
	if b.typ != memProfile {
		throw("bad use of bucket.mp")
	}
	data := add(unsafe.Pointer(b), unsafe.Sizeof(*b)+b.nstk*unsafe.Sizeof(uintptr(0)))
	return (*memRecord)(data)
}

The address calculation above can be translated into the following formula:

starting address of memRecord = address of the bucket + size of the bucket struct + size of the stack array (nstk * size of uintptr)

This formula works because the whole block was allocated as one contiguous piece of memory, so the starting address of memRecord can of course be computed from the bucket's base address plus the lengths of everything in between.
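The whole scheme, one contiguous block holding header, stack array, and trailing record, recovered by the formula above, can be sketched in user code. This is a simplified model: `newBucket`, `stk`, and `mp` mirror the shapes in mprof.go but use a byte slice in place of persistentalloc, and assume the slice's backing array is suitably aligned (true in practice for this demo).

```go
package main

import (
	"fmt"
	"unsafe"
)

// Simplified stand-ins for the runtime's types.
type bucket struct {
	nstk uintptr
}

type memRecord struct {
	allocs, frees uintptr
}

// newBucket allocates one contiguous block: header + nstk uintptrs + memRecord.
func newBucket(nstk int) *bucket {
	size := unsafe.Sizeof(bucket{}) + uintptr(nstk)*unsafe.Sizeof(uintptr(0)) + unsafe.Sizeof(memRecord{})
	buf := make([]byte, size) // stands in for persistentalloc
	b := (*bucket)(unsafe.Pointer(&buf[0]))
	b.nstk = uintptr(nstk)
	return b
}

// stk returns the stack array that sits immediately after the header.
func (b *bucket) stk() []uintptr {
	p := unsafe.Add(unsafe.Pointer(b), unsafe.Sizeof(*b))
	return unsafe.Slice((*uintptr)(p), b.nstk)
}

// mp applies the formula: record = bucket address + sizeof(bucket) +
// nstk*sizeof(uintptr), then a pointer cast.
func (b *bucket) mp() *memRecord {
	p := unsafe.Add(unsafe.Pointer(b), unsafe.Sizeof(*b)+b.nstk*unsafe.Sizeof(uintptr(0)))
	return (*memRecord)(p)
}

func main() {
	b := newBucket(3)
	copy(b.stk(), []uintptr{0x100, 0x200, 0x300})
	b.mp().allocs++ // lands after the stack array, inside the same block
	fmt.Println(b.stk()[2] == 0x300, b.mp().allocs == 1) // true true
}
```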

At this point the bucket structure has been fully described, but we have not yet looked into how indicator information is actually recorded. Let's study the recording details next; the main show begins.

Record indicator details introduction

Since memory-allocation sampling differs a bit from blocking-event sampling, I'll introduce them in two parts. First, let's see how allocation information is recorded when memory is allocated.

memory

As introduced in the previous article, golang pprof monitoring series (2) —— memory, block, mutex use, MemProfileRate controls the sampling frequency of memory allocation: on average, one allocation record is made per MemProfileRate bytes allocated.

When the record condition is triggered, runtime will call mProf_Malloc to record this memory allocation.

// src/runtime/mprof.go:340
func mProf_Malloc(p unsafe.Pointer, size uintptr) {
	var stk [maxStack]uintptr
	nstk := callers(4, stk[:])
	lock(&proflock)
	b := stkbucket(memProfile, size, stk[:nstk], true)
	c := mProf.cycle
	mp := b.mp()
	mpc := &mp.future[(c+2)%uint32(len(mp.future))]
	mpc.allocs++
	mpc.alloc_bytes += size
	unlock(&proflock)
	systemstack(func() {
		setprofilebucket(p, b)
	})
}

Before the actual recording, the stack information is captured: in the code above, stk is an array holding the stack's program counters. Then stkbucket returns the bucket for this allocation. stkbucket checks whether an identical bucket already exists and, if so, returns it directly. Two buckets are considered identical when the allocated size and the stack of the existing bucket match the current ones.
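The runtime-internal callers(4, stk[:]) has a public analog, `runtime.Callers`, which fills a slice with the same kind of uintptr program counters that a bucket's stack array stores. A minimal sketch (`captureStack` is a hypothetical helper for illustration):

```go
package main

import (
	"fmt"
	"runtime"
)

// captureStack fills a slice with program counters, like the runtime does
// into a bucket's trailing stack array.
func captureStack() []uintptr {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(2, pcs) // skip runtime.Callers and captureStack itself
	return pcs[:n]
}

func main() {
	pcs := captureStack()
	fmt.Println(len(pcs) > 0) // true: at least the caller's frame was captured

	// CallersFrames resolves the raw PCs into functions, files, and lines,
	// which is how pprof turns a bucket's stack array back into a trace.
	frames := runtime.CallersFrames(pcs)
	frame, _ := frames.Next()
	fmt.Println(frame.Function) // typically main.main here
}
```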

// src/runtime/mprof.go:229
for b := buckhash[i]; b != nil; b = b.next {
		if b.typ == typ && b.hash == h && b.size == size && eqslice(b.stk(), stk) {
			return b
		}
	}

From the bucket introduction we know buckhash holds all the buckets in the program. A hash is computed from the stack to produce the index value i, the linked list at that slot of buckhash is taken out, and the loop searches it for an identical bucket. If one is found it is returned directly and no new bucket is created.

Back to the main logic of recording an allocation. After stkbucket creates or fetches a bucket, the internal memRecord structure is obtained through the mp() method, and the bytes of this allocation are accumulated into it.

However, it is not memRecord itself that directly hosts the accumulation, but a memRecordCycle inside memRecord's future array.

c := mProf.cycle
	mp := b.mp()
	mpc := &mp.future[(c+2)%uint32(len(mp.future))]
	mpc.allocs++
	mpc.alloc_bytes += size

Here, a memRecordCycle is taken from the future array of the memRecord, and the allocated bytes and the allocation count are accumulated onto it.

Here it is necessary to introduce the functions of active and future in memRecord.

We know memory allocation is a continuous process, while memory is reclaimed periodically by the GC. Go's designers reasoned that if allocation information were published the moment each allocation happened, the allocation curve would be badly skewed: memory that is no longer referenced still counts as in-use until a garbage collection runs (an allocation is only marked freed after GC, i.e. free_bytes in memRecordCycle only grows then), so the curve would keep climbing before each GC and drop sharply after it.

Therefore allocation information is only published after a round of GC. mProf.cycle is the current GC cycle count, incremented by 1 on each GC. When an allocation is recorded, it goes into the future slot indexed by the current cycle plus 2, modulo the length of future. When memory is freed, the record goes into the slot indexed by the current cycle plus 1, modulo the length of future. Think about why the free side only adds 1: by the time an object is freed, the cycle counter has already advanced by 1 relative to the cycle in which the object was allocated, so adding only 1 places the free into the same memRecordCycle as its allocation.
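The slot arithmetic can be checked with a few lines. This is only a model of the indexing, not runtime code:

```go
package main

import "fmt"

// A model of memRecord.future indexing: an allocation in GC cycle c goes to
// slot (c+2)%3; by the time the object is freed, the global cycle has
// advanced to c+1, so the free's slot (c+1 + 1)%3 is the same slot.
func main() {
	const futureLen = 3
	c := uint32(7) // current GC cycle when the allocation happens

	allocSlot := (c + 2) % futureLen

	c++ // a GC runs; the sweep that frees the object sees cycle c+1
	freeSlot := (c + 1) % futureLen

	fmt.Println(allocSlot == freeSlot) // true: alloc and free meet in one memRecordCycle
}
```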

// src/runtime/mprof.go:362
func mProf_Free(b *bucket, size uintptr) {
	lock(&proflock)
	c := mProf.cycle
	mp := b.mp()
	mpc := &mp.future[(c+1)%uint32(len(mp.future))]
	mpc.frees++
	mpc.free_bytes += size
	unlock(&proflock)
}

So recording only ever writes into the future array. How, then, is the allocation information read back out?

Do you remember the active property of type memRecordCycle in memRecord? When reading, the runtime calls the mProf_FlushLocked() method to fold the future data of the current cycle into active.

// src/runtime/mprof.go:59
type memRecord struct {
	active memRecordCycle
	future [3]memRecordCycle
}
// src/runtime/mprof.go:120
type memRecordCycle struct {
	allocs, frees           uintptr
	alloc_bytes, free_bytes uintptr
}
// src/runtime/mprof.go:305
func mProf_FlushLocked() {
	c := mProf.cycle
	for b := mbuckets; b != nil; b = b.allnext {
		mp := b.mp()
		// Flush cycle C into the published profile and clear
		// it for reuse.
		mpc := &mp.future[c%uint32(len(mp.future))]
		mp.active.add(mpc)
		*mpc = memRecordCycle{}
	}
}

The code is easy to follow: it takes the current GC cycle, uses it to pull the allocation information for that cycle out of future, and merges it into active. This is done for every memory bucket in the mbuckets list.

After this flush, reading the current allocation information only reads the data in active. At this point, how the runtime counts the memory indicator has been fully explained.

Next, let’s take a look at how to count block and mutex indicators.

block mutex

Block and mutex statistics are recorded by the same function, saveblockevent, though it handles the two types slightly differently.

Note that mutex records a contention event at unlock time, while for mutex-lock blocking the block profile records the event from when the Lock call starts to block. Besides mutex locks, blocking on select, on channel operations, and on wait groups is also recorded as blocking behavior.

// src/runtime/mprof.go:417
func saveblockevent(cycles, rate int64, skip int, which bucketType) {
	gp := getg()
	var nstk int
	var stk [maxStack]uintptr
	if gp.m.curg == nil || gp.m.curg == gp {
		nstk = callers(skip, stk[:])
	} else {
		nstk = gcallers(gp.m.curg, skip, stk[:])
	}
	lock(&proflock)
	b := stkbucket(which, 0, stk[:nstk], true)
	if which == blockProfile && cycles < rate {
		// Remove sampling bias, see discussion on http://golang.org/cl/299991.
		b.bp().count += float64(rate) / float64(cycles)
		b.bp().cycles += rate
	} else {
		b.bp().count++
		b.bp().cycles += cycles
	}
	unlock(&proflock)
}

First the stack is captured, then the stkbucket() method returns a bucket, then the bp() method retrieves the blockRecord inside the bucket, and the count of events and the blocking duration in cycles are accumulated.

// src/runtime/mprof.go:135
type blockRecord struct {
	count  float64
	cycles int64
}

Note the special treatment of the blockProfile type. Remember the BlockProfileRate parameter from the previous article, golang pprof monitoring series (2) —— memory, block, mutex use? It sets the block sampling rate in nanoseconds (converted to cycles internally). If the blocking duration cycles is less than the rate, the event is only sampled some of the time, using a fastrand check against the rate, so each recorded short event stands in for several similar events that were not sampled. Therefore count is incremented by rate divided by cycles, an estimate of the total number of such blocking events, and rate rather than the raw duration is added to cycles.
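The correction arithmetic can be modeled directly; `recordBlock` here is a hypothetical stand-in for the branch inside saveblockevent (the random sampling decision itself is omitted):

```go
package main

import "fmt"

// recordBlock models the bias correction: short events (cycles < rate) are
// sampled roughly cycles/rate of the time, so each sampled one stands in
// for rate/cycles real events.
func recordBlock(count *float64, totalCycles *int64, cycles, rate int64) {
	if cycles < rate {
		*count += float64(rate) / float64(cycles)
		*totalCycles += rate
	} else {
		*count++
		*totalCycles += cycles
	}
}

func main() {
	var count float64
	var cycles int64
	rate := int64(100)

	recordBlock(&count, &cycles, 25, rate)  // sampled ~1-in-4: counts as 4 events
	recordBlock(&count, &cycles, 250, rate) // long event: always recorded, counts as 1
	fmt.Println(count, cycles)              // 5 350
}
```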

Reading is simple: the count and cycles of each blocking bucket are read out directly.

Summary

At this point, the statistical principles of all three indicator types have been introduced. In short, each memory allocation or blocking event is sampled and recorded into a bucket that carries the stack information, and when the allocation or blocking indicators are read, all the bucket information is read out.

The above is the detailed analysis of the statistical principles behind golang pprof's memory, block, and mutex monitoring. For more information about golang pprof monitoring statistics, please check my other related articles!