
Implementation of mutex in golang

Implementation idea of mutex

A mutex has two main methods: Lock() and Unlock().

Lock() can be implemented with a CAS operation:

type Mutex struct {
	state uint32
}

func (m *Mutex) Lock() {
	for !atomic.CompareAndSwapUint32(&m.state, 0, 1) {
	}
}

func (m *Mutex) Unlock() {
	atomic.StoreUint32(&m.state, 0)
}
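
To sanity check this toy spinlock, here is a small hypothetical usage sketch (the SpinMutex name and the counter program are made up for illustration; this is not the standard library sync.Mutex):

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// SpinMutex is the toy CAS spinlock from above.
type SpinMutex struct {
	state uint32
}

func (m *SpinMutex) Lock() {
	for !atomic.CompareAndSwapUint32(&m.state, 0, 1) {
	}
}

func (m *SpinMutex) Unlock() {
	atomic.StoreUint32(&m.state, 0)
}

func main() {
	var mu SpinMutex
	var wg sync.WaitGroup
	counter := 0

	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++ // protected by the spinlock
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counter) // expect 100
}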

This Lock() keeps retrying the CAS in a loop, which burns CPU. An obvious optimization: if a goroutine cannot grab the lock for a while, park it on a waiting queue. The Unlock() side then has to wake a goroutine from the waiting queue in addition to updating the lock state.

However, this optimization has a problem. When a goroutine is woken from the waiting queue and tries to grab the lock again, the lock may already have been snatched by a newly arrived goroutine, so the woken goroutine can only go back onto the waiting queue, get woken again, possibly fail again... This unlucky goroutine may never acquire the lock, which is starvation.

Starvation causes particularly high tail latency. What is tail latency? In one sentence: the slowest requests are especially slow!

Suppose there are 1,000 goroutines in total and 999 of them grab the lock within 1ms, while the slowest one needs 1s. The average is only about 2ms, but the worst case is 1s; that worst case is the tail latency.
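
A quick sketch of that arithmetic, using the hypothetical numbers from the example above:

package main

import "fmt"

func main() {
	// 999 goroutines wait 1ms for the lock, one unlucky goroutine waits 1000ms.
	waits := make([]float64, 0, 1000)
	for i := 0; i < 999; i++ {
		waits = append(waits, 1.0)
	}
	waits = append(waits, 1000.0)

	sum, max := 0.0, 0.0
	for _, w := range waits {
		sum += w
		if w > max {
			max = w
		}
	}
	// average ≈ 2ms, max = 1000ms: the average hides the tail.
	fmt.Printf("average: %.2fms, max (tail): %.0fms\n", sum/float64(len(waits)), max)
}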

Implementation idea of mutex in golang

➜  go version
go version go1.16.5 darwin/arm64

The Go source code version read here is go1.16.5.

The mutex in the golang standard library avoids starvation. Let's first briefly walk through the locking and unlocking flow; this helps with the subsequent source code reading.

The lock has two modes: normal mode and starvation mode. It starts in normal mode. When a goroutine comes to grab the lock, it first does a CAS; if that succeeds, it returns directly. If the lock is not grabbed, the goroutine spins a certain number of times, waiting for the lock to be released. If the lock still has not been released after spinning, the goroutine is placed on the waiting queue. If a goroutine in the waiting queue fails to grab the lock for too long, the mutex switches from normal mode to starvation mode. In starvation mode, when a goroutine releases the lock, the lock is handed directly to the goroutine at the front of the waiting queue, which prevents starvation.

In addition, golang has a small optimization. While some goroutine is spinning on the lock, Unlock() does not wake a goroutine from the waiting queue, because even if one were woken, it could not beat the spinning goroutine to the lock.

Let’s start reading the source code officially.

The structure of mutex and some const values

type Mutex struct {
	state int32
	sema  uint32
}
const (
	mutexLocked = 1 << iota // mutex is locked
	mutexWoken				
	mutexStarving
	mutexWaiterShift = iota // 3

	// Mutex fairness.
	//
	// Mutex can be in 2 modes of operations: normal and starvation.
	// In normal mode waiters are queued in FIFO order, but a woken up waiter
	// does not own the mutex and competes with new arriving goroutines over
	// the ownership. New arriving goroutines have an advantage -- they are
	// already running on CPU and there can be lots of them, so a woken up
	// waiter has good chances of losing. In such case it is queued at front
	// of the wait queue. If a waiter fails to acquire the mutex for more than 1ms,
	// it switches mutex to the starvation mode.
	//
	// In starvation mode ownership of the mutex is directly handed off from
	// the unlocking goroutine to the waiter at the front of the queue.
	// New arriving goroutines don't try to acquire the mutex even if it appears
	// to be unlocked, and don't try to spin. Instead they queue themselves at
	// the tail of the wait queue.
	//
	// If a waiter receives ownership of the mutex and sees that either
	// (1) it is the last waiter in the queue, or (2) it waited for less than 1 ms,
	// it switches mutex back to normal operation mode.
	//
	// Normal mode has considerably better performance as a goroutine can acquire
	// a mutex several times in a row even if there are blocked waiters.
	// Starvation mode is important to prevent pathological cases of tail latency.
	starvationThresholdNs = 1e6
)

The state of the mutex is maintained in the state field, which has 32 bits.

The high 29 bits record how many goroutines are currently in the waiting queue; call this number waiterCount.

state >> mutexWaiterShift // The value of mutexWaiterShift is 3

The third-lowest bit indicates whether the mutex is currently in starvation mode; call it starvationFlag.

state & mutexStarving

The second-lowest bit indicates whether some goroutine is currently awake and spinning for the lock; call it wokenFlag. "Woken" means awake, i.e. not sleeping on the waiting queue.

state & mutexWoken

The lowest bit indicates whether the lock is currently locked (a bit of a tongue twister, haha); call it lockFlag.

state & mutexLocked

Use a diagram to represent these bits (high bit on the left, low bit on the right):

 0 0 0 0 ... 0 0 | 0              | 0         | 0
   waiterCount   | starvationFlag | wokenFlag | lockFlag
   (29 bits)     | (1 bit)        | (1 bit)   | (1 bit)
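
To make the bit layout concrete, here is a small hypothetical helper (decodeState is made up for illustration; the constants mirror the ones defined above):

package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1: lockFlag
	mutexWoken                   // 2: wokenFlag
	mutexStarving                // 4: starvationFlag
	mutexWaiterShift = iota      // 3: waiterCount lives in the bits above the flags
)

// decodeState splits a mutex state word into its four logical parts.
func decodeState(state int32) (waiterCount int32, starving, woken, locked bool) {
	return state >> mutexWaiterShift,
		state&mutexStarving != 0,
		state&mutexWoken != 0,
		state&mutexLocked != 0
}

func main() {
	// waiterCount = 1, lockFlag = 1  ->  binary ...0000 1001
	state := int32(1<<mutexWaiterShift | mutexLocked)
	fmt.Println(decodeState(state)) // 1 false false true
}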

sema is a semaphore, used to associate the waiting queue with the mutex.

Below, the execution of the code is discussed case by case.

Mutex is not locked, the first coroutine comes to get the lock

func (m *Mutex) Lock() {
	// Fast path: grab unlocked mutex.
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		// ...
		return
	}
	// Slow path (outlined so that the fast path can be inlined)
	m.lockSlow()
}

When the Mutex is not locked, the value of state is 0. When the first goroutine comes to get the lock, state is 0, so the CAS succeeds. After the CAS, state becomes 1 (lockFlag = 1), and Lock() returns without entering lockSlow().

Mutex is only locked by coroutine A, no other coroutine grabs the lock, coroutine A releases the lock

func (m *Mutex) Unlock() {
	// ...

	// Fast path: drop lock bit.
	new := atomic.AddInt32(&m.state, -mutexLocked)
	if new != 0 {
		// Outlined slow path to allow inlining the fast path.
		// To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
		m.unlockSlow(new)
	}
}

Continuing from above, the value of state is 1. After AddInt32(&m.state, -1), state becomes 0 (lockFlag = 0), new is 0, and Unlock() returns.

Mutex has been locked by coroutine A, coroutine B comes to get the lock

func (m *Mutex) Lock() {
	// Fast path: grab unlocked mutex.
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		// ...
		return
	}
	// Slow path (outlined so that the fast path can be inlined)
	m.lockSlow()
}

Because the value of state is not 0, CompareAndSwapInt32 will return false, so it will enter lockSlow()

lockSlow()

First, let's look at the lockSlow() method in full:

func (m *Mutex) lockSlow() {
	var waitStartTime int64
	starving := false
	awoke := false
	iter := 0
	old := m.state
	for {
		// Don't spin in starvation mode, ownership is handed off to waiters
		// so we won't be able to acquire the mutex anyway.
		if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
			// Active spinning makes sense.
			// Try to set mutexWoken flag to inform Unlock
			// to not wake other blocked goroutines.
			if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
				atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
				awoke = true
			}
			runtime_doSpin()
			iter++
			old = m.state
			continue
		}
		new := old
		// Don't try to acquire starving mutex, new arriving goroutines must queue.
		if old&mutexStarving == 0 {
			new |= mutexLocked
		}
		if old&(mutexLocked|mutexStarving) != 0 {
			new += 1 << mutexWaiterShift
		}
		// The current goroutine switches mutex to starvation mode.
		// But if the mutex is currently unlocked, don't do the switch.
		// Unlock expects that starving mutex has waiters, which will not
		// be true in this case.
		if starving && old&mutexLocked != 0 {
			new |= mutexStarving
		}
		if awoke {
			// The goroutine has been woken from sleep,
			// so we need to reset the flag in either case.
			if new&mutexWoken == 0 {
				throw("sync: inconsistent mutex state")
			}
			new &^= mutexWoken
		}
		if atomic.CompareAndSwapInt32(&m.state, old, new) {
			if old&(mutexLocked|mutexStarving) == 0 {
				break // locked the mutex with CAS
			}
			// If we were already waiting before, queue at the front of the queue.
			queueLifo := waitStartTime != 0
			if waitStartTime == 0 {
				waitStartTime = runtime_nanotime()
			}
			runtime_SemacquireMutex(&m.sema, queueLifo, 1)
			starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
			old = m.state
			if old&mutexStarving != 0 {
				// If this goroutine was woken and mutex is in starvation mode,
				// ownership was handed off to us but mutex is in somewhat
				// inconsistent state: mutexLocked is not set and we are still
				// accounted as waiter. Fix that.
				if old&(mutexLocked|mutexWoken) != 0 || old>>mutexWaiterShift == 0 {
					throw("sync: inconsistent mutex state")
				}
				delta := int32(mutexLocked - 1<<mutexWaiterShift)
				if !starving || old>>mutexWaiterShift == 1 {
					// Exit starvation mode.
					// Critical to do it here and consider wait time.
					// Starvation mode is so inefficient, that two goroutines
					// can go lock-step infinitely once they switch mutex
					// to starvation mode.
					delta -= mutexStarving
				}
				atomic.AddInt32(&m.state, delta)
				break
			}
			awoke = true
			iter = 0
		} else {
			old = m.state
		}
	}

	if race.Enabled {
		race.Acquire(unsafe.Pointer(m))
	}
}

Step 1: doSpin (idle)

After entering the for loop, the first check is performed:

for {
    // Don't spin in starvation mode, ownership is handed off to waiters
    // so we won't be able to acquire the mutex anyway.
    if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
        // Active spinning makes sense.
        // Try to set mutexWoken flag to inform Unlock
        // to not wake other blocked goroutines.
        if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
            atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
            awoke = true
        }
        runtime_doSpin()
        iter++
        old = m.state
        continue
    }
    // ...
}

runtime_canSpin(iter) decides, based on the value of iter, whether it is worth spinning. (The implementation of this method is shown later.)

For the first few iterations, iter is small, so runtime_canSpin(iter) returns true. Therefore the check

if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter)

passes. However, because old>>mutexWaiterShift == 0 (waiterCount = 0), the inner condition is not satisfied, so the CAS and awoke = true are not executed.

Then runtime_doSpin() is executed. runtime_doSpin() just burns some CPU time in an empty loop, and then continue starts the next iteration. (The concrete implementation of runtime_doSpin is also shown later.)

Note that this code does not grab the lock; it waits for the lock to become free. It idles for a bounded number of iterations, hoping that the lock will be released by its holder during that time.
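
To make this spin-then-queue idea concrete, here is a simplified, hypothetical lock (not the standard library implementation): it spins a few times hoping the holder releases the lock, and otherwise blocks on a channel that plays the role of the waiting queue.

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// spinThenBlockMutex is a toy illustration only.
type spinThenBlockMutex struct {
	state  uint32        // 0 = unlocked, 1 = locked
	wakeup chan struct{} // buffered; a token means "a waiter may retry"
}

func newMutex() *spinThenBlockMutex {
	return &spinThenBlockMutex{wakeup: make(chan struct{}, 1)}
}

func (m *spinThenBlockMutex) Lock() {
	for {
		for i := 0; i < 4; i++ { // bounded spinning, in the spirit of active_spin = 4
			if atomic.CompareAndSwapUint32(&m.state, 0, 1) {
				return
			}
		}
		<-m.wakeup // "park" until an Unlock signals, then retry
	}
}

func (m *spinThenBlockMutex) Unlock() {
	atomic.StoreUint32(&m.state, 0)
	select {
	case m.wakeup <- struct{}{}: // wake at most one parked goroutine
	default: // a wakeup token is already pending
	}
}

func main() {
	m := newMutex()
	var wg sync.WaitGroup
	n := 0
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.Lock()
			n++
			m.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(n) // expect 50
}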

runtime_doSpin()

// src/runtime/lock_sema.go
const active_spin_cnt = 30

// src/runtime/proc.go
//go:linkname sync_runtime_doSpin sync.runtime_doSpin
//go:nosplit
func sync_runtime_doSpin() {
	procyield(active_spin_cnt)
}

# src/runtime/asm_amd64.s
TEXT runtime·procyield(SB),NOSPLIT,$0-0
	MOVL	cycles+0(FP), AX
again:
	PAUSE
	SUBL	$1, AX
	JNZ	again
	RET

procyield() executes the PAUSE instruction in a loop.

runtime_canSpin()

The implementation of runtime_canSpin() is in src/runtime/proc.go. It contains several checks, but for our purposes the i >= active_spin check is the one that matters.

const active_spin = 4

// Active spinning for sync.Mutex.
//go:linkname sync_runtime_canSpin sync.runtime_canSpin
//go:nosplit
func sync_runtime_canSpin(i int) bool {
	// sync.Mutex is cooperative, so we are conservative with spinning.
	// Spin only few times and only if running on a multicore machine and
	// GOMAXPROCS>1 and there is at least one other running P and local runq is empty.
	// As opposed to runtime mutex we don't do passive spinning here,
	// because there can be work on global runq or on other Ps.
	if i >= active_spin || ncpu <= 1 || gomaxprocs <= int32(sched.npidle+sched.nmspinning)+1 {
		return false
	}
	if p := getg().m.p.ptr(); !runqempty(p) {
		return false
	}
	return true
}

A small episode

When debugging with breakpoints, I found there was no way to watch some of the global variables referenced in sync_runtime_canSpin(), such as active_spin and ncpu. So I did something crude: I modified the source and declared several local variables inside the function, so the values of the globals could be seen by watching the locals (clever, right? haha).

func sync_runtime_canSpin(i int) bool {
	local_active_spin := active_spin
	local_ncpu := ncpu
	local_gomaxprocs := gomaxprocs
	npidle := sched.npidle
	nmspinning := sched.nmspinning
	if i >= local_active_spin || local_ncpu <= 1 || local_gomaxprocs <= int32(npidle+nmspinning)+1 {
		return false
	}
	if p := getg().m.p.ptr(); !runqempty(p) {
		return false
	}
	return true
}

Step 2: Calculate the new state based on the old state

new := old
// Don't try to acquire starving mutex, new arriving goroutines must queue.
if old&mutexStarving == 0 {
    new |= mutexLocked
}
if old&(mutexLocked|mutexStarving) != 0 {
    new += 1 << mutexWaiterShift
}
// The current goroutine switches mutex to starvation mode.
// But if the mutex is currently unlocked, don't do the switch.
// Unlock expects that starving mutex has waiters, which will not
// be true in this case.
if starving && old&mutexLocked != 0 {
    new |= mutexStarving
}
if awoke {
    // The goroutine has been woken from sleep,
    // so we need to reset the flag in either case.
    // ...
    new &^= mutexWoken
}

This piece of code calculates the new state from the old state, using four operations:

  • set lockFlag: new |= mutexLocked
  • Add waiterCount:new += 1 << mutexWaiterShift
  • set starvationFlag: new |= mutexStarving
  • clear wokenFlag: new &^= mutexWoken

Since we are only discussing the case "Mutex has been locked by coroutine A, coroutine B comes to get the lock", there are two sub-cases (a worked sketch follows the list):

  • case1: During the spinning in step 1 the lock was released, so old state = 000000...000 (all bits are 0). After the four operations above, only lockFlag is set to 1.
  • case2: After the spinning in step 1 the lock has still not been released, i.e. old state = 00000000...001 (only lockFlag is 1). After the four operations, waiterCount = 1 and lockFlag is still 1.
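
A minimal sketch of that bit arithmetic for the two cases (starving and awoke are both false here; the constants mirror the ones defined earlier):

package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// computeNew mimics the first two of the four operations in step 2
// (the starving and awoke branches do not fire in this scenario).
func computeNew(old int32) int32 {
	new := old
	if old&mutexStarving == 0 {
		new |= mutexLocked // try to set lockFlag
	}
	if old&(mutexLocked|mutexStarving) != 0 {
		new += 1 << mutexWaiterShift // register as a waiter
	}
	return new
}

func main() {
	fmt.Printf("case1: old=%04b -> new=%04b\n", 0, computeNew(0))                     // 0000 -> 0001
	fmt.Printf("case2: old=%04b -> new=%04b\n", mutexLocked, computeNew(mutexLocked)) // 0001 -> 1001
}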

Step 3: Update state (Get lock)

if atomic.CompareAndSwapInt32(&m.state, old, new) {
    if old&(mutexLocked|mutexStarving) == 0 {
        break // locked the mutex with CAS
    }
    // ...
} else {
    old = m.state
}

This step uses a CAS operation to update state to the new state we just calculated. If the CAS succeeds and old was in the unlocked state, break exits the loop and Lock() returns (this is case1 above). If the CAS fails, old is refreshed from m.state and the next iteration repeats steps one, two and three.

If it is case2, the situation will be a little more complicated

if atomic.CompareAndSwapInt32(&m.state, old, new) {
    // ...
    // If we were already waiting before, queue at the front of the queue.
    queueLifo := waitStartTime != 0
    if waitStartTime == 0 {
        waitStartTime = runtime_nanotime()
    }

    runtime_SemacquireMutex(&m.sema, queueLifo, 1)
    // ...
}

The goroutine mainly puts itself onto the waiting queue through runtime_SemacquireMutex(); after that the runtime no longer schedules it until it is woken.

I won't dig into the implementation of runtime_SemacquireMutex() for now; chasing it all the way down would be endless.

Mutex is locked by coroutine A. Coroutine B comes to grab the lock, fails, and is put into the waiting queue. At this time, coroutine A releases the lock

func (m *Mutex) Unlock() {
	// Fast path: drop lock bit.
	new := atomic.AddInt32(&m.state, -mutexLocked)
	if new != 0 {
		// Outlined slow path to allow inlining the fast path.
		// To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
		m.unlockSlow(new)
	}
}

Continuing from before, the initial state value is 0000...0001001 (waiterCount = 1, lockFlag = 1). After AddInt32(&m.state, -mutexLocked) executes, it becomes 0000...0001000 (waiterCount = 1), new also has the value 0000...0001000, and execution enters unlockSlow().

unlockSlow()

Let's look at unlockSlow() in full:

func (m *Mutex) unlockSlow(new int32) {
	if (new+mutexLocked)&mutexLocked == 0 {
		throw("sync: unlock of unlocked mutex")
	}
	if new&mutexStarving == 0 {
		old := new
		for {
			// If there are no waiters or a goroutine has already
			// been woken or grabbed the lock, no need to wake anyone.
			// In starvation mode ownership is directly handed off from unlocking
			// goroutine to the next waiter. We are not part of this chain,
			// since we did not observe mutexStarving when we unlocked the mutex above.
			// So get off the way.
			if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
				return
			}
			// Grab the right to wake someone.
			new = (old - 1<<mutexWaiterShift) | mutexWoken
			if atomic.CompareAndSwapInt32(&m.state, old, new) {
				runtime_Semrelease(&m.sema, false, 1)
				return
			}
			old = m.state
		}
	} else {
		// Starving mode: handoff mutex ownership to the next waiter, and yield
		// our time slice so that the next waiter can start to run immediately.
		// Note: mutexLocked is not set, the waiter will set it after wakeup.
		// But mutex is still considered locked if mutexStarving is set,
		// so new coming goroutines won't acquire it.
		runtime_Semrelease(&m.sema, true, 1)
	}
}

At this point old >> mutexWaiterShift = 0000...0001 ≠ 0, so it does not return directly.

old := new
for {
    // If there are no waiters or a goroutine has already
    // been woken or grabbed the lock, no need to wake anyone.
    // In starvation mode ownership is directly handed off from unlocking
    // goroutine to the next waiter. We are not part of this chain,
    // since we did not observe mutexStarving when we unlocked the mutex above.
    // So get off the way.
    if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
        return
    }
    // Grab the right to wake someone.
    new = (old - 1<<mutexWaiterShift) | mutexWoken
    if atomic.CompareAndSwapInt32(&m.state, old, new) {
        runtime_Semrelease(&m.sema, false, 1)
        return
    }
    old = m.state
}

Then calculate new = (0000...1000 - 0000...1000) | mutexWoken = 0000...0010: waiterCount drops from 1 to 0 and wokenFlag is set. Then the CAS is performed, and if it succeeds, a goroutine is woken from the waiting queue.
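
A quick check of that arithmetic (constants as defined earlier):

package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

func main() {
	old := int32(1 << mutexWaiterShift) // 0b1000: one waiter, unlocked
	new := (old - 1<<mutexWaiterShift) | mutexWoken
	fmt.Printf("old=%04b new=%04b\n", old, new) // old=1000 new=0010
}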

Mutex is locked by coroutine A. Coroutine B comes to grab the lock but fails and is put into the waiting queue. At this time, coroutine A releases the lock and coroutine B is awakened.

Let's switch our attention to the second half of lockSlow().

const starvationThresholdNs = 1e6
if atomic.CompareAndSwapInt32(&m.state, old, new) {
    // ...
    runtime_SemacquireMutex(&m.sema, queueLifo, 1)
    starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
    old = m.state
    // ...
    iter = 0
}

When coroutine B wakes up from runtime_SemacquireMutex, it decides whether it is starving based on how long it has waited. For now assume it is not starving; the starving case is discussed in detail later. iter is reset to 0 and the next iteration of the loop runs; because iter is 0 again, runtime_canSpin(iter) will return true.

Since the lock is now free, the goroutine can grab it without obstruction in the next loop iteration.

Unlocking behavior in the case of starvation: the role of starvationFlag

Imagine this situation: goroutine A holds the lock, and goroutine B fails to grab it and is placed on the waiting queue. A releases the lock and B is woken, but just as B is about to grab the lock, it is snatched by a new goroutine C... This repeats several times in a row: whenever B tries to grab the lock, some other goroutine takes it first. Until, at some point, B is woken yet again and executes

starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs

and finds it has been waiting for more than 1ms, so it is going into starvation mode!

// The current goroutine switches mutex to starvation mode.
// But if the mutex is currently unlocked, don't do the switch.
// Unlock expects that starving mutex has waiters, which will not
// be true in this case.
if starving && old&mutexLocked != 0 {
    new |= mutexStarving
}

Then the starvation flag is written into state via the CAS below, and the goroutine puts itself back onto the waiting queue.

atomic.CompareAndSwapInt32(&m.state, old, new)

Unlock()

Now switch the view to Unlock().

func (m *Mutex) Unlock() {
	// ...
	// Fast path: drop lock bit.
	new := atomic.AddInt32(&m.state, -mutexLocked)
	if new != 0 {
		// Outlined slow path to allow inlining the fast path.
		// To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
		m.unlockSlow(new)
	}
}

func (m *Mutex) unlockSlow(new int32) {
	// ...
	if new&mutexStarving == 0 {
		// ...
		for {
			// ...
			if atomic.CompareAndSwapInt32(&m.state, old, new) {
				runtime_Semrelease(&m.sema, false, 1)
				return
			}
			// ...
		}
	} else {
		// Starving mode: handoff mutex ownership to the next waiter, and yield
		// our time slice so that the next waiter can start to run immediately.
		// Note: mutexLocked is not set, the waiter will set it after wakeup.
		// But mutex is still considered locked if mutexStarving is set,
		// so new coming goroutines won't acquire it.
		runtime_Semrelease(&m.sema, true, 1)
	}
}

In unlockSlow(), this time new&mutexStarving != 0, so it goes straight into the else branch and calls runtime_Semrelease(). Note that the arguments of runtime_Semrelease() in the else branch differ from those in the if branch: here runtime_Semrelease(&m.sema, true, 1) wakes the first goroutine in the waiting queue and schedules it immediately (runtime_Semrelease() is explained in more detail later).

At the same time, as the comment says, because Unlock() already performed atomic.AddInt32(&m.state, -mutexLocked), lockFlag is 0. That doesn't matter: starvationFlag is 1, so the mutex is still considered locked.

Lock()

func (m *Mutex) Lock() {
	// ...
	m.lockSlow()
}

func (m *Mutex) lockSlow() {
	// ...
	for {
		// ...
		if atomic.CompareAndSwapInt32(&m.state, old, new) {
			// ...
			runtime_SemacquireMutex(&m.sema, queueLifo, 1)
			// ...
			old = m.state
			if old&mutexStarving != 0 {
				// If this goroutine was woken and mutex is in starvation mode,
				// ownership was handed off to us but mutex is in somewhat
				// inconsistent state: mutexLocked is not set and we are still
				// accounted as waiter. Fix that.
				// ...
				delta := int32(mutexLocked - 1<<mutexWaiterShift)
				if !starving || old>>mutexWaiterShift == 1 {
					// Exit starvation mode.
					// Critical to do it here and consider wait time.
					// Starvation mode is so inefficient, that two goroutines
					// can go lock-step infinitely once they switch mutex
					// to starvation mode.
					delta -= mutexStarving
				}
				atomic.AddInt32(&m.state, delta)
				break
			}
			awoke = true
			iter = 0
		} else {
			old = m.state
		}
	}
	// ...
}

Switching the perspective to Lock(): after the starving goroutine is woken and scheduled, it first executes old = m.state; at this point the starvationFlag of old is 1.

Then, as the comment says, it tries to fix the "inconsistent" state.

The repair work mainly does three things:

  • Unlock() in starvation mode does not decrement waiterCount, so here waiterCount must be reduced by 1

  • Set the lockFlag of state to 1

  • If this goroutine is not starving, or it is the last waiter in the queue, clear the starvationFlag

These three things are done in one step via atomic.AddInt32(&m.state, delta), as the sketch below shows.
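
A hypothetical worked example of that delta (constants as defined earlier; assume the woken goroutine sees old with waiterCount = 1, starvationFlag = 1, lockFlag = 0):

package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

func main() {
	// old: starvationFlag = 1, waiterCount = 1, lockFlag = 0  ->  0b1100
	old := int32(mutexStarving | 1<<mutexWaiterShift)

	// set lockFlag and remove ourselves from the waiter count
	delta := int32(mutexLocked - 1<<mutexWaiterShift) // 1 - 8 = -7

	// we are the last waiter, so also exit starvation mode
	if old>>mutexWaiterShift == 1 {
		delta -= mutexStarving // -7 - 4 = -11
	}

	// 0b1100 + (-11) = 0b0001: locked, no waiters, back to normal mode
	fmt.Printf("old=%04b delta=%d new=%04b\n", old, delta, old+delta)
}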

runtime_Semrelease()

// Semrelease atomically increments *s and notifies a waiting goroutine
// if one is blocked in Semacquire.
// It is intended as a simple wakeup primitive for use by the synchronization
// library and should not be used directly.
// If handoff is true, pass count directly to the first waiter.
// skipframes is the number of frames to omit during tracing, counting from
// runtime_Semrelease's caller.
func runtime_Semrelease(s *uint32, handoff bool, skipframes int)

handoff means handing off. When handoff is false, only the first goroutine in the waiting queue is woken, but it is not scheduled immediately; when handoff is true, the woken goroutine is scheduled immediately. In addition, with handoff = true the woken goroutine inherits the current goroutine's time slice. As a concrete example, suppose each goroutine's time slice is 2ms and goroutine A has already run for 1ms; if A wakes goroutine B via runtime_Semrelease(handoff = true), then B's remaining time slice is 2 - 1 = 1ms.

The locking behavior of new goroutines in starvation mode: the role of starvationFlag

If the mutex is in starvation mode and a new goroutine comes to request the lock, it goes through the following steps:

func (m *Mutex) lockSlow() {
	// ...
	old := m.state
	for {
		// Don't spin in starvation mode, ownership is handed off to waiters
		// so we won't be able to acquire the mutex anyway.
		if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
			// ...
			runtime_doSpin()
		}
		new := old
		// Don't try to acquire starving mutex, new arriving goroutines must queue.
		if old&mutexStarving == 0 {
			new |= mutexLocked
		}
		if old&(mutexLocked|mutexStarving) != 0 {
			new += 1 << mutexWaiterShift
		}
		// ...
		if atomic.CompareAndSwapInt32(&m.state, old, new) {
			// ..
			runtime_SemacquireMutex(&m.sema, queueLifo, 1)
			// ...
		} else {
			// ...
		}
	}
	// ...
}

Because old&(mutexLocked|mutexStarving) != mutexLocked, it will not spin.

Because old&mutexStarving != 0, it will not set lockFlag.

Because old&(mutexLocked|mutexStarving) != 0, it will increment waiterCount.

As you can see, all it really does is increment waiterCount, update state via CAS, and then go to sleep on the waiting queue.

Therefore, in starvation mode, new goroutines competing for the lock do not try to grab lockFlag; they simply register themselves (waiterCount + 1) and obediently join the waiting queue.

Unlocking behavior when coroutines are spinning: the role of wokenFlag

wokenFlag is set in lockSlow(). When wokenFlag is 1, it means some goroutine is awake and spinning for the lock at this moment.

func (m *Mutex) lockSlow() {
	starving := false
	awoke := false
	iter := 0
	old := m.state
	for {
		// Don't spin in starvation mode, ownership is handed off to waiters
		// so we won't be able to acquire the mutex anyway.
		if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
			// Active spinning makes sense.
			// Try to set mutexWoken flag to inform Unlock
			// to not wake other blocked goroutines.
			if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
				atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
				awoke = true
			}
			runtime_doSpin()
			iter++
			old = m.state
			continue
		}
		// ...
		if atomic.CompareAndSwapInt32(&m.state, old, new) {
			// ...
			runtime_SemacquireMutex(&m.sema, queueLifo, 1)
			// ...
			awoke = true
			iter = 0
		}
		// ...
	}
	// ...
}

When a new goroutine (one that has never been on the waiting queue) spins for the first time, the logic that sets wokenFlag is:

if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
	atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
    awoke = true
}

However, for a goroutine woken from the waiting queue, you won't find the logic that sets wokenFlag anywhere in lockSlow(). Why? Because that logic lives in unlockSlow().

Switching over to the unlockSlow() side:

func (m *Mutex) unlockSlow(new int32) {
	// ...
	if new&mutexStarving == 0 {
		old := new
		for {
			// If there are no waiters or a goroutine has already
			// been woken or grabbed the lock, no need to wake anyone.
			// In starvation mode ownership is directly handed off from unlocking
			// goroutine to the next waiter. We are not part of this chain,
			// since we did not observe mutexStarving when we unlocked the mutex above.
			// So get off the way.
			if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
				// When wokenFlag is already set, return directly:
				// do not wake any goroutine from the waiting queue.
				return
			}
			// Grab the right to wake someone.
			// This is where wokenFlag gets set.
			new = (old - 1<<mutexWaiterShift) | mutexWoken
			if atomic.CompareAndSwapInt32(&m.state, old, new) {
				runtime_Semrelease(&m.sema, false, 1)
				return
			}
			old = m.state
		}
	} else {
		// ...
	}
}

So when some goroutine is spinning (wokenFlag = 1), no goroutine is woken from the waiting queue, which spares the queued goroutines a pointless competition. Of course, the spinning goroutines still compete with each other. If wokenFlag = 0, a goroutine is woken from the waiting queue, and wokenFlag is set to 1 before the wakeup, so the woken goroutine does not need to set it itself. Neat!

Why not wake a goroutine from the waiting queue while another goroutine is spinning? Because it takes time for a woken goroutine to actually be scheduled onto a CPU, and by then the mutex would already have been snatched by the goroutine that is spinning.

If a goroutine that was woken from the waiting queue still fails to grab the lock, will it be placed at the head or the tail of the queue?

At the head. The code is as follows:

// If we were already waiting before, queue at the front of the queue.
queueLifo := waitStartTime != 0
if waitStartTime == 0 {
    waitStartTime = runtime_nanotime()
}
runtime_SemacquireMutex(&m.sema, queueLifo, 1)

Complex scenario analysis

Based on the logic above, let's walk through a more complex scenario!

Suppose a group of goroutines g1, g2, g3, ... compete for a lock m.

At the beginning g1 gets the lock

owner: g1 waitqueue: null

g2 tries to grab the lock, fails, and is put into the waiting queue

owner: g1 waitqueue: g2

g1 releases the lock, g2 is awakened from the waiting queue

owner: null waitqueue: null

Meanwhile g3 also grabs the lock; g2 loses the race and is put back onto the waiting queue

owner: g3 waitqueue: g2

g4 tries to grab the lock, fails, and is put into the waiting queue

owner: g3 waitqueue: g2, g4

g3 releases the lock, g2 is awakened

owner: null waitqueue: g4

Meanwhile g5 grabs the lock; g2 loses again and is put back at the front of the waiting queue

owner: g5 waitqueue: g2, g4

g6 starts to grab the lock, spinning

owner: g5 waitqueue: g2, g4 wokenFlag: 1 spinning: g6

g5 releases the lock. Since a goroutine is spinning at this moment, no goroutine is woken from the waiting queue, and g6 grabs the lock easily

owner: g6 waitqueue: g2, g4 wokenFlag: 0 spinning: null

g6 releases the lock and g2 is woken, but g7 grabs the lock first; g2 is put back at the front of the waiting queue, and because it has failed to get the lock for too long, it switches the mutex into starvation mode

owner: g7 waitqueue: g2 (starving), g4 starvationFlag: 1

g8 comes to grab the lock. Since the mutex is in starvation mode, g8 is placed directly at the tail of the waiting queue

owner: g7 waitqueue: g2 (starving), g4, g8 starvationFlag: 1

g7 releases the lock, and because the mutex is in starvation mode, it directly wakes g2 and hands the lock to it

owner: g2 waitqueue: g4, g8 starvationFlag: 1

g2 finishes and releases the lock. Note that the mutex is still in starvation mode, so g4 is woken and scheduled directly. After waking up, g4 finds that it itself is not starving, so it clears starvationFlag

owner: g4 waitqueue: g8 starvationFlag: 0

From this point on, newly arriving goroutines and the queued g8 compete for the lock under the normal-mode locking and unlocking logic again.
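
As a closing illustration, here is a small hypothetical program (all names made up) that hammers a standard sync.Mutex from many goroutines and reports the average and worst lock-wait time. Thanks to starvation mode, the worst case should stay bounded instead of one unlucky goroutine waiting arbitrarily long.

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var (
		mu      sync.Mutex
		statsMu sync.Mutex
		wg      sync.WaitGroup
		total   time.Duration
		worst   time.Duration
	)
	const workers, iters = 64, 1000

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < iters; j++ {
				start := time.Now()
				mu.Lock()
				wait := time.Since(start) // how long we waited for the lock
				for k := 0; k < 1000; k++ {
					_ = k // pretend to do some work in the critical section
				}
				mu.Unlock()

				statsMu.Lock()
				total += wait
				if wait > worst {
					worst = wait
				}
				statsMu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Printf("avg wait: %v, worst wait: %v\n", total/time.Duration(workers*iters), worst)
}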

A small flaw: a very marginal starvation case

Since a goroutine in the waiting queue only checks its waiting time (and thus whether to switch to starvation mode) after it has been woken, a goroutine may wait in the queue for a very long time: it is effectively starving, but because it is never woken it never gets a chance to set starvationFlag, and so it starves.

So can it really happen that goroutines in the waiting queue are never woken?

Yes! In unlockSlow(), if wokenFlag = 1, no goroutine in the waiting queue is woken. So in the (admittedly extreme) situation where, every time Unlock() runs, some new goroutine happens to be spinning, the goroutines in the waiting queue could starve forever!

References

Tail Latency Might Matter More Than You Think

Golang mutex internal implementation
