SoFunction
Updated on 2025-03-05

A brief analysis of a lock-wait problem in Golang

Problem description:

A URL request was sent to the backend, but it hung: it never returned and simply stayed stuck.

Problem analysis and localization:

At first I thought it was the network or some other odd cause; after all, the same request had worked fine before.

At this point it is worth asking what has changed: the environment, the network, the program version, or something else.

I then retried several times and got the same result every time. It occurred to me that I had changed the number of pods to 2, so I guessed that the request was waiting on a lock or was deadlocked.

Get the debug information with the following commands (redirect the output to a file of your choice):
curl "127.0.0.1:43411/debug/pprof/goroutine?debug=1" > [output file]
curl "127.0.0.1:43411/debug/pprof/goroutine?debug=2" > [output file]

Searching the dump for the method name of the stuck request turned up several matches. The header goroutine 17002 [semacquire, 5 minutes]: means that goroutine 17002 is waiting to acquire a lock and has been waiting for 5 minutes.

There is also one goroutine in the runnable state that is executing the same method. Presumably all the others are waiting for the lock it holds, so the code really does have a lock problem.
However, the top of that goroutine's call stack is SetDataNodeCarry. Looking inside that method shows that it also takes a lock, but only to guard a single field update.
Analyzing the code that uses each of these locks showed that none of them overlap. Each lock only ensures the atomicity and consistency of its own data, so a deadlock should not occur.

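Reading the code near SetDataNodeCarry, I found a loop of the form for availCarryCount < needSelectedNum {}. I had not looked at it carefully before, but it could be an infinite loop. After analyzing what had changed in the environment, it turned out that in the new environment this loop could never exit. Because that goroutine never leaves the loop, it never releases the lock it acquired earlier in cDP, and every later request blocks waiting for that lock.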

Summary

When you run into a problem like this, first, do not panic; second, do not avoid the problem.

Record as much of the scene as possible so that it can be reconstructed later, and debug based on the current state. Never assume that a restart will make things fine; you cannot escape the problem that way.

The subsequent steps to fix it were straightforward, and the problem itself is not very difficult.

------------Record

1 @ 0x47fc42 0x80fb0d 0x811270 0x80b86e 0x7ba758 0x7e0686 0x7f24fa 0x81d712 0x6ff9b4 0x7018b6 0x702c88 0x6fe971 0x469581
#0x47fc41 sync.(*RWMutex).Unlock+0xb1     /usr/local/go/src/sync/:113
#0x80fb0c [go file path]m.(*Pod).SetDataNodeCarry+0x6c /go/src/[go file path]m/:777
#0x81126f [go file path]+0x5f /go/src/[go file path]m/:1059
#0x80b86d [go file path]m.(*t).ctcpd+0x83d /go/src/[go file path]m/:453
#0x7ba757 [go file path]m.(*c).cDP+0x1e7 /go/src/[go file path]m/:558
#0x7e0685 [go file path]m.(*m).cDP+0x375 /go/src/[go file path]m/handle_admin.go:353
#0x7f24f9 [go file path]m.(*m).ServeHTTP+0x1659 /go/src/[go file path]m/http_server.go:188
#0x81d711 [go file path]m.(*m).handlerWithInterceptor.func1+0x81 /go/src/[go file path]m/http_server.go:160
#0x6ff9b3 net/+0x43    /usr/local/go/src/net/http/:1995
#0x7018b5 net/http.(*ServeMux).ServeHTTP+0x1d5    /usr/local/go/src/net/http/:2375
#0x702c87 net/+0xa7    /usr/local/go/src/net/http/:2774
#0x6fe970 net/http.(*conn).serve+0x850     /usr/local/go/src/net/http/:1878


2 @ 0x43c20f 0x44c609 0x44c5df 0x44c37d 0x47ecb9 0x7ba6ad 0x7e0686 0x7f24fa 0x81d712 0x6ff9b4 0x7018b6 0x702c88 0x6fe971 0x469581
#0x44c37c sync.runtime_SemacquireMutex+0x3c   /usr/local/go/src/runtime/:71
#0x47ecb8 sync.(*Mutex).Lock+0x108    /usr/local/go/src/sync/:134
#0x7ba6ac [go file path]m.(*c).cDP+0x13c /go/src/[go file path]m/:554
#0x7e0685 [go file path]m.(*m).cDP+0x375 /go/src/[go file path]m/handle_admin.go:353
#0x7f24f9 [go file path]m.(*m).ServeHTTP+0x1659 /go/src/[go file path]m/http_server.go:188
#0x81d711 [go file path]m.(*m).handlerWithInterceptor.func1+0x81 /go/src/[go file path]m/http_server.go:160
#0x6ff9b3 net/+0x43   /usr/local/go/src/net/http/:1995
#0x7018b5 net/http.(*ServeMux).ServeHTTP+0x1d5   /usr/local/go/src/net/http/:2375
#0x702c87 net/+0xa7   /usr/local/go/src/net/http/:2774
#0x6fe970 net/http.(*conn).serve+0x850    /usr/local/go/src/net/http/:1878

goroutine 13994 [runnable]:
sync.(*Mutex).Lock(0xc002300468)
	/usr/local/go/src/sync/:72 +0x2c9
sync.(*RWMutex).Lock(0xc002300468)
	/usr/local/go/src/sync/:93 +0x2d
[goFile path]m.(*Pod).SetDataNodeCarry(0xc0023003f0, 0x4024000000000000)
	/go/src/[goFile path]m/:774 +0x36
[goFile path](0xc003b17810, 0x2, 0x2, 0x0, 0x3)
	/go/src/[goFile path]m/:1059 +0x60
[goFile path]m.(*t).ctcpd(0xc0000ef090, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x6, 0xc002275958, ...)
	/go/src/[goFile path]m/:453 +0x83e
[goFile path]m.(*c).cDP(0xc00223c000, 0xc00286e673, 0xe, 0xc00286e65f, 0x6, 0x0, 0x0, 0x0)
	/go/src/[goFile path]m/:558 +0x1e8
[goFile path]m.(*m).cDP(0xc0009aca90, 0xabfd80, 0xc0031a1260, 0xc00296be00)
	/go/src/[goFile path]m/handle_admin.go:353 +0x376
[goFile path]m.(*m).ServeHTTP(0xc0009aca90, 0xabfd80, 0xc0031a1260, 0xc00296be00)
	/go/src/[goFile path]m/http_server.go:188 +0x165a
[goFile path]m.(*m).handlerWithInterceptor.func1(0xabfd80, 0xc0031a1260, 0xc00296be00)
	/go/src/[goFile path]m/http_server.go:160 +0x82
net/(0xc000990300, 0xabfd80, 0xc0031a1260, 0xc00296be00)
	/usr/local/go/src/net/http/:1995 +0x44
net/http.(*ServeMux).ServeHTTP(0x10a3520, 0xabfd80, 0xc0031a1260, 0xc00296be00)
	/usr/local/go/src/net/http/:2375 +0x1d6
net/(0xc0000dad00, 0xabfd80, 0xc0031a1260, 0xc00296be00)
	/usr/local/go/src/net/http/:2774 +0xa8
net/http.(*conn).serve(0xc003b19860, 0xac1d80, 0xc0009a9e40)
	/usr/local/go/src/net/http/:1878 +0x851
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/:2884 +0x2f4


goroutine 17002 [semacquire, 5 minutes]:
sync.runtime_SemacquireMutex(0xc0026fc460, 0x419400)
	/usr/local/go/src/runtime/:71 +0x3d
sync.(*Mutex).Lock(0xc0026fc45c)
	/usr/local/go/src/sync/:134 +0x109
[goFile path]m.(*c).cDP(0xc00223c000, 0xc00002a083, 0xe, 0xc00002a06f, 0x6, 0x0, 0x0, 0x0)
	/go/src/[goFile path]m/:554 +0x13d
[goFile path]m.(*m).cDP(0xc0009aca90, 0xabfd80, 0xc0031a0620, 0xc002988900)
	/go/src/[goFile path]m/handle_admin.go:353 +0x376
[goFile path]m.(*m).ServeHTTP(0xc0009aca90, 0xabfd80, 0xc0031a0620, 0xc002988900)
	/go/src/[goFile path]m/http_server.go:188 +0x165a
[goFile path]m.(*m).handlerWithInterceptor.func1(0xabfd80, 0xc0031a0620, 0xc002988900)
	/go/src/[goFile path]m/http_server.go:160 +0x82
net/(0xc000990300, 0xabfd80, 0xc0031a0620, 0xc002988900)
	/usr/local/go/src/net/http/:1995 +0x44
net/http.(*ServeMux).ServeHTTP(0x10a3520, 0xabfd80, 0xc0031a0620, 0xc002988900)
	/usr/local/go/src/net/http/:2375 +0x1d6
net/(0xc0000dad00, 0xabfd80, 0xc0031a0620, 0xc002988900)
	/usr/local/go/src/net/http/:2774 +0xa8
net/http.(*conn).serve(0xc003b18640, 0xac1d80, 0xc00298e940)
	/usr/local/go/src/net/http/:1878 +0x851
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/:2884 +0x2f4



goroutine 14532 [semacquire, 11 minutes]:
sync.runtime_SemacquireMutex(0xc0026fc460, 0x419400)
	/usr/local/go/src/runtime/:71 +0x3d
sync.(*Mutex).Lock(0xc0026fc45c)
	/usr/local/go/src/sync/:134 +0x109
[goFile path]m.(*c).cDP(0xc00223c000, 0xc003f74303, 0xe, 0xc003f742ef, 0x6, 0x0, 0x0, 0x0)
	/go/src/[goFile path]m/:554 +0x13d
[goFile path]m.(*m).cDP(0xc0009aca90, 0xabfd80, 0xc00454a620, 0xc004544800)
	/go/src/[goFile path]m/handle_admin.go:353 +0x376
[goFile path]m.(*m).ServeHTTP(0xc0009aca90, 0xabfd80, 0xc00454a620, 0xc004544800)
	/go/src/[goFile path]m/http_server.go:188 +0x165a
[goFile path]m.(*m).handlerWithInterceptor.func1(0xabfd80, 0xc00454a620, 0xc004544800)
	/go/src/[goFile path]m/http_server.go:160 +0x82
net/(0xc000990300, 0xabfd80, 0xc00454a620, 0xc004544800)
	/usr/local/go/src/net/http/:1995 +0x44
net/http.(*ServeMux).ServeHTTP(0x10a3520, 0xabfd80, 0xc00454a620, 0xc004544800)
	/usr/local/go/src/net/http/:2375 +0x1d6
net/(0xc0000dad00, 0xabfd80, 0xc00454a620, 0xc004544800)
	/usr/local/go/src/net/http/:2774 +0xa8
net/http.(*conn).serve(0xc002e17540, 0xac1d80, 0xc002a74780)
	/usr/local/go/src/net/http/:1878 +0x851
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/:2884 +0x2f4
