A detailed explanation of how to troubleshoot and locate deadlocks in Java

1. Service deadlocks on Linux

Microservice architecture has become a common choice for building large application systems because of its scalability, flexibility and ease of maintenance. When microservices are deployed on Linux servers, however, we sometimes run into deadlocks. A deadlock acts like a "pause button" on a microservice: the affected threads stop making progress, the service can no longer respond normally, and the availability and stability of the system suffer, with adverse effects on the business.

For example, in an e-commerce system, the order microservice and the inventory microservice may access shared database resources at the same time. Suppose the order microservice first acquires the lock on the order table when processing an order and then tries to acquire the lock on the inventory table to update the inventory, while the inventory microservice first acquires the lock on the inventory table when processing an inventory adjustment and then tries to acquire the lock on the order table to associate the order information. When these two operations run concurrently, a deadlock may occur: orders cannot be created and inventory cannot be updated. Users are left waiting while placing an order, which seriously affects the shopping experience.

In such a situation it is important to detect and resolve the deadlock quickly and accurately. The JPS (Java Virtual Machine Process Status Tool) and Jstack (Java Stack Trace) commands are like two sharp "swords" that provide strong support for troubleshooting microservice deadlocks in Linux environments. Next, let's take a closer look at how to use these two commands to troubleshoot deadlock problems.

2. Understanding deadlocks: causes and scenarios

(I) Causes of deadlock

Insufficient system resources: The resources in a system are limited. When multiple threads or processes compete for them and there are not enough to satisfy every request, a deadlock may result. For example, in a multi-threaded database application, multiple threads request connections from the connection pool at the same time. If the pool size is limited and all connections are in use, new requests block. If the blocked threads also hold other resources and do not release them while waiting, a deadlock may occur.
Improper execution order: If threads or processes request and release resources in an inconsistent order while running, a deadlock may also result. For example, thread A first obtains resource X and then tries to obtain resource Y, while thread B first obtains resource Y and then tries to obtain resource X. If the two threads run concurrently, each can end up waiting for the other, resulting in a deadlock.
Improper resource allocation: An unreasonable resource allocation algorithm, or an error during allocation, may also cause deadlocks. For example, in a distributed system, processes on different nodes may fail to coordinate the allocation of shared resources, so that some processes obtain too many resources while others cannot obtain the ones they need, which in turn causes a deadlock.

(II) Necessary conditions for deadlock

A deadlock can only occur when all four of the following conditions hold at the same time.

Mutual exclusion condition: A resource can be used by only one thread or process at a time. Any other thread or process that wants to use it must wait until it is released. For example, a printer serves only one process at a time, and other processes must wait until the current print job is finished.
Hold and wait condition: A thread or process keeps the resources it has already acquired while requesting new ones. For example, thread A has obtained resource X and does not release it when requesting resource Y. If resource Y is held by another thread, thread A blocks but still holds resource X.
No preemption condition: A resource held by a thread or process cannot be forcibly taken away by others before the holder is finished with it. It can only be released by the holder itself. For example, once a thread obtains a file write lock, other threads cannot take that lock until the thread completes the write operation and releases it.
Circular wait condition: A set of threads or processes form a closed chain in which each waits for a resource held by the next. For example, thread A waits for thread B to release resource Y, thread B waits for thread C to release resource Z, and thread C waits for thread A to release resource X, forming a cycle of waits that results in deadlock.
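The circular wait condition can be checked mechanically by following "waits for" links until a thread repeats. The sketch below is only an illustration of that idea: the WaitForGraph class and its method names are hypothetical, and it assumes each thread waits for at most one other thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: each thread waits for at most one other thread,
// so the wait-for graph is a map from a waiter to the thread it waits on.
class WaitForGraph {
    private final Map<String, String> waitsFor = new HashMap<>();

    // Record that `waiter` is blocked on a resource held by `holder`.
    void addWait(String waiter, String holder) {
        waitsFor.put(waiter, holder);
    }

    // Follow the chain of waits from `start`; reaching a thread twice means a cycle.
    boolean hasCycleFrom(String start) {
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null) {
            if (!seen.add(current)) {
                return true; // already visited: circular wait
            }
            current = waitsFor.get(current);
        }
        return false; // the chain ended at a thread that is not waiting
    }

    public static void main(String[] args) {
        WaitForGraph graph = new WaitForGraph();
        graph.addWait("A", "B");
        graph.addWait("B", "C");
        System.out.println(graph.hasCycleFrom("A")); // false
        graph.addWait("C", "A");
        System.out.println(graph.hasCycleFrom("A")); // true
    }
}
```

The second call reports a cycle because the chain A, B, C leads back to A, which is exactly the three-thread example described above.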

(III) Typical deadlock scenarios

Multiple threads request each other's resources: This is one of the most common deadlock scenarios. Suppose there are two threads T1 and T2. T1 holds resource R1 and tries to obtain resource R2, which is held by T2. At the same time, T2 holds R2 and tries to obtain R1, which is held by T1. Each side waits for the other to release the resource it needs, so both are deadlocked. For example, in a graphics drawing program, thread T1 draws the outline of a figure, holds the brush resource R1, and needs the pigment resource R2 to draw the fill color, while thread T2 fills the color, holds the pigment resource R2, and needs the brush resource R1 to draw the outline. If they run at the same time, a deadlock may occur.
A thread requests a new resource while holding others: When a thread that already holds some resources requests a new one that is occupied by another thread, and does not release what it already holds, a deadlock may result. For example, a thread processing a transaction has acquired some database locks. When it needs more locks to complete the transaction and other threads hold them, the thread waits without releasing the locks it already holds. If those other threads are in turn waiting for the locks it holds, a deadlock occurs.

3. The tools: introduction to JPS and Jstack

(I) Detailed explanation of JPS command

The JPS command is a tool provided by the Java Development Kit (JDK). Its main purpose is to list information about JVM processes (Java virtual machine processes). When troubleshooting service deadlocks, it is how we obtain the ID of the target Java process. The JPS command displays the process ID (PID) of each running Java program together with related information such as the name of the program's main class.

The basic syntax of the JPS command is: jps [ options ] [ hostid ], where options selects what is displayed and hostid specifies a remote host to query. If you run jps without any options, it lists the process IDs and main class names of the Java processes on the current system that it can access. For example, if you open a terminal on a Linux system and execute the jps command, you may get the following output:

12345 MainClass
12346 Jps

In the above output, 12345 is the ID of the Java process that runs MainClass, and 12346 is the ID of the jps process itself.

Common options are:

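-l: Displays the full package name of the application's main class, or the full path to the application's JAR file. For example, if you execute jps -l, the output may be (the package name com.example is only illustrative):

12345 com.example.MainClass
12346 sun.tools.jps.Jps

This way we can see the full class name corresponding to each Java process more clearly.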

-m: Displays the arguments passed to the main method. Execute jps -m, the output example is as follows:

12345 MainClass --param1 value1 --param2 value2
12346 Jps -m

Through this option, we can see the program arguments passed in when the Java process was started.

-v: Displays the arguments passed to the JVM. Execute jps -v, the output may be:

12345 MainClass -Xmx512m -Xms256m -XX:MaxPermSize=256m
12346 Jps -Dapplication.home=/usr/local/jdk1.8.0_291 -Xms8m

This helps us view the JVM configuration parameters of the Java process.

-q: Displays only the process IDs and suppresses the class name, JAR file name and arguments. Execute jps -q, and the output is similar to:

12345
12346

This method is very simple and efficient when you only need to obtain the process ID.
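To confirm that an ID reported by jps really belongs to your application, the application can log its own process ID at startup. A minimal sketch, assuming Java 9 or later (ProcessHandle is not available on the JDK 8 shown in the sample output); the class name ShowPid is hypothetical:

```java
// Requires Java 9 or later: ProcessHandle was added in Java 9.
class ShowPid {
    // Returns the operating-system process ID of the current JVM.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        // The value printed here should match the ID that jps lists for this process.
        System.out.println("PID: " + currentPid());
    }
}
```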

(II) Detailed explanation of Jstack command

Jstack is a stack trace tool that ships with the JDK. It generates a thread snapshot (thread dump) of a Java virtual machine at the current moment, that is, the collection of method call stacks that each thread in that JVM is executing. By analyzing this snapshot, we can locate the reason why a thread has stalled for a long time, such as a deadlock between threads, an infinite loop, or a long wait on an external resource. When troubleshooting microservice deadlocks, the Jstack command plays a key role: it shows the state of each thread and its method calls, which helps us find the root cause of the deadlock.

The basic syntax of the Jstack command is: jstack [ options ] pid , where options is an optional parameter and pid is the Java process ID to be analyzed. Common options are:

-l: Long listing. Prints additional information about locks, such as the java.util.concurrent ownable synchronizers (for example a ReentrantLock) owned by each thread. When we suspect a deadlock in a microservice, use this option to obtain more detailed lock-related information, such as:

jstack -l 12345

After executing the above command, the output result will contain detailed information about the lock held by each thread and the lock waiting to be acquired, which is very helpful in determining whether there is a deadlock and the specific situation of the deadlock.

-m: Mixed mode. Prints both Java stack frames and native C/C++ stack frames. If the Java program calls native methods, this option also shows their stacks, which helps when analyzing the problem as a whole. The command example is as follows:

jstack -m 12345

-F: Forces a stack dump when the process does not respond to jstack [-l] pid. If the target Java process is hung and a normal request gets no reply, the -F option forces the thread stack information to be output, for example:

jstack -F 12345

The thread stack information obtained through the Jstack command contains rich content, such as the thread's status (RUNNABLE, BLOCKED, WAITING, etc.), the method being executed by the thread, the method call stack, and the lock holding and waiting situation. This information is crucial for us to troubleshoot deadlock problems and can help us accurately locate where and why the deadlock occurs.
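The same kind of information can also be collected from inside a running JVM through the standard java.lang.management API. The following is a minimal sketch, not a replacement for jstack: the class name InProcessThreadDump and the one-line summary format are our own choices for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class InProcessThreadDump {
    // Builds one line per live thread: name, state, and the lock it is waiting for (if any).
    static String summarize() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // true, true: also collect locked monitors and locked ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting for ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                sb.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            sb.append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summarize());
    }
}
```

Each line shows the thread state (RUNNABLE, BLOCKED, WAITING and so on), and for a blocked thread the lock it is waiting for and the thread that owns it.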

4. Practical drill: steps for diagnosing a deadlock

(I) Reproducing deadlock: Java code example

Below is a Java code example that causes deadlocks. This code allows you to clearly see how threads compete for resources and ultimately lead to deadlocks.

public class DeadLockExample {
    private static final Object lock1 = new Object();
    private static final Object lock2 = new Object();

    public static void main(String[] args) {
        Thread thread1 = new Thread(() -> {
            synchronized (lock1) {
                System.out.println("Thread 1: Holding lock1");
                try {
                    // Hold lock1 for a while so that thread 2 has a chance to get lock2
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println("Thread 1: Waiting for lock2");
                synchronized (lock2) {
                    System.out.println("Thread 1: Holding lock1 and lock2");
                }
            }
        });

        Thread thread2 = new Thread(() -> {
            synchronized (lock2) {
                System.out.println("Thread 2: Holding lock2");
                try {
                    // Hold lock2 for a while so that thread 1 has a chance to get lock1
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println("Thread 2: Waiting for lock1");
                synchronized (lock1) {
                    System.out.println("Thread 2: Holding lock1 and lock2");
                }
            }
        });

        thread1.start();
        thread2.start();
    }
}

In this code, thread1 first gets lock1 and then sleeps for 1 second, during which thread2 has the chance to get lock2. Next, thread1 tries to obtain lock2 while thread2 tries to obtain lock1. Each thread holds the lock the other needs and neither releases it, so a deadlock forms.
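For comparison, here is a sketch of the same two-lock program with the circular wait removed: both threads acquire lock1 before lock2, so neither can hold one lock while waiting for the other in the opposite order. The class name OrderedLockExample and the 5-second join timeout are assumptions made for this illustration.

```java
class OrderedLockExample {
    private static final Object lock1 = new Object();
    private static final Object lock2 = new Object();

    // Every caller takes lock1 first and lock2 second, so no cycle of waits can form.
    static void doWork(String name) {
        synchronized (lock1) {
            synchronized (lock2) {
                System.out.println(name + ": Holding lock1 and lock2");
            }
        }
    }

    // Runs two threads and reports whether both finished within the timeout.
    static boolean runBoth() {
        Thread thread1 = new Thread(() -> doWork("Thread 1"));
        Thread thread2 = new Thread(() -> doWork("Thread 2"));
        thread1.start();
        thread2.start();
        try {
            thread1.join(5000);
            thread2.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread1.isAlive() && !thread2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Both threads finished: " + runBoth());
    }
}
```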

(II) Use JPS to find the process ID

Package the above Java code into an executable JAR file and deploy it to a Linux server. Start it with java -jar followed by the name of your JAR file (YOUR_JAR_FILE.jar below is a placeholder):

java -jar YOUR_JAR_FILE.jar

After running, open a new terminal and execute the jps command:

jps

The output may be as follows:

12345 DeadLockExample
12346 Jps

Here 12345 is the ID of the Java process that runs the DeadLockExample class. This ID is needed in the next step.

(III) Use Jstack to analyze thread stack

After obtaining the Java process ID, use the jstack command to analyze the thread stack information to locate the deadlock. Execute the command as follows:

jstack -l 12345

The -l option outputs additional lock information, which is very helpful for analyzing deadlocks. The command prints a large amount of thread stack information, and we focus on the parts related to the deadlock. A possible output (simplified to highlight the key points) is:

Found one Java-level deadlock:
=============================
"Thread-1":
  waiting to lock monitor 0x00007f85a8003ae8 (object 0x00000007d6aa2c98, a java.lang.Object),
  which is held by "Thread-0"
"Thread-0":
  waiting to lock monitor 0x00007f85a8006168 (object 0x00000007d6aa2ca8, a java.lang.Object),
  which is held by "Thread-1"

Java stack information for the threads listed above:
===================================================
"Thread-1":
        at DeadLockExample.lambda$main$1(DeadLockExample.java:22)
        - waiting to lock <0x00000007d6aa2c98> (a java.lang.Object)
        - locked <0x00000007d6aa2ca8> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)

"Thread-0":
        at DeadLockExample.lambda$main$0(DeadLockExample.java:12)
        - waiting to lock <0x00000007d6aa2ca8> (a java.lang.Object)
        - locked <0x00000007d6aa2c98> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.

From the output, we can see that Thread-1 is waiting to acquire the lock held by Thread-0 (0x00000007d6aa2c98), while Thread-0 is waiting to acquire the lock held by Thread-1 (0x00000007d6aa2ca8), which forms a deadlock. The stack frames also show where in the code each thread is blocked, such as DeadLockExample.java:22 and DeadLockExample.java:12 in this sample (the exact line numbers depend on the layout of your source file). This is the key clue for further troubleshooting and solving the deadlock problem.
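The deadlock report that jstack prints can also be obtained programmatically with ThreadMXBean.findDeadlockedThreads(), which is useful for automated health checks. The following is a sketch under our own assumptions: the class name DeadlockDetector, the thread names worker-A and worker-B, and the polling interval are all hypothetical. The two worker threads are daemon threads so that the JVM can still exit even though they stay deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class DeadlockDetector {
    // Names of the threads currently involved in a deadlock (empty if there is none).
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no thread is deadlocked
        List<String> names = new ArrayList<>();
        if (ids != null) {
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        return names;
    }

    // Polls until a deadlock is detected or the timeout expires.
    static List<String> waitForDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = deadlockedThreadNames();
        }
        return names;
    }

    // Starts two daemon threads that take the same two locks in opposite order.
    static void startDeadlock() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        spawn("worker-A", a, b, bothHoldFirst);
        spawn("worker-B", b, a, bothHoldFirst);
    }

    private static void spawn(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds `second`
                }
            }
        }, name);
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) {
        startDeadlock();
        System.out.println("Deadlocked threads: " + waitForDeadlock(5000));
    }
}
```

A monitoring thread could call deadlockedThreadNames() periodically and raise an alert when the list is not empty, instead of waiting for someone to run jstack by hand.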

5. Summary and Outlook

During this deadlock investigation, we first reproduced the deadlock problem through a simple Java code example, and then quickly and accurately obtained the target Java process ID with the help of JPS commands, laying the foundation for subsequent analysis work. Then, the thread stack was analyzed using the Jstack command and the key information of the deadlock was successfully found, including the deadlock thread, the held lock, and the lock waiting to be acquired, so as to clearly locate the source of the deadlock.

The deadlock problem is extremely harmful to the normal operation of the system. It will not only cause service interruption and affect user experience, but may also cause waste of resources and degradation of system performance. Therefore, it is crucial to avoid deadlocks when developing and deploying services. To prevent deadlocks, we can take a variety of measures. During the code writing stage, it is necessary to ensure that all threads acquire locks in the same order, avoid the use of nested locks, and reduce the lock holding time. In terms of resource allocation, rationally plan the use and allocation of resources to avoid excessive competition and unreasonable allocation of resources. At the same time, some advanced synchronization tools can be used, such as ReentrantLock, Semaphore, etc., which provide more flexible synchronization control and help reduce the risk of deadlocks. In addition, it is also very necessary to conduct regular performance testing and deadlock detection on the system to promptly detect and resolve potential deadlock problems.
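As one illustration of the ReentrantLock approach mentioned above, a thread can use tryLock with a timeout and give up, releasing whatever it already holds, instead of waiting forever. This breaks the hold and wait condition. The sketch below is only an example of the pattern: the class name TryLockExample, the method withBothLocks and the 100 ms timeout are hypothetical choices.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TryLockExample {
    // Runs `action` while holding both locks. Returns false, holding nothing,
    // if either lock cannot be acquired within the timeout.
    static boolean withBothLocks(ReentrantLock first, ReentrantLock second,
                                 long timeoutMillis, Runnable action) {
        try {
            if (!first.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
            try {
                if (!second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false; // `first` is released by the finally block below
                }
                try {
                    action.run();
                    return true;
                } finally {
                    second.unlock();
                }
            } finally {
                first.unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        boolean ok = withBothLocks(lockA, lockB, 100,
                () -> System.out.println("Holding both locks"));
        System.out.println("Acquired: " + ok);
    }
}
```

When the method returns false, the caller can retry later, ideally after a short random delay so that two competing threads do not keep colliding.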

With the continuous development of architecture and the increasing complexity of application scenarios, deadlock problems may arise in more concealed and complex forms. In the future, we need to constantly explore and study new deadlock detection and prevention technologies, combine emerging technologies such as artificial intelligence and big data analysis to achieve intelligent prediction and automatic processing of deadlock problems, and further improve the stability and reliability of microservice systems.
