What is multithreading
A thread is the smallest unit of execution that the operating system can schedule. It lives inside a process, and a process can contain multiple threads that run concurrently. That is what multithreading means.
For a Python program that needs to handle several tasks at the same time, there are two approaches: multiprocessing and multithreading. In Python, multithreading is mainly implemented through the threading module, while multiprocessing is mainly implemented through the multiprocessing module.
The main difference between the two modules is that threading is based on threads, while multiprocessing is based on processes. The threading module uses shared memory, so all threads share the same variables (you will see this in the later examples), whereas multiprocessing is based on child processes, each with its own independent variables and data structures. Because of this difference, threading is mostly used for I/O-intensive tasks (for example, reading multiple tables), and multiprocessing is better suited to CPU-intensive tasks that involve heavy computation (matrix operations, image processing).
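To make the shared-variable point concrete, here is a minimal sketch. The names `shared_results` and `record` are made up for illustration. Every thread appends to the same list object, and the main thread sees all of the changes afterwards.

```python
from threading import Thread

shared_results = []  # one list object, visible to every thread in the process

def record(value):
    # Each thread appends to the same shared list
    shared_results.append(value)

threads = [Thread(target=record, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Sort before printing because the append order depends on scheduling
print(sorted(shared_results))
```

With child processes created by multiprocessing, each process would work on its own copy of the list instead.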
Note that because of the GIL (Global Interpreter Lock) in CPython, the standard Python interpreter, only one thread can execute Python bytecode at any given moment within a process. This is why the threading module is a poor fit for CPU-intensive tasks: no matter how many threads are opened, they take turns on the interpreter, so the total time does not change much. Blocking I/O calls release the GIL while they wait, which is why threads still help with I/O-intensive tasks. The multiprocessing module starts multiple processes, each with its own interpreter and its own GIL, so it can handle CPU-intensive tasks faster.
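As a rough illustration of why threads suit waiting-heavy work, the sketch below uses `time.sleep` as a stand-in for a blocking I/O call (an assumption made for the demo; `fake_io` is a made-up name). Four waits of 0.2s run in four threads and finish in roughly 0.2s overall rather than 0.8s.

```python
import time
from threading import Thread

def fake_io(seconds):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting
    time.sleep(seconds)

start = time.perf_counter()
threads = [Thread(target=fake_io, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"4 waits of 0.2s took {elapsed:.2f}s in total")
```

The exact figure depends on the machine, so treat the timing as indicative only.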
I won't go into the details of the GIL and the multiprocessing module here. This article mainly introduces how to use the threading module to implement multithreading.
Thread full life cycle
The full life cycle of a thread consists of five states: new, ready, running, blocked, and dead.
- New: a new thread object has been created
- Ready: after the start method is called, the thread waits to run. When it actually starts running depends on scheduling
- Running: the thread is executing
- Blocked: a running thread has been suspended. Possible reasons include, but are not limited to, the program calling the sleep method or calling a blocking I/O method. The blocked thread waits until it is unblocked and then runs again
- Dead: the thread has finished executing or exited abnormally; the thread object can then be destroyed and its memory released
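The life cycle can be observed roughly with `is_alive()`. This is only a sketch: `is_alive()` cannot tell ready, running, and blocked apart. It only reports whether the thread has been started and has not yet finished. The function name `work` is made up for illustration.

```python
import time
from threading import Thread

def work():
    time.sleep(0.2)  # the thread is blocked while it sleeps

t = Thread(target=work)
states = []

states.append(t.is_alive())  # New: created but not started yet
t.start()
states.append(t.is_alive())  # Ready, running, or blocked: started but not finished
t.join()
states.append(t.is_alive())  # Dead: run() has returned

print(states)  # [False, True, False]
```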
Main thread and child thread
The multithreading discussed here actually means starting multiple child threads from the main thread, where the main thread is the thread in which the Python interpreter runs our script. All child threads and the main thread belong to the same process. When no child thread is added, only the main thread runs by default, and it executes the code we wrote from beginning to end. The relationship between the main thread and child threads comes up again later in this article.
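A small sketch of this relationship, using `threading.current_thread()` and `threading.main_thread()` from the standard library. The dictionary `info` and the thread name "worker" are made up for illustration.

```python
import threading

info = {}

def child():
    me = threading.current_thread()
    info["child_name"] = me.name
    # A child thread is never the main thread
    info["child_is_main"] = me is threading.main_thread()

t = threading.Thread(target=child, name="worker")
t.start()
t.join()

print(threading.main_thread().name)  # usually "MainThread"
print(info)
```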
Let’s not talk about so many concepts, let’s get to the topic right now!
Example 1 - Create thread object directly using Thread
The basic syntax for creating a new thread with the Thread class is as follows:
Newthread = Thread(target=function, args=(argument1, argument2, ...))
- Newthread: the created thread object
- function: the function to execute
- argument1, argument2: the arguments passed to the function, given as a tuple
Suppose we have a task called task (it can be replaced with any other task; it is only a placeholder in this example) that prints a letter once per second. We use two child threads to print the letters a and b at the same time, as shown below:
"""
<case1: Create threads directly using the Thread class in threading>
"""
from threading import Thread
import time
from time import sleep

# Custom function; it can be replaced with any other function
def task(threadName, number, letter):
    print(f"【Thread start】{threadName}")
    m = 0
    while m < number:
        sleep(1)
        m += 1
        current_time = time.strftime('%H:%M:%S', time.localtime())
        print(f"[{current_time}] {threadName} output {letter}")
    print(f"【Thread End】{threadName}")

thread1 = Thread(target=task, args=("thread_1", 4, "a"))  # Thread 1: run the task and print a 4 times
thread2 = Thread(target=task, args=("thread_2", 2, "b"))  # Thread 2: run the task and print b 2 times

thread1.start()  # Thread 1 starts
thread2.start()  # Thread 2 starts

thread1.join()  # Wait for thread 1 to end
thread2.join()  # Wait for thread 2 to end
Its output is:
【Thread start】thread_1
【Thread start】thread_2
[13:42:00] thread_1 output a
[13:42:00] thread_2 output b
[13:42:01] thread_1 output a
[13:42:01] thread_2 output b
【Thread End】thread_2
[13:42:02] thread_1 output a
[13:42:03] thread_1 output a
【Thread End】thread_1
thread1 and thread2 start at the same time. thread2 prints b twice and ends, while thread1 keeps printing a until it has finished.
Example 2 - Block threads using join
In the previous example, the last two statements are thread1.join() and thread2.join(). They make the main thread wait until all child threads have finished. That said, since the child threads we create are foreground threads by default (this concept is covered later), the program waits for all child threads to finish before exiting even without the join statements.
The join method blocks the main thread until the joined thread finishes. By choosing where to call join in the main thread, we can therefore control the order in which child threads run. With that in mind, let's look at the next example.
"""
<case2: Blocking with the join method>
"""
from threading import Thread
import time
from time import sleep

# Custom function; it can be replaced with any other function
def task(threadName, number, letter):
    print(f"【Thread start】{threadName}")
    m = 0
    while m < number:
        sleep(1)
        m += 1
        current_time = time.strftime('%H:%M:%S', time.localtime())
        print(f"[{current_time}] {threadName} output {letter}")
    print(f"【Thread End】{threadName}")

thread1 = Thread(target=task, args=("thread_1", 6, "a"))  # Thread 1: the task is to print a 6 times
thread2 = Thread(target=task, args=("thread_2", 4, "b"))  # Thread 2: the task is to print b 4 times
thread3 = Thread(target=task, args=("thread_3", 2, "c"))  # Thread 3: the task is to print c 2 times

thread1.start()  # Thread 1 starts
thread2.start()  # Thread 2 starts
thread2.join()   # Wait for thread 2
thread3.start()  # Thread 3 starts only after thread 2 has completed
thread1.join()   # Wait for thread 1 to complete
thread3.join()   # Wait for thread 3 to complete
Its output is:
【Thread start】thread_1
【Thread start】thread_2
[13:44:20] thread_2 output b
[13:44:20] thread_1 output a
[13:44:21] thread_2 output b
[13:44:21] thread_1 output a
[13:44:22] thread_2 output b
[13:44:22] thread_1 output a
[13:44:23] thread_2 output b
【Thread End】thread_2
[13:44:23] thread_1 output a
【Thread start】thread_3
[13:44:24] thread_3 output c
[13:44:24] thread_1 output a
[13:44:25] thread_1 output a
[13:44:25] thread_3 output c
【Thread End】thread_3
【Thread End】thread_1
As the output shows, the main thread waits for thread2 to complete its task, so thread3 does not start until thread2 has ended.
Here thread1 prints a 6 times, thread2 prints b 4 times, and thread3 prints c 2 times. The workload of thread1 equals the combined workload of thread2 and thread3, so the whole program can be viewed as thread1 running concurrently with thread2 followed by thread3.
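One way to check this reading is to time the whole run. The sketch below repeats the same start and join order with a scaled-down step of 0.1s (the constant `UNIT` is an assumption for the demo, not part of the original example). The total comes out close to 6 steps rather than the 12 steps a fully sequential run would need.

```python
import time
from threading import Thread

UNIT = 0.1  # hypothetical scaled-down "second" so the sketch finishes quickly

def task(steps):
    for _ in range(steps):
        time.sleep(UNIT)

thread1 = Thread(target=task, args=(6,))
thread2 = Thread(target=task, args=(4,))
thread3 = Thread(target=task, args=(2,))

start = time.perf_counter()
thread1.start()
thread2.start()
thread2.join()   # the main thread blocks here until thread2 is done
thread3.start()  # so thread3 starts only after thread2 has finished
thread1.join()
thread3.join()
elapsed = time.perf_counter() - start

print(f"elapsed: {elapsed:.2f}s")
```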
Example 3 - Rewrite the parent class creation thread
Examples 1 and 2 showed how to import the Thread class directly to create threads and how to use the join method. That way of creating threads, however, relies on the default behavior of the Thread class and has its limitations. Example 3 explores how to inherit from the parent class and override its methods to create child threads.
As in Example 1, assume that multiple threads are needed to process task1: thread1 prints a 4 times (4s) and thread2 prints b 2 times (2s), as follows:
"""
<case3: Create threads by overriding the parent class>
"""
import threading
import time
from time import sleep

# myThread inherits from the parent class threading.Thread and overrides it
class myThread(threading.Thread):
    # Override the parent class constructor
    def __init__(self, number, letter):
        threading.Thread.__init__(self)
        self.number = number  # Add number variable
        self.letter = letter  # Add letter variable

    # Override the run function of the parent class
    def run(self):
        print(f"【Thread start】{self.name}")
        task1(self.name, self.number, self.letter)
        print(f"【Thread End】{self.name}")

    # Override the parent class destructor
    def __del__(self):
        print(f"【Thread destruction and release memory】{self.name}")

# Custom function; it can be replaced with any other task you want to run in multiple threads
def task1(threadName, number, letter):
    m = 0
    while m < number:
        sleep(1)
        m += 1
        current_time = time.strftime('%H:%M:%S', time.localtime())
        print(f"[{current_time}] {threadName} output {letter}")

# def task2...
# def task3...

thread1 = myThread(4, "a")  # Create thread1: the task takes 4s
thread2 = myThread(2, "b")  # Create thread2: the task takes 2s

thread1.start()  # Start thread 1
thread2.start()  # Start thread 2

thread1.join()  # Wait for thread 1
thread2.join()  # Wait for thread 2
The output is:
【Thread start】Thread-1
【Thread start】Thread-2
[10:37:58] Thread-1 output a
[10:37:58] Thread-2 output b
[10:37:59] Thread-1 output a
[10:37:59] Thread-2 output b
【Thread End】Thread-2
[10:38:00] Thread-1 output a
[10:38:01] Thread-1 output a
【Thread End】Thread-1
【Thread destruction and release memory】Thread-1
【Thread destruction and release memory】Thread-2
From the output, we can clearly see the whole course of the two concurrent tasks from start to finish, followed by both thread objects being destroyed and their memory released, which nicely reflects the complete life cycle of a thread. Note that __del__ runs when the thread object is garbage-collected (here, as the program exits), so its exact timing is not guaranteed.
The final effect is the same as in Example 1, but inheriting from the parent class and overriding its methods lets us define parameters and thread tasks more freely, and it also gives us a deeper understanding of the threading module.
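One thing the subclass approach makes easy is keeping extra state on the thread object, for example a result computed in run. The sketch below is an illustration only: the class name `SumThread` and its `result` attribute are made up and are not part of the threading module.

```python
import threading

class SumThread(threading.Thread):  # hypothetical name, for illustration
    def __init__(self, numbers):
        super().__init__()
        self.numbers = numbers
        self.result = None  # filled in by run()

    def run(self):
        # Store the outcome on the thread object so the main thread can read it
        self.result = sum(self.numbers)

worker = SumThread([1, 2, 3, 4])
worker.start()
worker.join()  # wait for run() to finish before reading the result

print(worker.result)  # 10
```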
Example 4 - Foreground thread and background thread (daemon thread)
In all the previous examples we ignored the daemon parameter. It defaults to False, which means a thread is a foreground thread by default.
With foreground threads, the program exits only after all foreground threads have finished executing. Setting the daemon parameter to True makes the thread a background (daemon) thread. In that case, when the main thread ends and the program exits, all background threads that have not finished are ended automatically.
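Besides setting the attribute inside a subclass, the flag can also be passed to the Thread constructor as the `daemon` keyword argument. It has to be set before start() is called. A minimal sketch (the function `noop` is a placeholder):

```python
from threading import Thread

def noop():
    pass

# When created from the main thread, daemon defaults to False (foreground)
foreground = Thread(target=noop)
# Passing daemon=True marks the thread as a background (daemon) thread
background = Thread(target=noop, daemon=True)

print(foreground.daemon, background.daemon)
```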
Based on the previous example, add self.daemon = True to the constructor, remove the join calls at the end, and replace them with a sleep call that blocks the main program. Let's see what the result looks like:
"""
<case4: Foreground thread and background thread>
"""
import threading
import time
from time import sleep

# myThread inherits from the parent class threading.Thread and overrides it
class myThread(threading.Thread):
    # Override the parent class constructor
    def __init__(self, number, letter):
        threading.Thread.__init__(self)
        self.number = number  # Add number variable
        self.letter = letter  # Add letter variable
        self.daemon = True    # Background thread; the default is False (foreground thread)

    # Override the run function of the parent class
    def run(self):
        print(f"【Thread start】{self.name}")
        task1(self.name, self.number, self.letter)
        print(f"【Thread End】{self.name}")

    # Override the parent class destructor
    def __del__(self):
        print(f"【Thread destruction and release memory】{self.name}")

# Custom function; it can be replaced with any other task you want to run in multiple threads
def task1(threadName, number, letter):
    m = 0
    while m < number:
        sleep(1)
        m += 1
        current_time = time.strftime('%H:%M:%S', time.localtime())
        print(f"[{current_time}] {threadName} output {letter}")

# def task2...
# def task3...

thread1 = myThread(4, "a")  # Create thread1: the task takes 4s
thread2 = myThread(2, "b")  # Create thread2: the task takes 2s

thread1.start()  # Start thread 1
thread2.start()  # Start thread 2

sleep(3)  # The main program waits 3s and then continues
Its output will become:
【Thread start】Thread-1
【Thread start】Thread-2
[10:31:45] Thread-1 output a
[10:31:45] Thread-2 output b
[10:31:46] Thread-1 output a
[10:31:46] Thread-2 output b
【Thread End】Thread-2
Process finished with exit code 0
We used the sleep method to block the main program for 3s. Because the threads are background threads, once the 3s are up and the main program finishes, the two child threads thread1 and thread2 are forcibly ended whether or not they have completed. Here thread2 had already finished, while thread1 was cut off before printing all four letters.
Set the daemon parameter to False, and its output is the same as Example 3, as follows:
【Thread start】Thread-1
【Thread start】Thread-2
[10:30:14] Thread-1 output a
[10:30:14] Thread-2 output b
[10:30:15] Thread-1 output a
[10:30:15] Thread-2 output b
【Thread End】Thread-2
[10:30:16] Thread-1 output a
[10:30:17] Thread-1 output a
【Thread End】Thread-1
【Thread destruction and release memory】Thread-1
【Thread destruction and release memory】Thread-2
Example 5 - Thread synchronization (thread lock)
Imagine this situation: when multiple threads run at the same time, they share variables and data structures, so several threads may modify the same data at the same time. This can corrupt the data and must be avoided.
To synchronize the threads, we introduce the concept of a thread lock. When a thread accesses the data, it locks it first. Other threads that want to access the data are blocked until the first thread releases the lock. In the threading module, locking and unlocking are done with the Lock class, using its acquire() and release() methods:
Lock = threading.Lock()  # Create a lock object from the threading module
Lock.acquire()  # Acquire the lock
Lock.release()  # Release the lock
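A Lock object also works as a context manager, so the acquire and release pair can be written as a with statement. The sketch below (the function `risky` is made up for illustration) shows one benefit of that form: the lock is released even when the code holding it raises an exception.

```python
import threading

lock = threading.Lock()

def risky():
    with lock:  # acquire() on entry, release() on exit, even on error
        raise ValueError("simulated failure while holding the lock")

try:
    risky()
except ValueError:
    pass

print(lock.locked())  # False: the lock was released despite the exception
```

With explicit acquire() and release() calls, the same guarantee needs a try/finally block around the protected code.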
For the thread lock example we will not reuse the letter-printing task from the previous examples. To make the role of the lock easier to see, we use multiple threads to modify a shared list.
Suppose that multiple threads need to modify this list, as follows:
"""
<case5: Thread synchronization, thread lock>
"""
import threading
import time

# Subclass myThread inherits from the parent class threading.Thread and overrides it
class myThread(threading.Thread):
    # Override the parent class constructor
    def __init__(self, number):
        threading.Thread.__init__(self)
        self.number = number

    # Override the parent class run function; start() calls run automatically
    def run(self):
        print(f"【Thread start】{self.name}")
        Lock.acquire()  # Acquire the thread lock
        edit_list(self.name, self.number)
        Lock.release()  # Release the thread lock

    # Override the parent class destructor
    def __del__(self):
        print(f"【Thread Destruction】{self.name}")

# Custom task function
def edit_list(threadName, number):
    while number > 0:
        time.sleep(1)
        data_list[number-1] += 1
        current_time = time.strftime('%H:%M:%S', time.localtime())
        print(f"[{current_time}] {threadName} Modify datalist to {data_list}")
        number -= 1
    print(f"【{threadName} completes the work】")

data_list = [0, 0, 0, 0]
Lock = threading.Lock()

# Create 3 child threads
thread1 = myThread(1)
thread2 = myThread(2)
thread3 = myThread(3)

# Start 3 child threads
thread1.start()
thread2.start()
thread3.start()

# The main thread waits for all child threads to complete
thread1.join()
thread2.join()
thread3.join()

print("【Main process ends】")
The output is:
【Thread start】Thread-1
【Thread start】Thread-2
【Thread start】Thread-3
[09:55:22] Thread-1 Modify datalist to [1, 0, 0, 0]
【Thread-1 completes the work】
[09:55:23] Thread-2 Modify datalist to [1, 1, 0, 0]
[09:55:24] Thread-2 Modify datalist to [2, 1, 0, 0]
【Thread-2 completes the work】
[09:55:25] Thread-3 Modify datalist to [2, 1, 1, 0]
[09:55:26] Thread-3 Modify datalist to [2, 2, 1, 0]
[09:55:27] Thread-3 Modify datalist to [3, 2, 1, 0]
【Thread-3 completes the work】
【Main process ends】
【Thread Destruction】Thread-1
【Thread Destruction】Thread-2
【Thread Destruction】Thread-3
When all three threads need to use the same data, we only need to acquire and release the lock in the thread's run method. The three child threads then operate one after another: the next thread continues only after the previous one has released the lock. Note that all three child threads must use the same lock object.
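The same idea applies at a finer grain. In the sketch below, four threads increment one shared counter, and a single shared lock protects each read-modify-write step. The names `totals`, `counter_lock`, and `add_many` are made up for illustration.

```python
import threading

totals = {"count": 0}            # shared data modified by every thread
counter_lock = threading.Lock()  # one lock object shared by all threads

def add_many(times):
    for _ in range(times):
        with counter_lock:       # only one thread at a time updates the counter
            totals["count"] += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(totals["count"])  # 40000
```

If each thread created its own lock instead, the locks would not exclude one another and the protection would be lost.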
There are many optional parameters and methods available for use in the threading module. For details, please refer to the official documentation of the threading module.
Reference: threading, Thread-based parallelism (Python 3.12.3 documentation)