In Python, multi-process programming is an effective way to improve program performance. Compared with multithreading, it can take full advantage of multi-core CPUs to achieve true parallel computing. This article explains the basic concepts, usage, and caveats of Python multi-process programming through clear explanations and plenty of code examples.
1. Introduction to multi-process programming
1. What is multi-process
Multi-process programming means creating multiple processes within a program. Each process has its own independent memory space and system resources, and the processes coordinate their work through inter-process communication (IPC). This approach can exploit the computing power of multi-core CPUs and improve a program's throughput.
2. The difference between multi-process and multi-threading
Memory independence: Each process has its own memory space and system resources, while the threads of a multithreaded program share the memory space of a single process.
Execution model: Processes can run truly in parallel, each on its own CPU core. In CPython, by contrast, the Global Interpreter Lock (GIL) prevents threads from executing Python bytecode in parallel, so CPU-bound threads effectively take turns on the interpreter.
Resource overhead: Creating and destroying processes is more expensive, because the operating system must allocate and reclaim resources for each one; threads are much cheaper to create and destroy.
Isolation: Processes do not share state by default and cannot corrupt each other's memory, which makes them safer; threads share memory, which makes data races and deadlocks easier to introduce.
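The execution-model difference above can be seen directly with a small experiment. The sketch below is illustrative (the `cpu_task` helper and the input sizes are made up for this example): the same CPU-bound function is run through a thread pool and a process pool from `concurrent.futures`. Both produce identical results, but on a multi-core machine only the process pool actually runs the calls in parallel, because the GIL serializes the threads.

```python
import concurrent.futures
import math

def cpu_task(n):
    # CPU-bound work: sum of square roots from 0 to n-1
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    inputs = [200_000] * 4

    # Threads: limited by the GIL for CPU-bound Python code
    with concurrent.futures.ThreadPoolExecutor() as ex:
        thread_results = list(ex.map(cpu_task, inputs))

    # Processes: each worker has its own interpreter and can use its own core
    with concurrent.futures.ProcessPoolExecutor() as ex:
        process_results = list(ex.map(cpu_task, inputs))

    print(thread_results == process_results)  # prints True
```

Timing the two `with` blocks (for example with `time.perf_counter`) typically shows the process pool finishing several times faster on a multi-core machine, while the thread pool takes roughly as long as running the tasks sequentially.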
2. Multi-process programming in Python
Python's standard library provides the multiprocessing module for multi-process programming. Its interface closely mirrors that of the threading module, but it is built on processes rather than threads.
1. Create a process
In the multiprocessing module, you can use the Process class to create processes. Here is a simple example:
```python
import multiprocessing
import os
import time

def worker():
    print(f"Worker process id: {os.getpid()}")
    time.sleep(2)
    print("Worker process finished")

if __name__ == "__main__":
    print(f"Main process id: {os.getpid()}")
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()  # Wait for the process to end
    print("Main process finished")
```
In this example, we define a worker function, create a Process object in the main process, and pass worker as the target function. Calling start launches the process, and calling join waits for it to finish.
2. Inter-process communication
Inter-process communication (IPC) is a central concern in multi-process programming. Python offers several IPC mechanisms, including pipes, queues, and shared memory.
Use queues for inter-process communication:
```python
import multiprocessing
import time

def worker(q):
    time.sleep(2)
    q.put("Hello from worker")

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    result = q.get()  # Get the data sent by the worker process
    print(result)
    p.join()
```
In this example, we create a Queue object and pass it to the worker process. After the worker process completes the task, it puts the results into the queue. The main process gets the result from the queue and prints it out.
Use Pipe for inter-process communication:
```python
import multiprocessing
import time

def worker(conn):
    time.sleep(2)
    conn.send("Hello from worker")
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=worker, args=(child_conn,))
    p.start()
    result = parent_conn.recv()  # Receive data sent by the worker process
    print(result)
    p.join()
```
In this example, we use the Pipe function to create a pipe, which returns two connection objects: parent_conn and child_conn. We pass child_conn to the worker process, which sends data with conn.send. The main process receives the data by calling parent_conn.recv.
3. Process pool
When many tasks need to run, a process pool (Pool) is usually more efficient. A pool lets you limit the number of processes running at once and reuse worker processes across tasks.
Using process pool:
```python
import multiprocessing
import os
import time

def worker(x):
    print(f"Worker process id: {os.getpid()}, argument: {x}")
    time.sleep(2)
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:  # Create a process pool with 4 processes
        results = pool.map(worker, range(10))  # Assign tasks to processes in the pool
    print(results)
```
In this example, we create a process pool with 4 processes and use the map method to distribute tasks across them. The map method automatically hands tasks to idle processes and collects each task's return value, preserving the input order.
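Besides map, Pool also offers apply_async, which submits tasks one at a time and immediately returns AsyncResult handles you can collect later. The sketch below is illustrative (the `square` helper is a hypothetical stand-in for real work):

```python
import multiprocessing

def square(x):
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        # Each call returns an AsyncResult immediately without blocking
        async_results = [pool.apply_async(square, (i,)) for i in range(10)]
        # .get() blocks until that particular task has finished
        results = [r.get() for r in async_results]
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

This pattern is useful when tasks arrive one by one or take different arguments, whereas map is simpler when you have a ready-made iterable of inputs.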
4. Process synchronization
In multi-process programming, it is sometimes necessary to ensure that certain operations are performed in a specific order; this is what the process synchronization mechanisms are for. Python provides synchronization primitives such as Lock, Semaphore, Event, and Condition in the multiprocessing module.
Use locks (Lock):
```python
import multiprocessing
import time

def worker(lock, x):
    with lock:  # Acquire the lock
        print(f"Worker {x} is working")
        time.sleep(2)
        print(f"Worker {x} finished")

if __name__ == "__main__":
    lock = multiprocessing.Lock()
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(lock, i))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
```
In this example, we create a lock object and pass it to each worker process. Before performing critical operations, the worker process acquires the lock first to ensure that only one process can perform these operations at the same time.
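A Lock allows one holder at a time; a Semaphore generalizes this to N concurrent holders. The following sketch (not from the original article, with illustrative names) limits concurrency to two workers at once:

```python
import multiprocessing
import time

def worker(sem, x):
    with sem:  # At most 2 workers can be inside this block at once
        print(f"Worker {x} entered")
        time.sleep(1)
        print(f"Worker {x} left")

if __name__ == "__main__":
    sem = multiprocessing.Semaphore(2)  # Allow two concurrent holders
    processes = [multiprocessing.Process(target=worker, args=(sem, i))
                 for i in range(5)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```

This is handy for throttling access to a limited resource, such as a database that only tolerates a couple of simultaneous connections.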
5. Things to note
Avoid shared data: Try to avoid sharing data between processes, as it adds complexity and potential problems. If you really need shared state, you can use shared-memory objects such as multiprocessing.Value and multiprocessing.Array, or a multiprocessing.Manager.
Clean up resources: Make sure resources such as open files and network connections are properly released when a process ends.
Avoid deadlocks: When using synchronization primitives such as locks and semaphores, take care to avoid deadlocks. For example, make sure every acquired lock is eventually released, and always acquire multiple locks in a consistent order.
Performance overhead: Although multiple processes can speed up a program, they also add overhead for process creation and inter-process communication. Weigh the costs and benefits before deciding to use multiprocessing.
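As an illustration of the shared-memory route mentioned above (a sketch assuming a plain shared counter is all you need), multiprocessing.Value allocates a single typed value in shared memory; pairing it with a Lock keeps the read-modify-write increment atomic:

```python
import multiprocessing

def increment(counter, lock, n):
    for _ in range(n):
        with lock:  # The lock makes the read-modify-write atomic
            counter.value += 1

if __name__ == "__main__":
    counter = multiprocessing.Value('i', 0)  # 'i' = C signed int in shared memory
    lock = multiprocessing.Lock()
    procs = [multiprocessing.Process(target=increment, args=(counter, lock, 1000))
             for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # 4000
```

Without the lock, concurrent `counter.value += 1` operations could interleave and lose updates; Value also exposes its own internal lock via `counter.get_lock()` as an alternative.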
3. Practical application cases
Here is a simple example of using multiple processes for image processing. Suppose we have a folder with multiple images that need to do some kind of processing (e.g. scaling) on each image. We can use multiple processes to increase processing speed.
```python
import multiprocessing
import os
from PIL import Image

def process_image(file_path, output_dir):
    img = Image.open(file_path)
    img.thumbnail((128, 128))  # Scale the image down to fit 128x128
    img_name = os.path.basename(file_path)
    img.save(os.path.join(output_dir, img_name))

def main(input_dir, output_dir, num_processes):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    image_files = [os.path.join(input_dir, f) for f in os.listdir(input_dir)
                   if f.lower().endswith(('png', 'jpg', 'jpeg'))]
    with multiprocessing.Pool(processes=num_processes) as pool:
        pool.starmap(process_image,
                     [(img_file, output_dir) for img_file in image_files])

if __name__ == "__main__":
    input_dir = "path/to/input/images"
    output_dir = "path/to/output/images"
    num_processes = 4
    main(input_dir, output_dir, num_processes)
```
In this example, we define a process_image function to handle a single image file. Then in the main function, we create a process pool and use the starmap method to assign tasks to processes in the process pool. Each process calls the process_image function to process an image file.
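For large batches like this, Pool also provides imap_unordered, which yields results as they complete instead of waiting for the whole batch. Below is an illustrative sketch (the `slow_double` function is a hypothetical stand-in for per-file work such as resizing an image):

```python
import multiprocessing

def slow_double(x):
    # Stand-in for per-item work such as resizing one image
    return x * 2

if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        # chunksize batches tasks to reduce IPC overhead;
        # results arrive in completion order, not input order
        for result in pool.imap_unordered(slow_double, range(8), chunksize=2):
            print(result)
```

This lets the main process start consuming results (for example, updating a progress bar) while slower tasks are still running.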
4. Summary
This article has introduced multi-process programming in Python in detail, covering basic concepts, usage, and caveats. Through code examples and a practical application scenario, we demonstrated how to use multiple processes to improve program performance.