SoFunction
Updated on 2025-03-02

How to Create a Process Pool with Python's multiprocessing Module

1. Introduction

In modern computing, parallel processing is a key way to improve program performance. When processing large amounts of data or compute-intensive tasks, a single thread may not be efficient enough. Python provides several modules that support parallel computing; the most commonly used is the multiprocessing module. It allows us to run code on multiple processors at the same time and handle tasks through multiple processes, greatly improving efficiency.

This article describes how to use Python's multiprocessing module, focusing on the concept of the Process Pool. We will explain how to create a process pool and assign tasks to it, with code examples to help you grasp this important technique.

2. Introduction to the process pool

2.1 What is a process pool?

A process pool executes tasks concurrently through a pre-created set of processes. Creating and destroying processes is expensive for the operating system, so a process pool avoids the overhead of doing so frequently. We submit tasks to the pool, and they are assigned to the pre-started processes for execution.

Process pools are most commonly used for:

  • When a large number of tasks need to be executed in parallel.
  • Avoid frequent process creation and destruction.
  • When system resources are limited (for example, few CPU cores), controlling the pool size limits the number of concurrent processes.

2.2 Why use process pool?

In Python, because of the GIL (Global Interpreter Lock), concurrent threads cannot fully exploit multiple cores for CPU-intensive tasks. The multiprocessing module bypasses the GIL by using multiple processes, allowing programs to take full advantage of multi-core CPUs. A process pool also makes managing concurrent tasks much easier than creating and managing processes by hand.

Advantages of process pools include:

  • Automatically manage the creation and destruction of multiple processes.
  • Multiple tasks can be easily executed in parallel.
  • Control the number of concurrent processes through the pool size to avoid excessive resource utilization.

3. Basic knowledge of using multiprocessing module

Before you start using process pools, it is important to understand the basic concepts of Python's multiprocessing module.

3.1 Create and start a process

In the multiprocessing module, we can create and start a process with the Process class. A simple example:

import multiprocessing
import time

def worker(num):
    """Worker function that performs some task"""
    print(f"Worker {num} is starting")
    time.sleep(2)  # Simulate work
    print(f"Worker {num} is done")

if __name__ == '__main__':
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()  # Wait for all processes to complete

This example demonstrates how to create multiple processes and execute tasks in parallel. However, when there are many tasks, managing these processes manually becomes cumbersome; that is where the process pool comes in handy.

4. Create a process pool and assign tasks

4.1 Basic usage of Pool class

The Pool class in the multiprocessing module provides a convenient way to create a process pool and assign tasks to it. We can submit multiple tasks to the pool, and the pool's processes handle them in parallel.

The following basic example uses Pool to create a process pool and execute tasks:

import multiprocessing
import time

def worker(num):
    """Worker function that executes a task"""
    print(f"Worker {num} is starting")
    time.sleep(2)
    print(f"Worker {num} is done")
    return num * 2  # Return the computed result

if __name__ == '__main__':
    # Create a process pool with 4 processes
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(worker, range(10))
    print(f"Results: {results}")

4.2 The Pool.map() Method

In the code above, we use the pool.map() method. It works like Python's built-in map() function: it passes each element of an iterable to the target function and returns the results as a list. pool.map() automatically distributes the tasks across the processes in the pool for parallel execution.

For example:

  • range(10) generates 10 tasks, each of which calls the worker function once.
  • Since the pool has 4 processes, up to 4 tasks run in parallel at a time until all tasks are completed.

4.3 Other common methods

Besides map(), the Pool class has several other commonly used methods:

apply(): Execute a function synchronously; the main process blocks until the function finishes.

result = pool.apply(worker, args=(5,))

apply_async(): Execute a function asynchronously. The main process does not wait for the function to finish and can continue running other code. Suitable for processing a single task in parallel.

result = pool.apply_async(worker, args=(5,))
result.get()  # Get the return value

starmap(): Similar to map(), but allows passing multiple arguments to the target function.

def worker(a, b):
    return a + b

results = pool.starmap(worker, [(1, 2), (3, 4), (5, 6)])  # [3, 7, 11]

4.4 Setting of process pool size

When creating a process pool, the processes parameter sets its size. The pool size is usually related to the number of CPU cores in the system. You can get the current system's core count with multiprocessing.cpu_count() and then size the pool as needed.

import multiprocessing

# Get the number of CPU cores in the system
cpu_count = multiprocessing.cpu_count()
# Create a process pool with as many processes as CPU cores
pool = multiprocessing.Pool(processes=cpu_count)

Setting the process pool size to the same number of CPU cores is a common choice because this makes the most of the system resources.
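The sizing guideline above can be sketched as a small helper. This is an illustrative heuristic under stated assumptions, not a rule from the article: the pool_size name, the reserve parameter, and the oversubscription factor for IO-bound work are all assumptions.

```python
import multiprocessing

def pool_size(io_bound=False, reserve=1):
    """Illustrative sizing heuristic (an assumption, not a fixed rule):
    CPU-bound pools roughly match the core count, while IO-bound pools
    can be larger because workers spend most of their time waiting.
    `reserve` keeps a core free for the main process."""
    cores = multiprocessing.cpu_count()
    if io_bound:
        return cores * 2  # oversubscribe: workers mostly wait on IO
    return max(1, cores - reserve)  # never return less than 1

print(pool_size())               # suggestion for CPU-bound work
print(pool_size(io_bound=True))  # suggestion for IO-bound work
```

Whether oversubscription helps depends on the workload; measuring with your actual tasks is the only reliable guide.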

5. Advanced usage of process pool

5.1 Asynchronous task processing

In real scenarios, some tasks may take a long time. If we do not want to wait for them to finish before running other code, we can use an asynchronous method such as apply_async(). It lets tasks run in the background while the main process continues; once a task completes, we retrieve its result with get().

import multiprocessing
import time

def worker(num):
    time.sleep(2)
    return num * 2

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        results = [pool.apply_async(worker, args=(i,)) for i in range(10)]
        # Perform other actions
        print("The main process continues to run")
        # Get the asynchronous task results
        results = [r.get() for r in results]
        print(f"Results: {results}")

In this example, apply_async() executes the tasks asynchronously, and the main process can perform other operations before waiting on them. Finally, the get() method retrieves each task's result.

5.2 Exception handling

In concurrent programming, handling exceptions is very important. If an exception occurs in a worker process, we need to make sure it can be caught and handled. apply_async() provides an error_callback parameter that can be used to catch exceptions from asynchronous tasks.

import multiprocessing

def worker(num):
    if num == 3:
        raise ValueError("Simulated error")
    return num * 2

def handle_error(e):
    print(f"Caught exception: {e}")

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        results = [pool.apply_async(worker, args=(i,), error_callback=handle_error) for i in range(10)]
        for result in results:
            try:
                print(result.get())
            except Exception as e:
                print(f"Main process caught exception: {e}")

In this example, if a task raises an exception, the error_callback function captures and handles it; calling get() on that task then re-raises the exception in the main process, where the try/except catches it again.

6. Practical application scenarios

6.1 CPU-intensive tasks

Multi-process parallelism is very well suited to CPU-intensive tasks, such as image processing and large-scale numerical computation. In these tasks the computational load is heavy, and multiple processes can use several CPU cores of the system at once, significantly shortening processing time.

import multiprocessing

def cpu_intensive_task(n):
    total = 0
    for i in range(10**6):
        total += i * n
    return total

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(cpu_intensive_task, range(10))
        print(results)

6.2 IO intensive tasks

For IO-intensive tasks, such as network requests and file reads/writes, processes spend most of their time waiting for external resources, so the speedup from multiple processes may be less dramatic than for CPU-intensive tasks. Multi-processing can still improve concurrency and reduce overall waiting time, however.

7. Summary

In this article we learned how to use Python's multiprocessing module to create a process pool and distribute tasks across multiple processes. A process pool helps us manage concurrent tasks effectively and improves program execution efficiency, especially for CPU-intensive work.

In practice, we need to pay attention to the following points when using process pools:

  • Resource Management: Ensure the process pool is used reasonably and avoid creating too many processes that lead to insufficient system resources.
  • Task assignment: Choose the appropriate parallel processing method according to the different types of tasks (such as CPU-intensive and IO-intensive).
  • Exception handling: Catch and handle exceptions in a multi-process environment to avoid crashing the entire program due to errors in a single process.

By mastering these techniques, you can take advantage of the benefits of parallel processing in Python programming to build more efficient applications.

This concludes this article on creating process pools with Python's multiprocessing module. For more on creating process pools in Python, please search my previous articles or browse the related articles below. I hope you will continue to support this site!