SoFunction
Updated on 2024-10-30

Example analysis of multi-process communication in Python

The operating system allocates a separate address space to each process it creates, and the address spaces of different processes are completely isolated from one another, so without additional mechanisms processes are unaware of each other's existence. So how do processes communicate? How do they relate to each other, and on what principle is that communication implemented? This article uses Python to walk through inter-process communication. The principles are the same in any language, and I hope the concrete examples convey the essence of each mechanism.

The following describes each type of communication as simply as possible; refer to the official documentation for the details of each API.

1. Pipes

Let's start with the simplest and oldest form of IPC: pipes. The pipes meant here are usually anonymous (unnamed) pipes. An anonymous pipe can essentially be thought of as a file that exists only in memory and is never stored on disk. Different processes read from or write to the pipe through interfaces provided by the system.

In other words, the pipe is an intermediate medium through which processes communicate. The limitation of anonymous pipes is that they are generally usable only between directly related processes, such as a parent and its child. Here is a simple example.

from multiprocessing import Process, Pipe

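def pstart(pname, conn):
  # Child process: send a message to the parent, then wait for its reply
  conn.send("Data@subprocess")
  print(conn.recv())     # Data@parentprocess

if __name__ == '__main__':
  conn1, conn2 = Pipe(True)  # True: full-duplex, both ends can send and receive
  sub_proc = Process(target=pstart, args=('subprocess', conn2,))
  sub_proc.start()
  print(conn1.recv())    # Data@subprocess
  conn1.send("Data@parentprocess")
  sub_proc.join()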

Pipe communication takes three steps:

  1. Create a Pipe and get two connection objects, conn1 and conn2.
  2. The parent process keeps conn1 and passes conn2 to the child process.
  3. The parent and child exchange data by calling send and recv on the connection objects they hold.

The example above creates a full-duplex pipe. You can also create a half-duplex pipe; the official documentation describes Pipe as follows:

Returns a pair (conn1, conn2) of Connection objects representing the ends of a pipe.

If duplex is True (the default) then the pipe is bidirectional. If duplex is False then the pipe is unidirectional: conn1 can only be used for receiving messages and conn2 can only be used for sending messages.
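The one-way case can be sketched as follows. This is a minimal illustration rather than the article's code: the names worker, recv_end and send_end are made up here, and the sketch targets Python 3.

```python
from multiprocessing import Pipe, Process


def worker(send_end):
    """Child side: send one message through the write-only end, then close it."""
    send_end.send("Data@subprocess")
    send_end.close()


if __name__ == "__main__":
    # duplex=False: the first connection can only receive, the second can only send
    recv_end, send_end = Pipe(duplex=False)
    child = Process(target=worker, args=(send_end,))
    child.start()
    send_end.close()          # the parent only reads, so it drops its copy
    print(recv_end.recv())    # Data@subprocess
    child.join()
    try:
        recv_end.send("wrong direction")
    except OSError as exc:    # the receive-only end refuses to send
        print("send failed:", exc)
```

Calling send on the receive-only end raises OSError, which makes a one-way pipe a simple way to enforce the direction of data flow.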

2. Named pipes (FIFO)

The pipes described above work mainly between directly related processes, which is quite limiting. Named pipes, by contrast, allow arbitrary processes to communicate.

Since the os module has no mkfifo attribute on Windows, this example runs only on Linux (test environment: CentOS 7, Python 2.7.5):

#!/usr/bin/python
import os, time
from multiprocessing import Process

input_pipe = "./"
output_pipe = "./"

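def consumer():
  # Remove FIFOs left over from a previous run
  if os.path.exists(input_pipe):
    os.remove(input_pipe)
  if os.path.exists(output_pipe):
    os.remove(output_pipe)

  os.mkfifo(output_pipe)
  os.mkfifo(input_pipe)
  in1 = os.open(input_pipe, os.O_RDONLY)    # read from input_pipe
  out1 = os.open(output_pipe, os.O_SYNC | os.O_CREAT | os.O_RDWR)
  while True:
    # os.read returns bytes; decode so the code also runs on Python 3
    read_data = os.read(in1, 1024).decode()
    print("received data from : %s @consumer" % read_data)
    if len(read_data) == 0:
      time.sleep(1)
      continue

    if "exit" in read_data:
      break
    os.write(out1, read_data.encode())    # echo the message back
  os.close(in1)
  os.close(out1)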

def producer():
  in2 = None
  out2 = os.open(input_pipe, os.O_SYNC | os.O_CREAT | os.O_RDWR)

  for i in range(1, 4):
    msg = "msg " + str(i)
    len_send = os.write(out2, msg.encode())    # os.write needs bytes on Python 3
    print("------product msg: %s by producer------" % msg)
    if in2 is None:
      in2 = os.open(output_pipe, os.O_RDONLY)    # read from output_pipe
    data = os.read(in2, 1024)
    if len(data) == 0:
      break
    print("received data from : %s @producer" % data.decode())
    time.sleep(1)

  os.write(out2, b'exit')    # tell the consumer to stop
  os.close(in2)
  os.close(out2)

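if __name__ == '__main__':
  pconsumer = Process(target=consumer, args=())
  pproducer = Process(target=producer, args=())
  pconsumer.start()
  time.sleep(0.5)    # give the consumer time to create both FIFOs
  pproducer.start()
  pconsumer.join()
  pproducer.join()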

The run flow is as follows. Each round consists of four steps:

  1. The producer process writes message data to input_pipe.
  2. The consumer process reads the message data from input_pipe.
  3. The consumer process writes the return message to output_pipe.
  4. The producer process reads the return message from output_pipe.

The results are as follows:

[shijun@localhost python]$ python 
------product msg: msg 1 by producer------
received data from : msg 1 @consumer
received data from : msg 1 @producer
------product msg: msg 2 by producer------
received data from : msg 2 @consumer
received data from : msg 2 @producer
------product msg: msg 3 by producer------
received data from : msg 3 @consumer
received data from : msg 3 @producer
received data from : exit @consumer

The two processes are not directly related. Each one has a file it reads from and a file it writes to, and they can communicate as long as one process's write file is the other's read file.
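The same idea can be reduced to a single one-way FIFO. The following is a minimal Python 3 sketch, not the article's program: the function names write_message and read_message and the file name demo.fifo are made up for illustration, and like the example above it assumes a Unix-like system where os.mkfifo exists.

```python
import os
import tempfile
from multiprocessing import Process


def write_message(fifo_path, text):
    """Writer side: open the FIFO by path and send one message."""
    # Opening for writing blocks until some process opens the FIFO for reading.
    with open(fifo_path, "w") as fifo:
        fifo.write(text)


def read_message(fifo_path):
    """Reader side: open the same path and read until the writer closes it."""
    with open(fifo_path) as fifo:
        return fifo.read()


if __name__ == "__main__":
    # Hypothetical location: a FIFO named demo.fifo in a fresh temporary directory
    fifo_dir = tempfile.mkdtemp()
    fifo_path = os.path.join(fifo_dir, "demo.fifo")
    os.mkfifo(fifo_path)    # Unix only
    writer = Process(target=write_message, args=(fifo_path, "hello over FIFO"))
    writer.start()
    print(read_message(fifo_path))
    writer.join()
    os.remove(fifo_path)
    os.rmdir(fifo_dir)
```

The only thing the two sides share is the path, which is why a FIFO also works between processes that were started independently.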

3. Message queues (Queues)

Message data is passed between processes by adding data to or getting data from a queue. Here is a simple example.

from multiprocessing import Process, Queue
import time

def producer(que):
  for product in ('Orange', 'Apple', ''):
    print('put product: %s to queue' % product)
    que.put(product)
    time.sleep(0.5)
    res = que.get()    # wait for the consumer's reply
    print('consumer result: %s' % res)

def consumer(que):
  while True:
    product = que.get()
    print('get product:%s from queue' % product)
    que.put('suc!')
    time.sleep(0.5)
    if not product:    # the empty string marks the end
      break

if __name__ == '__main__':
  que = Queue(1)    # queue with a capacity of one item
  p = Process(target=producer, args=(que,))
  c = Process(target=consumer, args=(que,))
  p.start()
  c.start()
  p.join()
  c.join()

This example is relatively simple; refer to the official documentation for the details of Queue.

Results:

put product: Orange to queue
consumer result: suc!
put product: Apple to queue
consumer result: suc!
put product: to queue
consumer result: suc!
get product:Orange from queue
get product:Apple from queue
get product: from queue

Here are a few things to keep in mind:

  1. You can specify the capacity of the queue. When the queue is full, a put that cannot wait for free space (non-blocking, or timed out) raises the Full exception.
  2. By default, both put and get block the current process.
  3. If put is not blocking, a process that both puts to and gets from the same queue may take back the data it just put in.
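The first two points can be checked with a short sketch. It assumes Python 3, where the Full and Empty exceptions live in the standard queue module; the helper names try_put and try_get are made up for illustration.

```python
import queue  # provides the Full and Empty exception classes
from multiprocessing import Queue


def try_put(que, item):
    """Non-blocking put: report failure instead of waiting for free space."""
    try:
        que.put(item, block=False)
        return True
    except queue.Full:
        return False


def try_get(que, timeout=0.1):
    """Blocking get with a timeout: return None if nothing arrives in time."""
    try:
        return que.get(timeout=timeout)
    except queue.Empty:
        return None


if __name__ == "__main__":
    que = Queue(1)                 # capacity of one item
    print(try_put(que, "Orange"))  # True: the queue had room
    print(try_put(que, "Apple"))   # False: the queue is full
    print(try_get(que, 1))         # Orange
    print(try_get(que))            # None: the queue is empty again
```

With the default blocking behavior, the second put would instead wait until another process removes an item.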

4. Shared memory

Shared memory is a common and efficient way for processes to communicate with each other. In order to ensure orderly access to shared memory, additional synchronization measures need to be taken for processes.

The following example simply demonstrates how shared memory can be used to communicate between different processes in Python.

from multiprocessing import Process
import mmap
import contextlib
import time

def writer():
  # tagname names the mapping so another process can open it (Windows only)
  with contextlib.closing(mmap.mmap(-1, 1024, tagname='cnblogs', access=mmap.ACCESS_WRITE)) as mem:
    for share_data in ("Hello", "Alpha_Panda"):
      mem.seek(0)
      print('Write data:== %s == to share memory!' % share_data)
      mem.write(str.encode(share_data))
      mem.flush()
      time.sleep(0.5)

def reader():
  while True:
    invalid_byte, empty_byte = str.encode('\x00'), str.encode('')
    with contextlib.closing(mmap.mmap(-1, 1024, tagname='cnblogs', access=mmap.ACCESS_READ)) as mem:
      # Strip the zero bytes that pad the unused part of the mapping
      share_data = mem.read(1024).replace(invalid_byte, empty_byte)
      if not share_data:
        # End reader when shared memory has no valid data
        break
      print("Get data:== %s == from share memory!" % share_data.decode())
    time.sleep(0.5)


if __name__ == '__main__':
  p_reader = Process(target=reader, args=())
  p_writer = Process(target=writer, args=())
  p_writer.start()
  p_reader.start()
  p_writer.join()
  p_reader.join()

Results:

Write data:== Hello == to share memory!
Write data:== Alpha_Panda == to share memory!
Get data:== Hello == from share memory!
Get data:== Alpha_Panda == from share memory!

Here is a brief explanation of how shared memory works.

The mapping of a process's virtual addresses to physical addresses is as follows:

The diagram above already shows the principle of shared memory quite clearly.

On the left is the normal case: the linear address spaces of different processes are mapped to different physical pages, so no matter how one process modifies its memory, other processes are not affected.

On the right is the shared-memory case: some linear addresses of different processes are mapped to the same physical page, so changes one process makes to that page are immediately visible to the other.

The catch, of course, is that the processes must be synchronized: accesses to shared memory must be mutually exclusive. This can be achieved with semaphores.
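Mutually exclusive access can be sketched with a shared counter guarded by a binary semaphore. This is an illustrative Python 3 sketch, not the article's code: it uses multiprocessing.Value as the shared memory instead of mmap, and the function name add_many is made up.

```python
from multiprocessing import Process, Semaphore, Value


def add_many(counter, sem, times):
    """Increment a shared counter, one process at a time in the critical section."""
    for _ in range(times):
        with sem:                 # acquire on entry, release on exit
            counter.value += 1    # read-modify-write on shared memory


if __name__ == "__main__":
    counter = Value("i", 0, lock=False)  # raw shared int with no built-in lock
    sem = Semaphore(1)                   # binary semaphore used as a mutex
    workers = [Process(target=add_many, args=(counter, sem, 10000)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counter.value)  # 40000
```

Without the semaphore, the increments of different processes can interleave and some updates may be lost.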

5. Socket communications

Finally, there is one more form of inter-process communication, and it also works across hosts: sockets.

Anyone familiar with network programming will know sockets well. Besides communicating across hosts, sockets can also be used between different processes on the same host.

The code for this is relatively simple and common, so a flow chart is used here to show the flow of socket communication and the related interfaces.

The figure above shows the flow of socket communication between a client process using a socket and a listening program on the server.
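The flow can also be sketched in code. The following minimal Python 3 example runs the server and the client as two processes on the same host; the function names serve_once, request and client_main are made up for illustration, and the echo reply is just a placeholder protocol.

```python
import socket
from multiprocessing import Process


def serve_once(listener):
    """Server side: accept one connection and echo back what it receives."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)
    return data


def request(port, payload):
    """Client side: connect, send one message, return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(payload)
        return sock.recv(1024)


def client_main(port):
    print(request(port, b"hello").decode())


if __name__ == "__main__":
    # Server steps: socket() -> bind() -> listen() -> accept() -> recv()/send() -> close()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        client = Process(target=client_main, args=(port,))
        client.start()
        serve_once(listener)
        client.join()
```

Replacing 127.0.0.1 with the server's address is all it takes for the same two functions to talk across hosts.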

Summary

This concludes the brief introduction to the common inter-process communication mechanisms and their examples. Hopefully it gives you a deeper understanding of inter-process communication.

Together with the earlier articles on threads, processes, and thread synchronization, you should now have a clear picture of the concepts around threads and processes.
