
Python Concurrent Programming: Detailed Explanation of Multiprocessing, Multithreading, Asynchronous, and Coroutines

Recently I have been studying Python concurrency, so here is a summary of multiprocessing, multithreading, asynchronous I/O, and coroutines.
I. Multithreading

Multithreading allows multiple flows of control to exist within a single process, so that several functions can be active at the same time and their operations can run concurrently. Even on a single-CPU machine, the effect of concurrency can be achieved by continuously switching between the instructions of different threads.

Multithreading amounts to a concurrent system. Concurrent systems generally execute multiple tasks at the same time. When tasks share resources, and especially when they write to the same variable, synchronization must be addressed. Consider a multithreaded train ticketing system: one instruction checks whether tickets remain, and another performs the sale at one of several windows; interleaved across windows, this can end up selling tickets that do not exist.

Under concurrency, the order of instruction execution is decided by the kernel. Within one thread, instructions execute in program order, but between different threads it is hard to say which will run first. Therefore multithreaded synchronization must be considered: synchronization means allowing only one thread to access a resource at a given time.
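As a minimal illustration of why synchronization matters (a sketch; the names are illustrative, not from the original article), two threads below increment a shared counter, and the lock makes each read-modify-write atomic:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:          # only one thread may run this block at a time
            counter += 1    # the read-modify-write is now atomic

threads = [threading.Thread(target=increment, args=(100000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000 with the lock; without it, often less
```

Without the lock, the two threads can interleave between reading `counter` and writing it back, losing updates.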

1) The thread module

2) The threading module
threading.Thread creates a thread.
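For example, a thread can be created by passing a target callable, started with start(), and waited on with join() (a minimal sketch; the worker function is illustrative):

```python
import threading

def worker(name, results):
    # the thread appends its greeting to a shared list
    results.append("hello from %s" % name)

results = []
t = threading.Thread(target=worker, args=("thread-1", results))
t.start()
t.join()  # block until the thread finishes
print(results)
```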

Checking for remaining tickets and selling a ticket must happen as one atomic step, so a mutex lock is added; otherwise one thread could find that tickets remain while another thread sells the last one in between.

#!/usr/bin/python
# -*- coding: utf-8 -*-
# __author__ = "tyomcat"

import threading
import time
import os

def booth(tid):
    global i
    while True:
        lock.acquire()        # enter the critical section
        if i != 0:
            i = i - 1         # sell one ticket
            print("Window:", tid, ", Remaining tickets:", i)
            time.sleep(1)
        else:
            print("Thread_id", tid, "No more tickets")
            os._exit(0)       # tickets sold out: terminate the whole process
        lock.release()        # leave the critical section
        time.sleep(1)

i = 100
lock = threading.Lock()
for k in range(10):
    new_thread = threading.Thread(target=booth, args=(k,))
    new_thread.start()

II. Coroutines (also known as microthreads or fibers)

Unlike threads, which are preemptively scheduled, coroutines use cooperative scheduling. Coroutines run in a single thread, yet they make it possible to perform asynchronous operations that would otherwise require threads: code that would have to be written in an unreadable callback style can instead be written in a seemingly synchronous manner.

1) Coroutines in Python can be implemented with generators.

Firstly, a solid understanding of generators and 'yield' is required.

Calling an ordinary Python function starts execution at the first line of the function body and ends at a return statement, an exception, or the end of the function body (which can be regarded as implicitly returning None).

Once a function returns control to its caller, it is finished for good. Sometimes, though, we want a function that produces a sequence to 'save its work' so it can resume later; such a function is a generator (a function that uses the yield keyword).

It can 'generate a sequence' because the function does not return in the usual sense. The implicit meaning of return is that the function hands control back to its caller for good, while yield means the transfer of control is temporary and voluntary: the function expects to regain control in the future.
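This temporary, voluntary transfer of control can be seen in a tiny sketch: each call to next() resumes the generator exactly where the last yield paused it, with all local state intact:

```python
def count_up_to(n):
    i = 0
    while i < n:
        yield i      # hand control back to the caller, keeping local state
        i += 1       # execution resumes here on the next next() call

gen = count_up_to(3)
print(next(gen))  # 0
print(next(gen))  # 1
print(next(gen))  # 2
```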

Let's look at a producer/consumer example:

#!/usr/bin/python
# -*- coding: utf-8 -*-
# __author__ = "tyomcat"

import time
import sys

# Producer
def produce(l):
    i = 0
    while 1:
        if i < 10:
            l.append(i)
            yield i           # hand control back to the consumer
            i = i + 1
            time.sleep(1)
        else:
            return            # raises StopIteration in the consumer

# Consumer
def consume(l):
    p = produce(l)
    while 1:
        try:
            next(p)           # resume the producer until its next yield
            while len(l) > 0:
                print(l.pop())
        except StopIteration:
            sys.exit(0)

if __name__ == "__main__":
    l = []
    consume(l)

Calling produce(l) returns a generator without running any of its body. Each next(p) in consume runs produce up to (or resumes it after) yield i: an element is appended to l, and we then print l.pop(). This repeats until produce returns and next(p) raises a StopIteration exception.

2) Stackless Python

3) The greenlet module

Implementations based on greenlet are second only to Stackless Python in performance, roughly half the speed of Stackless Python and nearly an order of magnitude faster than other solutions. In fact, greenlet is not a true concurrency mechanism but a way of switching between the execution contexts of different functions within a single thread, implementing 'you run for a while, I run for a while'; the programmer must specify explicitly when to switch and where to switch to.
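greenlet itself is a third-party package, but the explicit 'you run for a while, I run for a while' switching can be sketched with plain generators from the standard library (an analogy, not greenlet's actual API): each yield marks exactly when a task gives up control, and the driver decides who runs next.

```python
def task(name, log):
    for step in range(2):
        log.append("%s step %d" % (name, step))
        yield  # explicit switch point: voluntarily give up control here

log = []
a, b = task("A", log), task("B", log)
# a tiny round-robin driver: run A for a while, then B, then A again...
for gen in (a, b, a, b):
    next(gen)
print(log)  # ['A step 0', 'B step 0', 'A step 1', 'B step 1']
```

With real greenlets the switch target is named explicitly instead of being chosen by a driver loop, but the cooperative hand-off is the same idea.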

4) The eventlet module

III. Multiprocessing
1) Subprocesses (the subprocess package)

In Python, external programs can be run by forking a subprocess using the subprocess package.

When invoking system commands, the os module is usually the first one considered, using os.system() and os.popen(). However, these calls are too simple for complex tasks such as providing input to a running command, reading its output, checking its running status, or managing several commands executing in parallel. For these, the Popen class in subprocess does what we need.

>>> import subprocess
>>> import shlex
>>> command_line = input()
ping -c 10 www.baidu.com
>>> args = shlex.split(command_line)   # ['ping', '-c', '10', 'www.baidu.com']
>>> p = subprocess.Popen(args)

Use subprocess.PIPE to connect the output of one subprocess to the input of another, forming a pipe:

import subprocess

# equivalent of the shell pipeline: ls -l | wc
child1 = subprocess.Popen(["ls", "-l"], stdout=subprocess.PIPE)
child2 = subprocess.Popen(["wc"], stdin=child1.stdout, stdout=subprocess.PIPE)
out = child2.communicate()
print(out)

The communicate() method writes any given input to the child's stdin and reads data from its stdout and stderr until the process terminates.
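A portable sketch of communicate() (it runs the Python interpreter itself via sys.executable, so no external command is assumed): the bytes are written to the child's stdin, and stdout is collected after the child exits.

```python
import subprocess
import sys

# child process: read everything from stdin and echo it upper-cased
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
out, err = child.communicate(b"hello pipe")  # send input, then read output
print(out)  # the upper-cased input, as bytes
```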

2) Multiprocessing (the multiprocessing package)

(1) The multiprocessing package is Python's package for managing multiple processes. Similar to threading.Thread, a multiprocessing.Process object can be used to create a process.

A process pool (Pool) can create and manage multiple worker processes.

apply_async(func, args): takes a process from the pool to execute func, with args as the arguments of func. It returns an AsyncResult object, whose get() method blocks until the result is available and returns it.

close(): the pool will create no further processes.

join(): wait for all processes in the pool to finish. Pool's close() must be called before join().

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# __author__ = "tyomcat"
# My computer has 4 CPUs

from multiprocessing import Pool
import os, time

def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    p = Pool()
    for i in range(4):
        p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')

(2) Sharing resources between processes

This works through shared memory or a Manager object: one process acts as a server, and a Manager is established to actually hold the shared resources.

Other processes connect to the Manager, through parameters or by its address, and then operate on the resources held by the server.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# __author__ = "tyomcat"

from multiprocessing import Pool
import multiprocessing, time, random

def write(q):
    for value in ['A', 'B', 'C', 'D']:
        print("Put %s to Queue!" % value)
        q.put(value)
        time.sleep(random.random())

def read(q, lock):
    while True:
        lock.acquire()
        if not q.empty():
            value = q.get(True)
            print("Get %s from Queue" % value)
            time.sleep(random.random())
        else:
            lock.release()    # release before exiting the loop
            break
        lock.release()

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    q = manager.Queue()       # a queue shared between processes via the Manager
    p = Pool()
    lock = manager.Lock()
    pw = p.apply_async(write, args=(q,))
    pr = p.apply_async(read, args=(q, lock))
    p.close()
    p.join()
    print()
    print("All data have been written and read")

IV. Asynchronous

Whether threads or processes are used, the same kind of synchronization mechanism applies. When blocking occurs, performance drops sharply, the CPU's potential cannot be fully exploited, and the hardware investment is wasted; worse, it welds software modules into a tightly coupled block that cannot be cut apart, which hinders future extension and change.

Whether for a process or a thread, every block and switch requires a system call: the CPU first runs the operating system's scheduler, and the scheduler then decides which process (or thread) to run next. In addition, locks must be taken whenever multiple threads access mutually exclusive code.

Currently popular asynchronous servers are all event-driven (nginx, for example).

In an asynchronous, event-driven model, operations that would cause blocking are converted into asynchronous operations. The main thread is responsible for initiating these asynchronous operations and handling their results. Since all blocking operations become asynchronous, the main thread can in theory spend most of its time on actual computation, and the scheduling overhead of multithreading is reduced, so this model usually performs better.
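In modern Python this event-driven model is exposed by the standard-library asyncio module (a later addition than the generator tricks above): blocking waits become awaits, and a single thread interleaves many tasks through the event loop. A minimal sketch, with illustrative task names:

```python
import asyncio

async def fetch(name, delay, results):
    # 'await' hands control back to the event loop instead of blocking the thread
    await asyncio.sleep(delay)
    results.append(name)

async def main():
    results = []
    # both tasks run concurrently in one thread; total time is ~0.2s, not 0.3s
    await asyncio.gather(
        fetch("slow", 0.2, results),
        fetch("fast", 0.1, results),
    )
    return results

results = asyncio.run(main())
print(results)  # ['fast', 'slow'] -- completion order, not start order
```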

That's all for this article. I hope it is helpful for your learning.

