Python coroutines

Keywords: Python

In the operating system, a process is the smallest unit of resource allocation, and a thread is the smallest unit of CPU scheduling.

Coroutine: concurrency within a single thread, also known as a micro-thread or fiber (English name: coroutine). In short, a coroutine is a user-mode lightweight thread that is controlled and scheduled by the user program itself. In other words, the programmer switches between tasks in code.

Reference: http://www.cnblogs.com/Eva-J/articles/8324673.html

# The operating system is responsible for switching between processes
# A program can start multiple threads; the thread is the smallest unit the CPU actually executes
    # Starting a thread creates a register set and a stack for it
    # Closing a thread

# Coroutines
    # Essentially run inside a single thread
    # Can switch between multiple tasks to save some IO waiting time
    # Switching between coroutine tasks also costs time, but far less than switching between processes or threads
# A means of achieving concurrency

import time
def consumer():
    while True:
        x = yield
        time.sleep(1)
        print('Processing data :',x)

def producer():
    c = consumer()
    next(c)
    for i in range(10):
        time.sleep(1)
        print('production data:',i)
        c.send(i)

# This producer/consumer model simulates switching back and forth between tasks, but it still cannot avoid the IO (sleep) time
producer()
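
As a quick sanity check on the comment above, you can time the call: each of the ten iterations sleeps one second in producer() and another second in consumer(), so the whole run should take roughly twenty seconds. A minimal sketch, timing the bare producer() call (time is already imported above):

start = time.time()
producer()                                        # the producer/consumer defined above
print('elapsed: %.1f s' % (time.time() - start))  # roughly 20 s: the yield switch hides no IO time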

 

Install the modules with pip3 install greenlet and pip3 install gevent. Continue:

# Real coroutine modules use greenlet to do the switching
from greenlet import greenlet

def eat():
    print('eating start')
    g2.switch()   # Switch to g2
    print('eating end')
    g2.switch()

def play():
    print('playing start')
    g1.switch()  # Switch to g1
    print('playing end')

g1 = greenlet(eat)  # Wrap eat in greenlet g1
g2 = greenlet(play)
g1.switch()
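
For reference, the switch sequence above should print eating start, playing start, eating end and playing end, in that order: each switch() suspends the current greenlet and resumes the other one exactly where it last stopped.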
  • greenlet can implement coroutines, but having to switch to the next coroutine manually every time is tedious. Python also has a module more powerful than greenlet, gevent, which can switch tasks automatically

Reference: https://www.cnblogs.com/PrettyTom/p/6628569.html

# A coroutine is a user-mode lightweight thread, i.e. coroutines are controlled and scheduled by the user program itself.
import time
import gevent

def eat():
    print('eating start')
    # time.sleep(1)      # gevent cannot detect a plain time.sleep() as blocking IO (without monkey patching)
    gevent.sleep(1)
    print('eating end')

def play():
    print('playing start')
    gevent.sleep(1)
    print('playing end')

g1 = gevent.spawn(eat)
g2 = gevent.spawn(play)
g1.join()
g2.join()

 

The right way for gevent:

## Importing this line patches all blocking IO in the standard library, so gevent can detect calls such as time.sleep
from gevent import monkey;monkey.patch_all()
import time
import gevent
import threading

def eat():
    print(threading.current_thread().getName())  # the name contains "Dummy": gevent greenlets show up as dummy threads
    print(threading.current_thread())
    print('eating start')
    time.sleep(1.2)
    print('eating end')

def play():
    print(threading.current_thread().getName())
    print(threading.current_thread())
    print('playing start')
    time.sleep(1)
    print('playing end')

g1 = gevent.spawn(eat)   # Register the task as a coroutine; it switches automatically when it hits IO
g2 = gevent.spawn(play)
# g1.join()
# g2.join()
gevent.joinall([g1,g2])
print('master')
# Task switching between processes and between threads is done by the operating system
# Switching between coroutine tasks is controlled by the program (code); the program only switches on IO operations the coroutine module can recognize, and this is what produces the concurrency effect
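
If the monkey.patch_all() line is removed, time.sleep() blocks the whole OS thread, so the two greenlets run one after the other (roughly 2.2 seconds in total) instead of overlapping (roughly 1.2 seconds, the longer of the two sleeps); gevent only switches on calls it recognizes as blocking IO.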

 

Synchronous and asynchronous:

# Synchronous and asynchronous
from gevent import monkey;monkey.patch_all()
import time
import gevent

def task(n):
    time.sleep(1)
    print(n)

def sync():
    for i in range(5):
        task(i)

def asynchronous():        # note: "async" is a reserved word in Python 3.7+, so the function needs another name
    g_lst = []
    for i in range(5):
        g = gevent.spawn(task,i)
        g_lst.append(g)
    gevent.joinall(g_lst)  # equivalent to: for g in g_lst: g.join()

sync()           # synchronous
asynchronous()   # asynchronous
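
To make the difference visible, the two calls can be wrapped in a simple timer; a minimal sketch (timings are approximate):

start = time.time()
sync()            # the five tasks sleep one after another
print('sync:  %.1f s' % (time.time() - start))    # roughly 5 s

start = time.time()
asynchronous()    # the five greenlets sleep concurrently
print('async: %.1f s' % (time.time() - start))    # roughly 1 s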

 

Using coroutines in a crawler:

# Coroutines : achieve concurrency within a single thread
    #    Can avoid waiting on some IO operations
    #    While a task is executing, switch to another task as soon as IO is detected

# This has weakened the role of multithreading
# Coroutines improve CPU utilization within a single thread
# For IO-bound work, coroutines can be more efficient than multithreading, because switches happen in user space

# Crawler example
# There is IO waiting while the request is in flight
from gevent import monkey;monkey.patch_all()
import gevent
from urllib.request import urlopen    # built-in module

def get_url(url):
    response = urlopen(url)
    content = response.read().decode('utf-8')
    return len(content)

g1 = gevent.spawn(get_url,'http://www.baidu.com')
g2 = gevent.spawn(get_url,'http://www.sogou.com')
g3 = gevent.spawn(get_url,'http://www.taobao.com')
g4 = gevent.spawn(get_url,'http://www.hao123.com')
g5 = gevent.spawn(get_url,'http://www.cnblogs.com')
gevent.joinall([g1,g2,g3,g4,g5])
print(g1.value)
print(g2.value)
print(g3.value)
print(g4.value)
print(g5.value)

ret = get_url('http://www.baidu.com')   # plain synchronous call, for comparison
print(ret)
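
When crawling a longer list of URLs it is usually worth capping how many requests run at once. gevent provides gevent.pool.Pool for that; a minimal sketch (the URL list, pool size and helper name here are just examples):

from gevent import monkey;monkey.patch_all()
from gevent.pool import Pool
from urllib.request import urlopen

def page_size(url):
    return len(urlopen(url).read())

urls = ['http://www.baidu.com', 'http://www.sogou.com', 'http://www.cnblogs.com']
pool = Pool(2)                                  # at most two requests in flight at a time
for url, size in zip(urls, pool.map(page_size, urls)):
    print(url, size)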
