Coroutines are also known as micro-threads or fibers. The English name is Coroutine.
Threads are system-level: they are scheduled by the operating system. Coroutines are program-level: the program schedules them itself as needed. A thread contains many functions, which we call subroutines. While a subroutine is executing, it can be interrupted so that another subroutine runs, and that subroutine can in turn be interrupted so that execution resumes in the earlier one. This process is cooperation. In other words, a piece of code in the same thread is interrupted mid-execution, control jumps to other code, and execution later continues from the point where it was interrupted, much like a yield operation.
A coroutine has its own register context and stack. When the scheduler switches away from it, the register context and stack are saved elsewhere; when it switches back, they are restored. A coroutine therefore retains the state of its last call (that is, a specific combination of all its local state): each re-entry is equivalent to resuming the state of the last call, i.e. continuing from the point in the logical flow where it last left off. Since coroutines execute within a single thread, how can we use a multi-core CPU? The simplest approach is multiple processes plus coroutines, which makes full use of all cores while still exploiting the efficiency of coroutines, giving high performance.
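To make "re-entering the state of the last call" concrete, here is a minimal generator-based sketch (the names are illustrative) showing that local state survives between entries:

```python
def steps():
    x = 1          # local state preserved across suspensions
    yield x        # suspend here, hand 1 to the caller
    x += 10
    yield x        # resume from here on the next call, hand back 11
    x += 100
    yield x        # hand back 111

g = steps()
print(next(g))     # 1   -- runs until the first yield
print(next(g))     # 11  -- continues right after the first yield
print(next(g))     # 111
```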
Advantages of coroutines:
- No thread context-switching overhead: coroutines avoid pointless scheduling, which can improve performance (the trade-off is that the programmer takes on the responsibility of scheduling, and coroutines give up a standard thread's ability to use multiple CPUs)
- No overhead for locking and synchronizing atomic operations
- Control flow is easy to switch, which simplifies the programming model
- High concurrency + high scalability + low cost: a single CPU can easily support tens of thousands of coroutines, which makes them well suited to highly concurrent workloads
Disadvantages of coroutines:
- Cannot use multi-core resources: a coroutine is, in essence, single-threaded and cannot use multiple cores of a CPU at the same time; coroutines must be combined with processes to run on multiple CPUs. Of course, most applications we write day to day do not need this unless they are CPU-intensive
- A blocking operation, such as blocking IO, blocks the entire program (as sketched below)
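A minimal sketch of that blocking pitfall, using the asyncio syntax introduced later in this article (function names are illustrative): one blocking time.sleep() call stalls every coroutine in the program, while await asyncio.sleep() hands control back to the event loop:

```python
import asyncio
import time

async def ticker():
    for i in range(3):
        print('tick', i)
        await asyncio.sleep(0.5)   # non-blocking: other coroutines may run

async def blocker():
    time.sleep(2)                  # blocking: freezes the whole event loop,
    print('blocker done')          # so ticker cannot tick for these 2 seconds

async def main():
    await asyncio.gather(ticker(), blocker())

asyncio.run(main())
```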
Generators
To understand coroutines, you first need to know what a generator is. A generator is essentially a function that keeps producing values, except that inside the function you use the keyword yield to produce them. An ordinary function returns a value and exits after executing, but a generator function automatically suspends and can then pick up execution again where it stopped. It uses the yield keyword to suspend the function, return a value to the caller, and preserve enough of the current state that the function can continue executing later. Generators are closely tied to the iteration protocol: an iterator has a __next__() member method that either returns the next item of the iteration or raises StopIteration to end the iteration.
```python
# When a function contains yield, function_name() returns a generator
# Inside a generator, return terminates it (raising StopIteration)
# next() wakes the generator up and continues execution
# send() wakes the generator up, continues execution, and passes a value in
def create_counter(n):
    print("create_counter")
    while True:
        yield n
        print("increment n")
        n += 1

if __name__ == '__main__':
    gen = create_counter(2)
    print(next(gen))
    for item in gen:
        print('--->:{}'.format(item))
        if item > 5:
            break
```

Execution result:

```
create_counter
2
increment n
--->:3
increment n
--->:4
increment n
--->:5
increment n
--->:6
```
Python 2 coroutines
Python's support for coroutines is implemented through generators. In a generator, we can not only iterate with a for loop but also call next() repeatedly to obtain the next value returned by a yield statement. Python's yield can do more than return a value, though: it can also receive a value sent in by the caller.
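A minimal sketch of this two-way yield (the names echo and reply are illustrative):

```python
def echo():
    reply = ''
    while True:
        received = yield reply          # hand reply out, then wait for send()
        reply = 'got: {}'.format(received)

e = echo()
e.send(None)            # prime the generator: run it to the first yield
print(e.send('hi'))     # got: hi
print(e.send('bye'))    # got: bye
```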
In the traditional producer-consumer model, one thread writes messages and another thread takes them, with a lock mechanism controlling the queue and the waiting; a careless implementation can deadlock. With coroutines instead, the producer produces a message and then jumps directly to the consumer via yield; once the consumer finishes, it switches back to the producer to continue producing. This is very efficient:
A coroutine implementation of the producer-consumer model
```python
def consumer():
    r = ''
    while True:
        n = yield r            # n is the value passed in by send()
        if not n:
            return
        print('[CONSUMER] Consuming %s...' % n)
        r = '200 OK'

def produce(c):
    c.send(None)               # prime the generator: run it to the first yield
    n = 0
    while n < 5:
        n = n + 1
        print('[PRODUCER] Producing %s...' % n)
        r = c.send(n)
        print('[PRODUCER] Consumer return: %s' % r)
    c.close()

if __name__ == '__main__':
    c = consumer()
    produce(c)
```

Execution result:

```
[PRODUCER] Producing 1...
[CONSUMER] Consuming 1...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 2...
[CONSUMER] Consuming 2...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 3...
[CONSUMER] Consuming 3...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 4...
[CONSUMER] Consuming 4...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 5...
[CONSUMER] Consuming 5...
[PRODUCER] Consumer return: 200 OK
```
Note that consumer is a generator function. After passing a consumer generator into produce:
- First, c.send(None) is called to start the generator;
- Then, once produce has produced something, it switches to consumer via c.send(n);
- consumer gets the message through yield, processes it, and returns the result through another yield;
- produce receives the result of consumer's processing and continues producing the next message;
- When produce decides to stop producing, it closes the consumer via c.close(), and the whole flow ends.
The whole flow uses no locks and is executed by a single thread; produce and consumer cooperate to complete the task. That is why this is called a "coroutine" (a cooperative routine), as opposed to a thread's preemptive multitasking.
Python 3.5 asyncio
asyncio is a standard library introduced in Python 3.4 with built-in support for asynchronous IO. Using the @asyncio.coroutine decorator it provides, you can mark a generator as a coroutine and then use yield from inside one coroutine to call another, implementing asynchronous operations. To simplify this and make asynchronous IO easier to recognize, Python 3.5 introduced the new syntax async and await, which makes coroutine code more concise and readable.
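A minimal side-by-side sketch of the two styles (note that the legacy @asyncio.coroutine decorator was deprecated in Python 3.8 and removed in 3.10, so the first form only runs on older interpreters):

```python
import asyncio

@asyncio.coroutine             # legacy style: a generator marked as a coroutine
def old_style():
    yield from asyncio.sleep(1)

async def new_style():         # Python 3.5+ style: async/await syntax
    await asyncio.sleep(1)
```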
Synchronously executed coroutines
```python
import asyncio
import time

# The main function awaits the two coroutines one after the other, so the code
# still runs synchronously: the second await starts only after the first one
# finishes, just as with ordinary function calls
async def say_after(delay, what):
    await asyncio.sleep(delay)
    print('--->:{}'.format(what))

async def main():
    print(f"started at {time.strftime('%X')}")
    await say_after(1, 'hello')
    await say_after(2, 'world')
    print(f"finished at {time.strftime('%X')}")

if __name__ == '__main__':
    asyncio.run(main())
```

Execution result:

```
2019-12-15 18:03:52,719 - asyncio - DEBUG - Using proactor: IocpProactor
started at 18:03:52
--->:hello
--->:world
finished at 18:03:55
```
Concurrently executed coroutines
```python
import asyncio
import time

# Unlike the previous example, the coroutines here are started as tasks.
# A task is an awaitable object, so the two can run concurrently; this
# example finishes one second sooner than the previous one
async def say_after(delay, what):
    # asyncio.sleep() simulates the wait because time.sleep() is not an
    # awaitable: replacing it with time.sleep() would block the event loop
    # instead of yielding, and the expected concurrency would disappear
    await asyncio.sleep(delay)
    print('--->:{}'.format(what))

async def main():
    print(f"started at {time.strftime('%X')}")
    # Create the coroutine tasks
    task1 = asyncio.create_task(say_after(1, 'hello'))
    task2 = asyncio.create_task(say_after(2, 'world'))
    # Although the tasks run concurrently, the program waits for the most
    # time-consuming one before exiting, so this takes 2 seconds in total
    await task1
    await task2
    print(f"finished at {time.strftime('%X')}")

if __name__ == '__main__':
    asyncio.run(main())
```

Execution result:

```
2019-12-15 18:18:50,374 - asyncio - DEBUG - Using proactor: IocpProactor
started at 18:18:50
--->:hello
--->:world
finished at 18:18:52
```
Common usage
Coroutines on a single core
Here, tasks = [asyncio.create_task(test(1)) for proxy in range(10000)] creates the tasks, and [await t for t in tasks] awaits them all, putting them through the execution queue. The 10,000 tasks complete in roughly 1.1 to 1.3 seconds:
```python
import asyncio
import time

async def test(delay):
    await asyncio.sleep(delay)

async def main():
    start_time = time.time()
    tasks = [asyncio.create_task(test(1)) for proxy in range(10000)]
    [await t for t in tasks]
    print(time.time() - start_time)

if __name__ == "__main__":
    asyncio.run(main())
```

Execution result:

```
2019-12-15 18:15:17,668 - asyncio - DEBUG - Using proactor: IocpProactor
1.1492626667022705
```
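As an aside, the same pattern is commonly written with asyncio.gather, which wraps each coroutine in a task and waits for all of them together; a minimal sketch under the same assumptions as the block above:

```python
import asyncio
import time

async def test(delay):
    await asyncio.sleep(delay)

async def main():
    start_time = time.time()
    # gather schedules all 10,000 coroutines and waits for them together
    await asyncio.gather(*(test(1) for _ in range(10000)))
    print(time.time() - start_time)

if __name__ == "__main__":
    asyncio.run(main())
```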
Coroutines on multiple cores
For multi-core use, the natural approach is multiple processes plus coroutines: each process runs its own event loop. Here, 4 worker processes each run 2,500 tasks, 10,000 in total:
```python
from multiprocessing import Pool
import asyncio
import time

async def test(delay):
    await asyncio.sleep(delay)

async def main(num):
    start_time = time.time()
    tasks = [asyncio.create_task(test(1)) for proxy in range(num)]
    [await t for t in tasks]
    print(time.time() - start_time)

def run(num):
    asyncio.run(main(num))

if __name__ == "__main__":
    p = Pool()
    for i in range(4):
        p.apply_async(run, args=(2500,))
    p.close()
    p.join()
```

Execution result:

```
1.0610532760620117
1.065042495727539
1.0760507583618164
1.077049970626831
```