Python core

Keywords: Python, programming, network, REST

Python coroutines

Basics

Coroutines are one way to implement concurrent programming.

example

  • A simple crawler
import time

def crawl_page(url):
    print('crawling {}'.format(url))
    sleep_time = int(url.split('_')[-1])
    time.sleep(sleep_time)
    print('OK {}'.format(url))

def main(urls):
    for url in urls:
        crawl_page(url)

%time main(['url_1', 'url_2', 'url_3', 'url_4'])

########## output ##########

crawling url_1
OK url_1
crawling url_2
OK url_2
crawling url_3
OK url_3
crawling url_4
OK url_4
Wall time: 10 s

This is a simple crawler. When the main() function executes, it calls the crawl_page() function for each URL, which performs the network communication, waits several seconds for the result, and then moves on to the next URL.

  • The coroutine version
import asyncio

async def crawl_page(url):
    print('crawling {}'.format(url))
    sleep_time = int(url.split('_')[-1])
    await asyncio.sleep(sleep_time)
    print('OK {}'.format(url))

async def main(urls):
    for url in urls:
        await crawl_page(url)

%time asyncio.run(main(['url_1', 'url_2', 'url_3', 'url_4']))

########## output ##########

crawling url_1
OK url_1
crawling url_2
OK url_2
crawling url_3
OK url_3
crawling url_4
OK url_4
Wall time: 10 s

First, import asyncio, which contains most of the tools needed to work with coroutines.

The async modifier declares an asynchronous function; here both crawl_page and main become asynchronous functions. Calling an asynchronous function returns a coroutine object.

For example, print(crawl_page('')) outputs <coroutine object crawl_page at 0x...>, which tells you this is a Python coroutine object rather than the result of actually executing the function.
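A minimal sketch making that distinction visible (the return value here is simplified from the crawler above):

```python
import asyncio

async def crawl_page(url):
    return 'OK {}'.format(url)

# Calling an async function does NOT run it; it only creates a coroutine object.
coro = crawl_page('url_1')
print(type(coro).__name__)   # prints: coroutine

# asyncio.run() actually drives the coroutine to completion.
print(asyncio.run(coro))     # prints: OK url_1
```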

Executing coroutines

First, a coroutine can be run with await. The execution effect of await is the same as an ordinary Python function call: the program blocks here, enters the awaited coroutine, and continues only after it finishes, which is the literal meaning of await.

In the code, await asyncio.sleep(sleep_time) pauses here for a few seconds, and await crawl_page(url) executes the crawl_page() function.

Second, a task can be created with asyncio.create_task().

Finally, we need asyncio.run to trigger execution. The asyncio.run function was introduced in Python 3.7.

The result is still 10 seconds, because await is a synchronous call: the next crawl_page(url) will not start until the current one finishes.

An important coroutine concept: tasks

import asyncio

async def crawl_page(url):
    print('crawling {}'.format(url))
    sleep_time = int(url.split('_')[-1])
    await asyncio.sleep(sleep_time)
    print('OK {}'.format(url))

async def main(urls):
    tasks = [asyncio.create_task(crawl_page(url)) for url in urls]
    for task in tasks:
        await task

%time asyncio.run(main(['url_1', 'url_2', 'url_3', 'url_4']))

########## output ##########

crawling url_1
crawling url_2
crawling url_3
crawling url_4
OK url_1
OK url_2
OK url_3
OK url_4
Wall time: 3.99 s

Once you have a coroutine object, you can create a task from it with asyncio.create_task.

After a task is created, it is scheduled to run very soon, so the code does not block here waiting for it.
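A small sketch of that behavior (side_job and the order list are illustrative names, not from the crawler): a created task starts running at the next await point, even if it is never awaited explicitly.

```python
import asyncio

order = []

async def side_job():
    order.append('side_job running')

async def main():
    asyncio.create_task(side_job())   # scheduled, but not running yet
    order.append('task created')
    await asyncio.sleep(0)            # yield to the event loop; side_job runs here
    order.append('back in main')

asyncio.run(main())
print(order)  # → ['task created', 'side_job running', 'back in main']
```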

Another way to run tasks

import asyncio

async def crawl_page(url):
    print('crawling {}'.format(url))
    sleep_time = int(url.split('_')[-1])
    await asyncio.sleep(sleep_time)
    print('OK {}'.format(url))

async def main(urls):
    tasks = [asyncio.create_task(crawl_page(url)) for url in urls]
    await asyncio.gather(*tasks)

%time asyncio.run(main(['url_1', 'url_2', 'url_3', 'url_4']))

########## output ##########

crawling url_1
crawling url_2
crawling url_3
crawling url_4
OK url_1
OK url_2
OK url_3
OK url_4
Wall time: 4.01 s

*tasks unpacks the list into positional arguments; correspondingly, **dict unpacks a dictionary into keyword arguments.
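For instance (a toy function, not part of the crawler):

```python
def add(a, b, c):
    return a + b + c

args = [1, 2, 3]
kwargs = {'a': 4, 'b': 5, 'c': 6}

print(add(*args))     # → 6   (same as add(1, 2, 3))
print(add(**kwargs))  # → 15  (same as add(a=4, b=5, c=6))
```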

Note: asyncio.create_task and asyncio.run are only available in Python 3.7 and above.
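On Python 3.6 and earlier you would drive the event loop manually instead. A rough sketch of the equivalent pattern (asyncio.ensure_future is the pre-3.7 counterpart of create_task; the sleep here is shortened to keep the example fast):

```python
import asyncio

async def crawl_page(url):
    await asyncio.sleep(0)
    return 'OK {}'.format(url)

async def main(urls):
    # pre-3.7 counterpart of asyncio.create_task
    tasks = [asyncio.ensure_future(crawl_page(url)) for url in urls]
    return await asyncio.gather(*tasks)

# pre-3.7 style: create and run the loop explicitly instead of asyncio.run()
loop = asyncio.new_event_loop()
results = loop.run_until_complete(main(['url_1', 'url_2']))
loop.close()
print(results)  # → ['OK url_1', 'OK url_2']
```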

Demystifying the coroutine runtime

import asyncio

async def worker_1():
    print('worker_1 start')
    await asyncio.sleep(1)
    print('worker_1 done')

async def worker_2():
    print('worker_2 start')
    await asyncio.sleep(2)
    print('worker_2 done')

async def main():
    print('before await')
    await worker_1()
    print('awaited worker_1')
    await worker_2()
    print('awaited worker_2')

%time asyncio.run(main())

########## output ##########

before await
worker_1 start
worker_1 done
awaited worker_1
worker_2 start
worker_2 done
awaited worker_2
Wall time: 3 s

import asyncio

async def worker_1():
    print('worker_1 start')
    await asyncio.sleep(1)
    print('worker_1 done')

async def worker_2():
    print('worker_2 start')
    await asyncio.sleep(2)
    print('worker_2 done')

async def main():
    task1 = asyncio.create_task(worker_1())
    task2 = asyncio.create_task(worker_2())
    print('before await')
    await task1
    print('awaited worker_1')
    await task2
    print('awaited worker_2')

%time asyncio.run(main())

########## output ##########

before await
worker_1 start
worker_2 start
worker_1 done
awaited worker_1
worker_2 done
awaited worker_2
Wall time: 2.01 s

Step-by-step analysis

1. asyncio.run(main()) starts the program: it enters the main() function and opens the event loop;

2. task1 and task2 are created and enter the event loop, waiting to run; execution reaches print and outputs 'before await';

3. await task1 is executed; the main task yields here, and the event loop starts scheduling worker_1;

4. worker_1 starts running, prints 'worker_1 start', runs to await asyncio.sleep(1), and yields; the event loop starts scheduling worker_2;

5. worker_2 starts running, prints 'worker_2 start', runs to await asyncio.sleep(2), and yields;

6. Everything above takes between 1 ms and 10 ms, or even less; the event loop then pauses scheduling until a sleep completes;

7. After one second, worker_1's sleep completes, and the event loop hands control to task1, which prints 'worker_1 done'; task1 finishes and exits the event loop;

8. await task1 completes, so the event loop returns control to the main task, which prints 'awaited worker_1' and then continues waiting at await task2;

9. After two seconds, worker_2's sleep completes, and the event loop hands control to task2, which prints 'worker_2 done'; task2 finishes and exits the event loop;

10. The main task prints 'awaited worker_2'; the whole coroutine run ends, and the event loop closes.

Example

  • You want to limit the running time of some coroutine tasks and cancel them once they time out.
import asyncio

async def worker_1():
    await asyncio.sleep(1)
    return 1

async def worker_2():
    await asyncio.sleep(2)
    return 2 / 0

async def worker_3():
    await asyncio.sleep(3)
    return 3

async def main():
    task_1 = asyncio.create_task(worker_1())
    task_2 = asyncio.create_task(worker_2())
    task_3 = asyncio.create_task(worker_3())

    await asyncio.sleep(2)
    task_3.cancel()

    res = await asyncio.gather(task_1, task_2, task_3, return_exceptions=True)
    print(res)

%time asyncio.run(main())

########## output ##########

[1, ZeroDivisionError('division by zero'), CancelledError()]
Wall time: 2 s

worker_1 runs normally, worker_2 raises an error while running, and worker_3 is cancelled because its execution takes too long; all of this is reflected in the final result res.

Note return_exceptions=True. Without this parameter, the first error would be thrown all the way up to the calling layer, so you would need a try/except to catch it, and all other unfinished tasks would be cancelled.

To avoid that, just set return_exceptions to True.
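If all you need is a per-task deadline, asyncio.wait_for wraps a coroutine and cancels it automatically on timeout; a minimal sketch (slow_worker is a made-up stand-in for a long-running task):

```python
import asyncio

async def slow_worker():
    await asyncio.sleep(3)
    return 'done'

async def main():
    try:
        # cancel slow_worker automatically if it exceeds the 1-second deadline
        return await asyncio.wait_for(slow_worker(), timeout=1)
    except asyncio.TimeoutError:
        return 'timed out'

print(asyncio.run(main()))  # → timed out
```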

Summary

  • There are two main differences between coroutines and multithreading: first, coroutines run in a single thread; second, the user decides where to give up control and switch to the next task.

  • The coroutine style is more concise and clear

  • When writing coroutine programs, keep a clear mental model of the event loop: know when the program needs to pause and wait for I/O, and when it needs to run many tasks together.

Posted by jerastraub on Mon, 22 Jun 2020 22:14:24 -0700