Python multitasking: coroutines
Keywords: Python, less, network, IPython
Preface
The main goal of this post is to show how to use coroutines; if that is all you need, knowing how to use them is enough. To understand how coroutines are implemented, you also need to understand iterators and then generators.
If you only care about using coroutines, read the first part. If you want to understand them, read the post in order, or read the sections in iterator, generator, coroutine order.
Coroutines
- A generator (a function that uses yield) is a special kind of iterator.
- greenlet offers the same kind of task switching as yield, packaged as explicit switch() calls.
- gevent builds on greenlet.
- When a gevent task reaches a blocking operation, such as waiting for a server response or sleeping, gevent switches to another task.
These concepts are explained in the sections below.
greenlet implements multitasking
To use greenlet, install it first:
pip3 install greenlet
greenlet implements multitasking code
from greenlet import greenlet
import time
def task1():
    while 1:
        print("---1---")
        gr2.switch()  # hand control over to task2
        time.sleep(1)
def task2():
    while 1:
        print("---2---")
        gr1.switch()  # hand control back to task1
        time.sleep(1)
gr1 = greenlet(task1)
gr2 = greenlet(task2)
# Switch to gr1 to start executing
gr1.switch()
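Note, however, that all of this runs in a single thread. Also, in my tests the last few lines could not simply be moved into a main() function: task1 and task2 look up gr1 and gr2 as global names, so creating them as local variables inside main() raises an error.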
gevent implements multitasking
As you can see, greenlet already gives us coroutines, but we have to switch tasks by hand, which quickly becomes tedious. gevent is built on top of greenlet and switches tasks for us automatically.
To use gevent, install it first:
pip3 install gevent
gevent implements multitasking code
import time
import gevent
def test1(n):
    for i in range(n):
        print("---test1---", gevent.getcurrent(), i)
        # time.sleep(0.5)  # plain time.sleep() blocks here and does not trigger a task switch
        gevent.sleep(0.5)
def test2(n):
    for i in range(n):
        print("---test2---", gevent.getcurrent(), i)
        # time.sleep(0.5)  # plain time.sleep() blocks here and does not trigger a task switch
        gevent.sleep(0.5)
def test3(n):
    for i in range(n):
        print("---test3---", gevent.getcurrent(), i)
        # time.sleep(0.5)  # plain time.sleep() blocks here and does not trigger a task switch
        gevent.sleep(0.5)
g1 = gevent.spawn(test1, 5)
g2 = gevent.spawn(test2, 5)
g3 = gevent.spawn(test3, 5)
g1.join()
g2.join()
g3.join()
Output:
---test1--- <Greenlet at 0x1e9e64c2598: test1(5)> 0
---test2--- <Greenlet at 0x1e9e64c26a8: test2(5)> 0
---test3--- <Greenlet at 0x1e9e64c27b8: test3(5)> 0
---test1--- <Greenlet at 0x1e9e64c2598: test1(5)> 1
---test2--- <Greenlet at 0x1e9e64c26a8: test2(5)> 1
---test3--- <Greenlet at 0x1e9e64c27b8: test3(5)> 1
---test1--- <Greenlet at 0x1e9e64c2598: test1(5)> 2
---test2--- <Greenlet at 0x1e9e64c26a8: test2(5)> 2
---test3--- <Greenlet at 0x1e9e64c27b8: test3(5)> 2
---test1--- <Greenlet at 0x1e9e64c2598: test1(5)> 3
---test2--- <Greenlet at 0x1e9e64c26a8: test2(5)> 3
---test3--- <Greenlet at 0x1e9e64c27b8: test3(5)> 3
---test1--- <Greenlet at 0x1e9e64c2598: test1(5)> 4
---test2--- <Greenlet at 0x1e9e64c26a8: test2(5)> 4
---test3--- <Greenlet at 0x1e9e64c27b8: test3(5)> 4
g1.join() waits for g1 to finish. Creating a greenlet with gevent.spawn() does not run it right away; it only starts running once the main program blocks, for example in join(), which gives gevent the chance to switch to it.
Note that without patching, any sleep inside a gevent task must be gevent.sleep(), not time.sleep().
There is a pitfall here: if we carelessly call g1.join() three times instead of joining g1, g2 and g3, the output is almost the same as in the correct version. While the main program waits for g1, gevent runs every spawned task, and here all three take about the same time, so the mistake goes unnoticed.
Faulty version and its output
g1 = gevent.spawn(test1, 5)
g2 = gevent.spawn(test2, 5)
g3 = gevent.spawn(test3, 5)
g1.join()
g1.join()
g1.join()
---test1--- <Greenlet at 0x17d8ef12598: test1(5)> 0
---test2--- <Greenlet at 0x17d8ef126a8: test2(5)> 0
---test3--- <Greenlet at 0x17d8ef127b8: test3(5)> 0
---test1--- <Greenlet at 0x17d8ef12598: test1(5)> 1
---test2--- <Greenlet at 0x17d8ef126a8: test2(5)> 1
---test3--- <Greenlet at 0x17d8ef127b8: test3(5)> 1
---test1--- <Greenlet at 0x17d8ef12598: test1(5)> 2
---test2--- <Greenlet at 0x17d8ef126a8: test2(5)> 2
---test3--- <Greenlet at 0x17d8ef127b8: test3(5)> 2
---test1--- <Greenlet at 0x17d8ef12598: test1(5)> 3
---test2--- <Greenlet at 0x17d8ef126a8: test2(5)> 3
---test3--- <Greenlet at 0x17d8ef127b8: test3(5)> 3
---test1--- <Greenlet at 0x17d8ef12598: test1(5)> 4
---test2--- <Greenlet at 0x17d8ef126a8: test2(5)> 4
---test3--- <Greenlet at 0x17d8ef127b8: test3(5)> 4
The core idea of coroutines is to use the time one task spends waiting to run other tasks.
Patch gevent
With plain gevent, every blocking operation, such as waiting for network resources or time.sleep(), has to be replaced by its gevent counterpart, such as gevent.sleep(). What if we want to keep the code as originally written and still use gevent? This is what patching is for. Add the following two lines to the program, before the code that performs the blocking calls:
from gevent import monkey
monkey.patch_all()
Coroutines with patching
import time
import gevent
from gevent import monkey
monkey.patch_all()
def test1(n):
    for i in range(n):
        print("---test1---", gevent.getcurrent(), i)
        time.sleep(0.5)  # after patching, equivalent to gevent.sleep(0.5)
def test2(n):
    for i in range(n):
        print("---test2---", gevent.getcurrent(), i)
        time.sleep(0.5)
def test3(n):
    for i in range(n):
        print("---test3---", gevent.getcurrent(), i)
        time.sleep(0.5)
g1 = gevent.spawn(test1, 5)
g2 = gevent.spawn(test2, 5)
g3 = gevent.spawn(test3, 5)
g1.join()
g2.join()
g3.join()
After patching, blocking calls such as time.sleep(1) behave like gevent.sleep(1).
Use of gevent.joinall()
If we have many functions to run, do we really have to spawn each one and then call join() on each? gevent provides a simpler way, gevent.joinall():
import time
import gevent
from gevent import monkey
monkey.patch_all()
def test1(n):
    for i in range(n):
        print("---test1---", gevent.getcurrent(), i)
        time.sleep(0.5)  # after patching, equivalent to gevent.sleep(0.5)
def test2(n):
    for i in range(n):
        print("---test2---", gevent.getcurrent(), i)
        time.sleep(0.5)
def test3(n):
    for i in range(n):
        print("---test3---", gevent.getcurrent(), i)
        time.sleep(0.5)
gevent.joinall([
    gevent.spawn(test1, 5),  # first argument: the function; the rest: its arguments
    gevent.spawn(test2, 5),
    gevent.spawn(test3, 5),
])
A small coroutine example: image downloader
import urllib.request
import gevent
from gevent import monkey
monkey.patch_all()
def img_download(img_name, img_url):
    req = urllib.request.urlopen(img_url)
    data = req.read()
    # The images/ directory must already exist
    with open("images/"+img_name, "wb") as f:
        f.write(data)
def main():
    gevent.joinall([
        gevent.spawn(img_download, "1.jpg", "https://rpic.douyucdn.cn/live-cover/appCovers/2019/05/13/6940298_20190513113912_small.jpg"),
        gevent.spawn(img_download, "2.jpg", "https://rpic.douyucdn.cn/asrpic/190513/2077143_6233919_0d516_2_1818.jpg"),
        gevent.spawn(img_download, "3.jpg", "https://rpic.douyucdn.cn/live-cover/appCovers/2018/11/24/1771605_20181124143723_small.jpg")
    ])
if __name__ == "__main__":
    main()
Processes, threads, and coroutines compared
Differences
- A process is the unit of resource allocation.
- A thread is the unit of operating-system scheduling.
- Switching between processes needs the most resources and is the least efficient.
- Switching between threads needs a moderate amount of resources and is moderately efficient (leaving the GIL aside, of course).
- Switching between coroutines needs very few resources and is efficient.
- Multiple processes and multiple threads may run in parallel, depending on the number of CPU cores; coroutines run concurrently within one thread.
- Multiprocessing consumes the most resources.
- When Python 3 runs a .py file, it starts a process. The process has a default thread, the main thread, and the main thread executes the code. In other words, the process is the unit of resource allocation, while the thread is what actually executes and what the operating system actually schedules.
- Two threads inside one process is multithreading, which is one form of multitasking. Another form is multiple processes, each with its own threads.
- One characteristic of coroutines is that, while one task waits for a resource to arrive, the same thread can be used to run other tasks.
- Leaving the GIL aside, prefer coroutines first, then threads, then processes.
- Processes are the most stable, since a failure in one process does not affect the others, but they consume the most resources; threads need fewer resources than processes when switching tasks; coroutines use a thread's waiting time to do other work.
iterator
Iteration is a way of accessing the elements of a collection. An iterator is an object that remembers its position in the traversal. It starts at the first element of the collection and stops once every element has been visited. Iterators can only move forward, never backward.
- To understand how coroutines work, you first need to understand generators.
- To understand generators, you first need to understand iterators.
I recommend a blog post I read earlier: "A thorough understanding of the concepts of Python Iterable, Iterator and Generator". It is not otherwise related to this article.
Before we learn about iterators, let's get to know two terms:
Iterable: an object that can be iterated over
Iterator: the object that performs the iteration
Iterable
Introducing iteration with the for loop
In [1]: for i in [11,22,33]:
   ...:     print(i)
11
22
33
In [2]: for i in "hhh":
   ...:     print(i)
h
h
h
In [3]: for i in 10:
   ...:     print(i)
   ...:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-309758a01ba4> in <module>()
----> 1 for i in 10:
      2     print(i)
      3
TypeError: 'int' object is not iterable  # an int object cannot be iterated over
A for loop only works when the object after in is iterable. Tuples, lists and strings are iterable; numbers such as integers and floats are not.
Checking whether an object is iterable
- To check whether an object is iterable, test whether its type is a subclass of Iterable, that is, whether the object is an instance of Iterable.
- isinstance can be used to determine whether an object is an instance of a given class.
- For example, isinstance(a, A) tells us whether a is an instance of class A. With Iterable as the class, a return value of True means the object can be iterated over.
Check whether a list is iterable:
from collections.abc import Iterable  # "from collections import Iterable" no longer works on Python 3.10+
isinstance([11,22,33], Iterable)
True
Using isinstance to check whether other data types are iterable
In [6]: from collections.abc import Iterable
In [7]: isinstance([11,22], Iterable)
Out[7]: True
In [8]: isinstance((11,22), Iterable)
Out[8]: True
In [9]: isinstance(10, Iterable)
Out[9]: False
Tuples, lists and strings are iterable; integers and floats are not.
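As a quick way to repeat the check for several types at once, here is a minimal sketch. The names samples, results and is_iterable are made up for this illustration. One caveat that is easy to miss: isinstance(obj, Iterable) only detects objects that define __iter__, so calling iter(obj) and catching TypeError is the more general test.

```python
from collections.abc import Iterable

# Hypothetical sample values, chosen only for illustration
samples = [[11, 22], (11, 22), "hhh", {"a": 1}, 10, 3.14]

# Map the repr of each sample to the result of the isinstance check
results = {repr(s): isinstance(s, Iterable) for s in samples}
for text, ok in results.items():
    print(text, "->", ok)


def is_iterable(obj):
    """Hypothetical helper: try iter() instead of checking the class."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(is_iterable([11, 22]), is_iterable(10))
```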
An object that can hand us its data one item at a time through a statement like for...in... is called an iterable (Iterable).
Can a for loop be used with a class you define yourself?
Let's create a class of our own and try to traverse it with a for loop.
Not iterable yet
class Classmate(object):
    """docstring for Classmate"""
    def __init__(self):
        self.names = list()
    def add(self, name):
        self.names.append(name)
classmate = Classmate()
classmate.add("Zhang San")
classmate.add("Li Si")
classmate.add("Wang Wu")
for name in classmate:
    print(name)
# TypeError: 'Classmate' object is not iterable
The essence of iterable objects
If we analyze what happens when we iterate over an iterable, we find that each step of the iteration (each pass of for...in...) returns the next item in the object, moving forward until all the data has been consumed. So something has to keep track of which item was reached, so that each step can return the next one. We call this helper, which does the iterating for us, an iterator (Iterator).
The essence of an iterable is that it can provide us with such a helper, an iterator, to traverse it.
An iterable provides its iterator through the __iter__ method. When we iterate over an iterable, we first obtain an iterator from it, and then use that iterator to fetch each item in turn.
In other words, an object with an __iter__ method is an iterable.
If the above is not clear yet, don't worry. For now you only need to know that to make a class of your own iterable, you define an __iter__ method in it.
Adding an __iter__ method
class Classmate(object):
    """docstring for Classmate"""
    def __init__(self):
        self.names = list()
    def add(self, name):
        self.names.append(name)
    def __iter__(self):
        pass
classmate = Classmate()
classmate.add("Zhang San")
classmate.add("Li Si")
classmate.add("Wang Wu")
for name in classmate:
    print(name)
# TypeError: iter() returned non-iterator of type 'NoneType'
# (__iter__ returned None, which is not an iterator)
Note that classmate is now an iterable, which can be verified with isinstance(classmate, Iterable).
If the __iter__() method is commented out, it is no longer an iterable, which confirms that the first step in making an object iterable is to add an __iter__() method.
Iterables and iterators
- An object that has an __iter__ method is called an iterable.
- If the __iter__ method of an object returns another object, and that returned object has both __iter__ and __next__ methods, then the returned object is called an iterator.
- Given an iterator, the for loop obtains its values through the iterator's __next__ method; every pass of the loop calls __next__ once.
- iter(xxxobj) automatically calls the __iter__ method of xxxobj, which returns an iterator.
- next(iterator) automatically calls the __next__ method of the iterator.
- An iterable is not necessarily an iterator.
- An iterator is always an iterable.
- (Iterable: has __iter__; iterator: has both __iter__ and __next__.)
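The distinction in the list above can be checked directly with a built-in list. This is a small sketch; the variable names are chosen just for this illustration.

```python
from collections.abc import Iterable, Iterator

nums = [11, 22, 33]
it = iter(nums)  # calls nums.__iter__() and returns a list iterator

list_is_iterable = isinstance(nums, Iterable)  # a list has __iter__
list_is_iterator = isinstance(nums, Iterator)  # but it has no __next__
it_is_iterator = isinstance(it, Iterator)      # the iterator has both methods
it_is_iterable = isinstance(it, Iterable)      # so it is also an iterable
returns_self = iter(it) is it                  # an iterator's __iter__ returns itself

print(list_is_iterable, list_is_iterator, it_is_iterator, it_is_iterable, returns_self)
```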
How a for loop runs
Take the following code as an example
for i in classmate:
Process:
- 1. Check whether classmate is iterable, i.e. whether it has an __iter__ method.
- 2. If it is, call iter(classmate), which calls the __iter__ method of the Classmate class and returns an iterator.
- 3. Every pass of the for loop calls the __next__ method of that iterator once and assigns whatever __next__ returns to i.
Steps to make a custom class work with the for loop
- 1. Add an __iter__ method to the class.
- 2. Have __iter__ return an object that has both __iter__ and __next__ methods.
- 3. In the class that has __iter__ and __next__, make __next__ return the next value.
The essence of the for...in... loop
for item in Iterable
This loop first obtains an iterator from Iterable with the iter() function, then repeatedly calls next() on it to get the next value and assigns that value to item. When a StopIteration exception is raised, the loop ends.
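To make those steps concrete, here is a rough hand-written equivalent of a for loop. The function name manual_for is made up for this sketch, and it collects the items in a list instead of running an arbitrary loop body.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next value
        except StopIteration:      # step 3: the iterator is exhausted, leave the loop
            break
        collected.append(item)     # stands in for the loop body
    return collected


print(manual_for([11, 22, 33]))
print(manual_for("hhh"))
```

Passing a non-iterable such as 10 fails in step 1 with the same TypeError that the for loop reports.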
Completing the custom iterator
An object that implements both the __iter__ method and the __next__ method is an iterator.
Now let the iterator return all of the data, one item per call:
import time
from collections.abc import Iterable, Iterator
class Classmate(object):
    def __init__(self):
        self.names = list()
    def add(self, name):
        self.names.append(name)
    def __iter__(self):
        # Return a fresh iterator over this object's names
        return ClassmateIterable(self)
class ClassmateIterable(object):
    def __init__(self, obj):
        self.obj = obj
        self.num = 0  # index of the next name to return
    def __iter__(self):
        # An iterator returns itself from __iter__
        return self
    def __next__(self):
        try:
            ret = self.obj.names[self.num]
            self.num += 1
            return ret
        except IndexError:
            raise StopIteration
def main():
    classmate = Classmate()
    classmate.add("Zhang San")
    classmate.add("Li Si")
    classmate.add("Wang Wu")
    print("Is classmate iterable?", isinstance(classmate, Iterable))
    classmate_iterator = iter(classmate)
    print("Is classmate_iterator an iterator?", isinstance(classmate_iterator, Iterator))
    # Call __next__ once
    print("classmate_iterator's next:", next(classmate_iterator))
    for i in classmate:
        print(i)
        time.sleep(1)
if __name__ == '__main__':
    main()
As you can see, a custom class can now be used in a for loop. But in this code we had to define an extra class just to return an iterator, which is cumbersome. We can simplify this: have __iter__ return the object itself instead of an instance of another class, and define the __next__ method in our own class. The simplified version is as follows:
Simplified iterator
import time
from collections.abc import Iterable, Iterator
class Classmate(object):
    def __init__(self):
        self.names = list()
        self.num = 0  # index of the next name to return
    def add(self, name):
        self.names.append(name)
    def __iter__(self):
        return self
    def __next__(self):
        try:
            ret = self.names[self.num]
            self.num += 1
            return ret
        except IndexError:
            raise StopIteration
def main():
    classmate = Classmate()
    classmate.add("Zhang San")
    classmate.add("Li Si")
    classmate.add("Wang Wu")
    for i in classmate:
        print(i)
        time.sleep(1)
if __name__ == '__main__':
    main()
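One trade-off of the simplified design is worth seeing in code: an object that is its own iterator keeps a single position, so it can only be traversed once unless that position is reset. The sketch below uses two made-up classes, OneShot and Reusable, to contrast the behaviors; they are illustrations, not part of the code above.

```python
class OneShot:
    """Iterable that is its own iterator, like the simplified Classmate."""

    def __init__(self, names):
        self.names = list(names)
        self.num = 0  # single shared position, never reset

    def __iter__(self):
        return self

    def __next__(self):
        if self.num >= len(self.names):
            raise StopIteration
        ret = self.names[self.num]
        self.num += 1
        return ret


class Reusable:
    """Iterable that hands out a fresh iterator on every iter() call."""

    def __init__(self, names):
        self.names = list(names)

    def __iter__(self):
        return iter(self.names)


one_shot = OneShot(["Zhang San", "Li Si"])
first_pass = list(one_shot)
second_pass = list(one_shot)  # empty: the position is already at the end

reusable = Reusable(["Zhang San", "Li Si"])
reusable_first = list(reusable)
reusable_second = list(reusable)  # a new iterator starts from the beginning

print(first_pass, second_pass)
print(reusable_first, reusable_second)
```

If an object needs to be looped over more than once, the two-class version (or returning a fresh iterator from __iter__) is the safer choice.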
Application of iterators
The role of iterators
- Without iterators, all of the data is generated and stored before anything is done with it, which can take up a lot of memory.
- With an iterator, we hold on to the way the data is generated and produce each value only when it is needed.
- For example, building all 10 values of range(10) up front is harmless, but what about range(1000000000000)?
- In Python 2, range(10) builds a list of 10 values, while xrange(10) stores only the way to produce those 10 values.
- range() in Python 3 behaves like xrange() in Python 2, and Python 3 has no xrange().
- An iterator stores how to generate the data, not the generated data itself.
range in Python 3:
>>> range(10)
range(0, 10)
>>> ret = range(10)
>>> next(ret)
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
next(ret)
TypeError: 'range' object is not an iterator
>>> for i in range(10):
        print(i)
0
1
2
3
...
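The session above shows that a range object is iterable but is not itself an iterator; iter() has to be called on it first. The following sketch also compares memory use. It relies on sys.getsizeof, whose exact numbers are specific to the interpreter and platform, so only the comparison matters here.

```python
import sys

big = range(1_000_000_000_000)   # stores only start, stop and step
small_list = list(range(1000))   # stores every one of its 1000 values

size_range = sys.getsizeof(big)
size_list = sys.getsizeof(small_list)
print(size_range, size_list)

# range is an iterable: ask it for an iterator before calling next()
it = iter(big)
first_three = [next(it), next(it), next(it)]
print(first_three)
```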
Normal Implementation of Fibonacci Sequence
nums = []
a = 0
b = 1
i = 0
while i < 10:
    nums.append(a)
    a, b = b, a+b
    i += 1
for i in nums:
    print(i)
Using Iterator to Realize Fibonacci Sequence
class Fibonacci(object):
    def __init__(self, times):
        self.times = times
        self.a = 0
        self.b = 1
        self.current_num = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.current_num < self.times:
            ret = self.a
            self.a, self.b = self.b, self.a+self.b
            self.current_num += 1
            return ret
        else:
            raise StopIteration
fibo = Fibonacci(10)
for i in fibo:
    print(i)
Each value is generated only at the moment it is asked for.
Other places iterators are used: type conversions such as list() and tuple()
When we use list() or tuple() for type conversion, iterators are used as well.
a = (11,22,33)
b = list(a)
When list() converts a tuple into a list, it relies on the iterator protocol. Conceptually, it starts with an empty list, obtains an iterator for the tuple, and uses __next__ to take the first value and append it to the list. It keeps taking values and appending them in turn until the tuple has no more values and the iterator raises StopIteration.
Converting a list into a tuple works the same way.
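One way to watch this happen is to pass list() a custom iterator that counts how often its __next__ method is called. The Countdown class below is a made-up example for this purpose.

```python
class Countdown:
    """Hypothetical iterator that counts down and records __next__ calls."""

    def __init__(self, start):
        self.current = start
        self.next_calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.next_calls += 1
        if self.current <= 0:
            raise StopIteration  # tells list()/tuple() to stop
        self.current -= 1
        return self.current + 1


counter = Countdown(3)
as_list = list(counter)         # pulls values until StopIteration
calls = counter.next_calls      # three values plus the final StopIteration
as_tuple = tuple(Countdown(3))  # tuple() consumes an iterator the same way

print(as_list, calls, as_tuple)
```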
generator
Iterator: a way to save memory by knowing how to produce the data later instead of storing it.
Generator: a special kind of iterator.
Ways to create a generator:
- 1. Replace the square brackets of a list comprehension with parentheses.
- 2. Use yield in a function.
Creating a generator, method 1
In [15]: L = [ x*2 for x in range(5)]
In [16]: L
Out[16]: [0, 2, 4, 6, 8]
In [17]: G = ( x*2 for x in range(5))
In [18]: G
Out[18]: <generator object <genexpr> at 0x7f626c132db0>
In [19]: next(G)
Out[19]: 0
In [20]: next(G)
Out[20]: 2
Creating a generator, method 2
Generator using yield
def Fibonacci(n):
    a, b = 0, 1
    count_num = 0
    while count_num < n:
        # A function that contains a yield statement is no longer an ordinary function but a generator template
        yield a
        a, b = b, a+b
        count_num += 1
# Because the function contains yield, calling it does not run the body; it creates a generator object.
fb = Fibonacci(5)
print("Use a for loop to traverse all the numbers in the generator".center(40, "-"))
for i in fb:
    print(i)
Generator execution process: on the first for/next call, the generator body runs from its first line until it reaches yield, and the value after yield is returned to the caller. On the second for/next call, execution resumes right after that yield and continues until yield is reached again, returning the next value. This repeats until the function ends without reaching another yield, at which point there is no more data and StopIteration is raised.
You can traverse the values of a generator with for...in, or fetch them one at a time with next(generator object).
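The pause-and-resume behavior can be traced by recording what has run at each point. In this sketch the generator function steps and the list log are made-up names used only for the illustration.

```python
def steps(log):
    """Hypothetical generator that records how far its body has run."""
    log.append("start")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("resumed after second yield")


log = []
gen = steps(log)
log_after_create = list(log)  # creating the generator runs none of the body

first = next(gen)             # runs up to the first yield
log_after_first = list(log)

second = next(gen)            # resumes after the first yield, stops at the second
log_after_second = list(log)

try:
    next(gen)                 # runs the rest; no more yield, so StopIteration
    finished = False
except StopIteration:
    finished = True

print(first, second, finished)
print(log)
```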
Use next to get the values in the generator
def Fibonacci(n):
    a, b = 0, 1
    count_num = 0
    while count_num < n:
        # A function that contains a yield statement is no longer an ordinary function but a generator template
        yield a
        a, b = b, a+b
        count_num += 1
# Because the function contains yield, calling it does not run the body; it creates a generator object.
fb = Fibonacci(5)
print("Use next to get three numbers in turn".center(40, "-"))
print(next(fb))
print(next(fb))
print(next(fb))
print("Use a for loop to get the remaining numbers".center(40, "-"))
for i in fb:
    print(i)
Generator-send mode
A generator function can be called repeatedly to create multiple generators, and these generators do not interfere with one another.
If the generator function has a return value, it can be read from the value attribute of the StopIteration exception that is raised when the generator finishes.
def Fibonacci(n):
    a, b = 0, 1
    count_num = 0
    while count_num < n:
        # A function that contains a yield statement is no longer an ordinary function but a generator template
        yield a
        a, b = b, a+b
        count_num += 1
    return "okhaha"
# Because the function contains yield, calling it does not run the body; it creates a generator object.
fb = Fibonacci(5)
while 1:
    try:
        result = next(fb)
        print(result)
    except StopIteration as e:
        # The return value of the generator is carried by the exception
        print(e.value)
        break
Generators use send
Besides next, send can also be used to wake up a generator.
def Fibonacci(n):
    a, b = 0, 1
    count_num = 0
    while count_num < n:
        ret = yield a
        print("ret:", ret)
        a, b = b, a+b
        count_num += 1
fb = Fibonacci(5)
print(next(fb))
print(fb.send("haha"))
print(next(fb))
# 0
# ret: haha
# 1
# ret: None
# 1
One way to understand this: on the first next call, the expression on the right-hand side of the equals sign runs, and yield a hands the value of a back to next(fb). On the following send call, the left-hand side runs: the value passed to send is assigned to ret, and execution continues with the code that follows.
Put differently, ret = yield a is two steps: 1. yield a; 2. ret = arg, where arg is the value passed to send. If nothing is sent, it defaults to None, so calling next after send behaves like send(None).
Note that send is not normally used for the first wake-up of a generator. If you must use send for the first wake-up, the argument has to be None: send(None).
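The rule about the first wake-up can be demonstrated with a small accumulator. The generator function running_total is a made-up example; the sketch shows that sending a real value to a generator that has not started raises TypeError, while send(None) starts it just like next().

```python
def running_total():
    """Hypothetical generator: adds up every value sent to it."""
    total = 0
    while True:
        received = yield total   # pauses here; send() delivers the value
        if received is not None:
            total += received


# Sending a non-None value before the generator has started is an error
fresh = running_total()
try:
    fresh.send(10)
    first_send_failed = False
except TypeError:
    first_send_failed = True

# send(None) is allowed as the first wake-up and behaves like next()
acc = running_total()
primed = acc.send(None)      # runs to the first yield, which yields 0
after_ten = acc.send(10)     # received = 10, so the total becomes 10
after_five = acc.send(5)     # received = 5, so the total becomes 15
unchanged = next(acc)        # next() sends None, so the total stays 15

print(first_send_failed, primed, after_ten, after_five, unchanged)
```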
Generator - Summary
Generator features:
- A special iterator: you do not have to write the __iter__ and __next__ methods yourself.
- The function can run part of the way and return a value.
- It can pause a function, keep its local state, and later resume from exactly where it stopped.
- Like any iterator, it saves memory and can be used in a loop.
- A generator pauses code that looks like an ordinary function; the caller decides when to continue it with next/send.
Multitasking with yield
- In Python 2, while 1 reportedly takes about 2/3 of the time of while True, because True is not a keyword in Python 2 and can be reassigned, so the interpreter has to look it up on every pass. That is why while 1 is used here.
- In Python 3, True is a keyword and the interpreter no longer needs to look up its value, so while True and while 1 perform essentially the same.
Switching tasks between processes takes a lot of resources, and creating and destroying processes takes a lot of time. Processes are less efficient than threads, and coroutines need fewer resources than threads.
Multitasking with yield
import time
def task_1():
    while 1:
        print("---1---")
        time.sleep(0.5)
        yield  # pause here and let the other task run
def task_2():
    while 1:
        print("---2---")
        time.sleep(0.5)
        yield  # pause here and let the other task run
def main():
    t1 = task_1()
    t2 = task_2()
    while 1:
        next(t1)
        next(t2)
if __name__ == "__main__":
    main()
This is simulated multitasking: the tasks take turns inside one thread, so it is concurrency, not parallelism.
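The same idea can be extended to tasks that finish. Below is a minimal round-robin sketch, assuming each task is a generator that yields whenever it is willing to give up control. The names task, run_round_robin and log are made up for this illustration, and the tasks append to a list instead of printing and sleeping so that the interleaving is easy to inspect.

```python
from collections import deque


def task(name, steps, log):
    """Hypothetical task: does `steps` units of work, yielding after each one."""
    for i in range(steps):
        log.append(f"{name}-{i}")
        yield  # give the scheduler a chance to run another task


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # the task is done, drop it
        queue.append(current)      # not done yet, put it back in line


log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)
```

The work of A and B is interleaved one step at a time, and B simply continues alone once A has finished.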
Posted by Kitkat on Mon, 14 Oct 2019 23:48:58 -0700