About me
A thoughtful code monkey and lifelong learner, currently working as a team leader in a startup team. My technology stack covers Android, Python, Java and Go, which is also the main stack of our team.
GitHub: https://github.com/hylinux1024
WeChat official account: Angrycode
Iterable, Iterator and Generator are common concepts in Python, and they are easy to confuse at first. This article sorts them out.
0x00 Iterable
Simply put, an object (everything in Python is an object) is an Iterable as long as it implements the __iter__() method. This can be checked with the isinstance() function.
For example:

    from collections.abc import Iterable, Iterator, Generator


    class IterObj:

        def __iter__(self):
            # Here we simply return self.
            # That is not how it should really be done:
            # normally a built-in iterable object is used to implement it.
            # The examples further down show how.
            return self

This defines a class IterObj that implements the __iter__() method, so its instances are Iterable objects.

    it = IterObj()
    print(isinstance(it, Iterable))   # True
    print(isinstance(it, Iterator))   # False
    print(isinstance(it, Generator))  # False

Keep this class in mind; we will come back to its definition later.
Common Iterable Objects
What are the common iterable objects in Python?
- Collection or sequence types (such as list, tuple, set, dict, str)
- File objects
- Objects of a class that defines the __iter__() method can be considered Iterable. However, for a user-defined iterable to work correctly in a for loop, its __iter__() implementation must be correct, that is, the built-in iter() function must be able to turn the object into an Iterator. (Iterators are covered in detail below. For now, just remember that iter() converts an iterable object into an iterator object, which the for loop then consumes.)
- If a class implements only __getitem__(), its objects can be turned into iterators by iter(), but isinstance() does not recognize them as Iterable. So an object that works in a for loop is not necessarily an Iterable object.
Points 1 and 2 can be verified as follows:

    import os
    from collections.abc import Iterable

    print(isinstance([], Iterable))     # True: a list is iterable
    print(isinstance({}, Iterable))     # True: a dict is iterable
    print(isinstance((), Iterable))     # True: a tuple is iterable
    print(isinstance(set(), Iterable))  # True: a set is iterable
    print(isinstance('', Iterable))     # True: a string is iterable

    currPath = os.path.dirname(os.path.abspath(__file__))
    with open(currPath + '/model.py') as file:
        print(isinstance(file, Iterable))  # True
Let's look at point 3 again.
    print(hasattr([], "__iter__"))  # True
    print(hasattr({}, "__iter__"))  # True
    print(hasattr((), "__iter__"))  # True
    print(hasattr('', "__iter__"))  # True

These built-in collection and sequence objects all have an __iter__ attribute, that is, they all implement the method of that name. But for an iterable object to be usable in a for loop, the built-in iter() function must be able to call it and convert it into an Iterator object.
For example, let's look at the built-in iterable objects:

    print(iter([]))  # <list_iterator object at 0x110243f28>
    print(iter({}))  # <dict_keyiterator object at 0x110234408>
    print(iter(()))  # <tuple_iterator object at 0x110243f28>
    print(iter(''))  # <str_iterator object at 0x110243f28>
They are all converted into corresponding Iterator objects.
Now look back at the IterObj class defined at the beginning:

    class IterObj:

        def __iter__(self):
            return self


    it = IterObj()
    print(iter(it))

Calling the iter() function prints the following on the console:

    Traceback (most recent call last):
      File "/Users/mac/PycharmProjects/iterable_iterator_generator.py", line 71, in <module>
        print(iter(it))
    TypeError: iter() returned non-iterator of type 'IterObj'

A TypeError is raised: __iter__() returned the IterObj instance itself, which is not an iterator (it has no __next__() method), so iter() rejects it.
So how can I turn an Iterable object into an Iterator object?
Let's modify the definition of the IterObj class
    class IterObj:

        def __init__(self):
            self.a = [3, 5, 7, 11, 13, 17, 19]

        def __iter__(self):
            # Delegate to the iterator of the built-in list
            return iter(self.a)

We define a list named a in the constructor, and then implement the __iter__() method by returning that list's iterator.
Objects of the modified class can be passed to iter(), which means they can also be used in for loops.

    it = IterObj()
    print(isinstance(it, Iterable))   # True
    print(isinstance(it, Iterator))   # False
    print(isinstance(it, Generator))  # False
    print(iter(it))  # <list_iterator object at 0x102007278>
    for i in it:
        print(i)  # Prints 3, 5, 7, 11, 13, 17, 19

Therefore, when defining an Iterable object, pay close attention to the internal logic of the __iter__() method. Generally, it is implemented with the help of a known Iterable object (for example a collection, a sequence, a file, or another correctly defined Iterable object).
Point 4 means that iter() can convert an object that implements only the __getitem__() method into an iterator, so the object can be used in a for loop, but isinstance() does not report it as an Iterable.

    class IterObj:

        def __init__(self):
            self.a = [3, 5, 7, 11, 13, 17, 19]

        def __getitem__(self, i):
            return self.a[i]


    it = IterObj()
    print(isinstance(it, Iterable))   # False
    print(isinstance(it, Iterator))   # False
    print(isinstance(it, Generator))  # False
    print(hasattr(it, "__iter__"))    # False
    print(iter(it))  # <iterator object at 0x10b231278>

    for i in it:
        print(i)  # Prints 3, 5, 7, 11, 13, 17, 19

This example shows that an object that can be used in a for loop is not necessarily an Iterable object.
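Why does the __getitem__()-only class still work in a for loop? As a rough sketch (not the interpreter's actual code), iter() falls back to asking for index 0, 1, 2, ... until an IndexError is raised. The helper collect_by_index and the class Squares below are hypothetical names used only for illustration.

```python
def collect_by_index(obj):
    """Imitate the __getitem__ fallback: index from 0 until IndexError."""
    items = []
    index = 0
    while True:
        try:
            items.append(obj[index])
        except IndexError:
            break  # running off the end finishes the iteration
        index += 1
    return items


class Squares:
    # Hypothetical class: defines __getitem__ only, no __iter__
    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i


fallback_result = list(iter(Squares()))        # what Python does for us
sketch_result = collect_by_index(Squares())    # what the sketch does
print(fallback_result)  # [0, 1, 4, 9]
print(sketch_result)    # [0, 1, 4, 9]
```

Both paths stop at the first IndexError, which is why a class with a plain list lookup in __getitem__() terminates cleanly.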
Now let's make a summary:
- An Iterable object is an object that implements the __iter__() method.
- To be used in a for loop, it must work with iter(), that is, calling iter() on it raises no error and returns a proper Iterator object.
- We can use known iterable objects to help implement our own custom iterable objects.
- An object that implements only the __getitem__() method can be converted to an Iterator by iter(), so it can be used in a for loop, but it is not reported as an Iterable (this can be checked with isinstance()).
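To make the second point concrete, here is a minimal sketch of what a for loop does with its argument: call iter() once, then call next() repeatedly until StopIteration. The function name manual_for is made up for this illustration.

```python
def manual_for(iterable, action):
    """Hand-written equivalent of: for item in iterable: action(item)"""
    iterator = iter(iterable)      # step 1: obtain an iterator (TypeError if impossible)
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next value
        except StopIteration:
            break                  # step 3: the iterator is exhausted
        action(item)


seen = []
manual_for([3, 5, 7], seen.append)
print(seen)  # [3, 5, 7]
```

This is why an object whose __iter__() does not return a real iterator fails as soon as the loop starts.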
0x01 Iterator
Iterators have been mentioned many times above. Now it is time to explain them properly.
Once the concept of an iterable is clear, iterators are easier to understand.
An object that implements both the __iter__() and __next__() methods is an iterator object. For example:

    class IterObj:

        def __init__(self):
            self.a = [3, 5, 7, 11, 13, 17, 19]
            self.n = len(self.a)
            self.i = 0

        def __iter__(self):
            # An iterator returns itself from __iter__()
            return self

        def __next__(self):
            if self.i < self.n:
                v = self.a[self.i]
                self.i += 1
                return v
            # Exhausted: reset the index and signal the end of iteration
            self.i = 0
            raise StopIteration()

In IterObj, the constructor defines a list a, its length n, and an index i.
    it = IterObj()
    print(isinstance(it, Iterable))   # True
    print(isinstance(it, Iterator))   # True
    print(isinstance(it, Generator))  # False
    print(hasattr(it, "__iter__"))    # True
    print(hasattr(it, "__next__"))    # True

We can also confirm what was mentioned above.
Collection and sequence objects are iterable, but they are not iterators:

    print(isinstance([], Iterator))     # False
    print(isinstance({}, Iterator))     # False
    print(isinstance((), Iterator))     # False
    print(isinstance(set(), Iterator))  # False
    print(isinstance('', Iterator))     # False
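The practical difference can be seen with a short experiment, offered here as an illustrative sketch: a list hands out a fresh, independent iterator on every iter() call, while an iterator returns itself and can be walked only once.

```python
numbers = [3, 5, 7]

first = iter(numbers)
second = iter(numbers)

first_values = [next(first), next(first)]
second_value = next(second)   # the second iterator keeps its own position
print(first_values)   # [3, 5]
print(second_value)   # 3

# An iterator returns itself from iter(), so it cannot be restarted.
same_object = iter(first) is first
remaining = list(first)
after_exhaustion = list(first)
print(same_object)        # True
print(remaining)          # [7]
print(after_exhaustion)   # []
```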
A file object is an iterator:

    currPath = os.path.dirname(os.path.abspath(__file__))
    with open(currPath + '/model.py') as file:
        print(isinstance(file, Iterator))  # True

An Iterator object can be used not only in a for loop, but also with the built-in next() function. For example:

    it = IterObj()
    next(it)  # 3
    next(it)  # 5
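What happens when next() runs out of values? The sketch below uses a built-in list iterator rather than IterObj so that it runs on its own. An exhausted iterator raises StopIteration, and next() also accepts a default value to return instead.

```python
it = iter([3, 5])
print(next(it))  # 3
print(next(it))  # 5

# No values left: next() raises StopIteration
exhausted = False
try:
    next(it)
except StopIteration:
    exhausted = True
print(exhausted)  # True

# With a default, next() returns it instead of raising
print(next(it, None))  # None
```

A for loop catches this StopIteration internally, which is why it never surfaces in ordinary loops.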
0x02 Generator
Now let's see what a generator is.
A generator is both an iterable and an iterator.
There are two ways to define a generator:
- Generator expressions
- Generator functions defined with yield
Look at the first case first.

    g = (x * 2 for x in range(10))  # Generator of the even numbers 0 to 18

    print(isinstance(g, Iterable))   # True
    print(isinstance(g, Iterator))   # True
    print(isinstance(g, Generator))  # True
    print(hasattr(g, "__iter__"))    # True
    print(hasattr(g, "__next__"))    # True
    print(next(g))  # 0
    print(next(g))  # 2

A generator expression can describe a huge sequence without consuming a lot of memory, because each value is computed only when it is requested.
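The memory claim can be checked with sys.getsizeof(), as a rough sketch. The size N is an arbitrary value picked for illustration, and getsizeof() reports only the container object itself, not the integers it refers to, so the real gap is even larger.

```python
import sys

N = 100_000  # arbitrary size chosen for illustration

eager = [x * 2 for x in range(N)]  # builds all N values immediately
lazy = (x * 2 for x in range(N))   # builds nothing until asked

list_size = sys.getsizeof(eager)   # size of the list object, in bytes
gen_size = sys.getsizeof(lazy)     # size of the generator object, in bytes
print(gen_size < list_size)  # True

# Both produce the same values; the generator computes them one at a time.
sums_match = sum(lazy) == sum(eager)
print(sums_match)  # True
```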
Look at the second scenario.
    def gen():
        for i in range(10):
            yield i

Here yield plays a role similar to return: it hands a value back to the caller. This function yields the integers from 0 to 9 in order, and they can be retrieved with next() or with a for loop.
When execution reaches the yield keyword, the generator function returns a value and pauses. When next() is called again, it resumes from the point where it last paused. In other words, when a generator pauses at a yield, it saves its position, local variables and other execution state, and the next call continues from exactly that yield.
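The pause-and-resume behaviour can be traced with a small sketch. The function traced_gen and the log list are hypothetical names for this illustration. Note that creating the generator runs none of the function body.

```python
def traced_gen(log):
    log.append('start')
    yield 1
    log.append('resumed after first yield')
    yield 2
    log.append('resumed after second yield')


log = []
g = traced_gen(log)
print(log)      # []  (nothing has run yet)

print(next(g))  # 1
print(log)      # ['start']

print(next(g))  # 2
print(log)      # ['start', 'resumed after first yield']
```

Each next() call runs the body only as far as the following yield, and the local state survives in between.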
In Python, these features of generators can be used to implement coroutines. A coroutine can be thought of as a lightweight thread, and it has advantages over threads in high-concurrency scenarios.
Look at the following producer-consumer model implemented with a coroutine:

    def producer(c):
        n = 0
        while n < 5:
            n += 1
            print('producer {}'.format(n))
            r = c.send(n)  # Hand n to the consumer and wait for its reply
            print('consumer return {}'.format(r))


    def consumer():
        r = ''
        while True:
            n = yield r  # Pause here until the producer sends a value
            if not n:
                return
            print('consumer {} '.format(n))
            r = 'ok'


    if __name__ == '__main__':
        c = consumer()
        next(c)  # Start the consumer
        producer(c)
This code produces the following output:

    producer 1
    consumer 1
    consumer return ok
    producer 2
    consumer 2
    consumer return ok
    producer 3
    consumer 3
    consumer return ok
    producer 4
    consumer 4
    consumer return ok
    producer 5
    consumer 5
    consumer return ok
The coroutine achieves the effect of concurrency by switching execution back and forth between the two functions.
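The key to the producer-consumer example is send(): it resumes the paused generator and, at the same time, becomes the value of the yield expression inside it. Here is a minimal sketch of just that mechanism, using a hypothetical running_total generator.

```python
def running_total():
    total = 0
    while True:
        value = yield total  # yields the total out, receives what send() passes in
        total += value


acc = running_total()
first = next(acc)        # prime the generator: run up to the first yield
print(first)             # 0
after_ten = acc.send(10)
after_five = acc.send(5)
print(after_ten)         # 10
print(after_five)        # 15
acc.close()              # stop the generator explicitly
```

The initial next() call is required because a freshly created generator has not yet reached a yield that could receive a value.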