Detailed explanation of Python iterators, iterable objects and generators

iteration

         Definition: iteration is a way of traversing the elements of a collection one at a time. An iterator is the object that performs the traversal, from the first element to the last

iterator

definition:

         An iterator is an object that remembers its position in a traversal. Its class must implement the iterator protocol, that is, the __iter__() and __next__() methods

         (Remembering the position means that each call saves the current position, and the next call continues from where the previous one stopped)

characteristic:

         An iterator starts at the first element of the collection and continues until every element has been visited

         Iterators can only move forward, not backward
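A small sketch of these two characteristics, using the iterator of a built-in list. The variable names are illustrative only:

```python
numbers = [10, 20, 30]
it = iter(numbers)           # iterator positioned before the first element

first = next(it)             # 10
rest = list(it)              # consumes the remaining elements: [20, 30]

# The iterator is now exhausted; it cannot move backward or restart.
leftover = list(it)          # []

# To traverse the collection again, ask it for a new iterator.
again = list(iter(numbers))  # [10, 20, 30]

print(first, rest, leftover, again)
```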

Usage:

Step 1: create an iterator class

Implement the two methods __iter__() and __next__() in the class; it can then be used as an iterator

  • __iter__() returns an iterator object. An iterator is its own iterator object, so inside an iterator class this method returns self. This is pointed out explicitly because iterable objects also define __iter__(), as discussed later
  • __next__() returns the next value of the iteration. If there is no next value, it raises a StopIteration exception

Step 2: create an iterator object by instantiating the class (this is different from an iterable object):

You can fetch the next value in two ways:

1. variable = obj.__next__()

2. variable = next(obj)
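As a minimal sketch, both forms can be tried on the iterator of a built-in string before writing a custom class. The names below are illustrative; the optional second argument of next() is a default that is returned instead of raising StopIteration:

```python
letters = iter("ab")

x = letters.__next__()    # way 1: call the method directly -> 'a'
y = next(letters)         # way 2: use the built-in function -> 'b'

# No values are left, so a further call raises StopIteration.
try:
    next(letters)
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default to next() avoids the exception.
z = next(letters, None)

print(x, y, exhausted, z)
```

The built-in next() is usually preferred in application code; calling __next__() directly is equivalent.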

Let's look at an example of calculating the Fibonacci sequence

class FeiboIterator(object):
    # The Fibonacci sequence starts with 0 and 1, and the subsequent Fibonacci numbers are obtained by adding the previous two numbers
    def __init__(self, n):
        # Number of Fibonacci sequence values
        self.n = n
        # Record the subscript of the current traversal
        self.index = 0
        # The first two values of Fibonacci sequence
        self.num1 = 0
        self.num2 = 1

    def __iter__(self):
        # Returns an iterator object
        return self

    def __next__(self):
        # Returns the data for the next iteration
        if self.index < self.n:
            num = self.num1
            self.num1, self.num2 = self.num2, self.num1 + self.num2
            self.index += 1
            return num
        else:
            raise StopIteration


feib = FeiboIterator(10)  # Create an iterator object
fb = iter(feib)  # Calling the iter() function returns the iterator object


print(fb)  # Show the object returned by iter()
print(feib)  # Show the original iterator object
# fb and feib are the same object, so there is no need to call iter() separately after creating the iterator
print(fb is feib)  # True
print(fb.__next__())    # The next value can be returned in both ways
print(fb.__next__())
print(fb.__next__())
print(fb.__next__())
print(next(fb))
print(next(fb))
print(next(fb))
print(next(fb))
print(next(fb))

  Stepping through values like this naturally brings to mind a for loop traversing an array

         A for loop first calls the __iter__() method of the object being looped over to obtain an iterator, then repeatedly calls the iterator's __next__() method to fetch each value until StopIteration is raised. This pattern is known as Python's iterator protocol
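The steps a for loop performs can be written out by hand. The helper name manual_for below is hypothetical and exists only for this sketch; it is roughly what "for item in iterable: body(item)" does:

```python
def manual_for(iterable, body):
    """Hand-written equivalent of: for item in iterable: body(item)"""
    iterator = iter(iterable)        # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)    # step 2: fetch the next value
        except StopIteration:
            break                    # step 3: stop when the iterator is exhausted
        body(item)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)
```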

The following example computes the Fibonacci sequence again, this time leaving the calls to __next__() to the for loop, which confirms the description above

class FeiboIterator(object):
    # The Fibonacci sequence starts with 0 and 1, and the subsequent Fibonacci numbers are obtained by adding the previous two numbers
    def __init__(self, n):
        # Number of Fibonacci sequence values
        self.n = n
        # Record the subscript of the current traversal
        self.index = 0
        # The first two values of Fibonacci sequence
        self.num1 = 0
        self.num2 = 1

    def __iter__(self):
        # Returns an iterator object
        return self

    def __next__(self):
        # Returns the data for the next iteration
        if self.index < self.n:
            num = self.num1
            self.num1, self.num2 = self.num2, self.num1 + self.num2
            self.index += 1
            return num
        else:
            raise StopIteration


fb = FeiboIterator(10)

for num in fb:  # The for loop calls iter(fb) to get the iterator, then calls next() on it until StopIteration
    print(num)

So can we say that every object a for loop can traverse is an iterator? No, it may instead be an iterable object

Iterable object

definition:

         If a class defines an __iter__() method that returns an iterator object, or defines a __getitem__() method that accepts integer indexes starting from 0, then objects created from that class are called iterable objects

(Only __iter__() is covered here. To learn more about __getitem__(), see the article "Another implementation of iterable objects")
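For orientation only, here is a minimal sketch of the __getitem__() route. The class Squares is hypothetical. Python keeps calling __getitem__() with 0, 1, 2, ... until IndexError is raised:

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical class that is iterable only through __getitem__()."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if 0 <= index < self.n:
            return index * index
        raise IndexError(index)   # signals the end of the traversal


sq = Squares(4)
values = [v for v in sq]          # the for machinery falls back to __getitem__()
print(values)

# The Iterable check only looks for __iter__(), so it does not detect this kind of object.
is_iterable_abc = isinstance(sq, Iterable)
print(is_iterable_abc)
```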

characteristic:

         Iterable objects can be traversed with a for loop. The loop first calls __iter__() to obtain the object's iterator, then calls that iterator's __next__() internally

Usage:

         An iterable object must define __iter__() and have it return an iterator object

The class below is meant to be an iterable object, but it cannot be used yet because its __iter__() does not return an iterator

class fb(object):
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        pass
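To see what "can't be used normally" means in practice, the sketch below loops over such a class. The class name Broken is hypothetical. Because __iter__() returns None, the loop fails with a TypeError before the first iteration:

```python
class Broken:
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        pass  # implicitly returns None, which is not an iterator


try:
    for _ in Broken(3):
        pass
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```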

An iterable object's __iter__() must return an iterator

By pairing the iterable class with an iterator class, the iterable object works as intended

class fb(object):
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return FeiboIterator(self.n)


class FeiboIterator(object):
    # The Fibonacci sequence starts with 0 and 1, and the subsequent Fibonacci numbers are obtained by adding the previous two numbers
    def __init__(self, n):
        # Number of Fibonacci sequence values
        self.n = n
        # Record the subscript of the current traversal
        self.index = 0
        # The first two values of Fibonacci sequence
        self.num1 = 0
        self.num2 = 1

    def __iter__(self):
        # Returns an iterator object
        return self

    def __next__(self):
        # Returns the data for the next iteration
        if self.index < self.n:
            num = self.num1
            self.num1, self.num2 = self.num2, self.num1 + self.num2
            self.index += 1 
            return num
        else:
            raise StopIteration


fib = fb(10)  # Use a new name so that the fb class is not overwritten

for num in fib:
    print(num)
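One practical consequence of this split, sketched with hypothetical classes CountUp and CountUpIterator: the iterable hands out a new iterator on every call to __iter__(), so it can be traversed repeatedly, while each individual iterator can be consumed only once:

```python
class CountUp:
    """Hypothetical iterable that produces 0, 1, ..., n - 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return CountUpIterator(self.n)   # a fresh iterator for every traversal


class CountUpIterator:
    def __init__(self, n):
        self.n = n
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= self.n:
            raise StopIteration
        value = self.index
        self.index += 1
        return value


numbers = CountUp(3)
first_pass = list(numbers)     # [0, 1, 2]
second_pass = list(numbers)    # [0, 1, 2] again, from a new iterator

it = iter(numbers)
once = list(it)                # [0, 1, 2]
twice = list(it)               # [] because this iterator is exhausted

print(first_pass, second_pass, once, twice)
```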

Determine whether an object is an iterable object or an iterator:

Principle:

         If an object has an __iter__() method that returns an iterator object, it is an iterable object; if it has both __iter__() and __next__(), it is an iterator

method:

Method 1: call dir(obj) and check whether __iter__() and __next__() are listed

Method 2: use the Iterable and Iterator classes from collections.abc with isinstance()

Note: Iterable alone cannot tell the two apart, because isinstance(obj, Iterable) returns True for both iterators and iterable objects. Combine it with Iterator: if isinstance(obj, Iterator) is True the object is an iterator; if only the Iterable check is True it is an iterable object that is not an iterator

The following code applies both methods to a list and to the list's iterator. The same approach works for tuples, dictionaries and sets

from collections.abc import Iterator, Iterable

a1 = [1, 2, 3, 4]
a2 = a1.__iter__()

# List the attributes and methods of each object
print(dir(a1))  # contains __iter__ but not __next__
print(dir(a2))  # contains both __iter__ and __next__

# list
# Check whether a1 is an iterator and whether it is an iterable object
print("list")
print(isinstance(a1, Iterator))  # False: a list has __iter__() but no __next__()
print(isinstance(a1, Iterable))  # True: a list has __iter__()

# Check whether a2 is an iterator and whether it is an iterable object
print(isinstance(a2, Iterator))  # True
print(isinstance(a2, Iterable))  # True

# Iterable alone cannot separate the two cases, because an iterator also passes that check
# Combine it with Iterator: when the Iterator check is True, the object is an iterator rather than a plain iterable object

The result shows that a list is an iterable object. Python's other common data types (tuples, dictionaries, sets) and NumPy arrays are iterable objects as well
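The combined check can be wrapped in a small helper. The function name classify and the sample values are assumptions made for this sketch; only the collections.abc classes come from the standard library:

```python
from collections.abc import Iterable, Iterator


def classify(obj):
    """Return 'iterator', 'iterable' or 'neither' for obj."""
    if isinstance(obj, Iterator):    # has __iter__() and __next__()
        return "iterator"
    if isinstance(obj, Iterable):    # has __iter__() only
        return "iterable"
    return "neither"


samples = {
    "list": [1, 2],
    "tuple": (1, 2),
    "dict": {"a": 1},
    "set": {1, 2},
    "list iterator": iter([1, 2]),
    "int": 42,
}
results = {name: classify(value) for name, value in samples.items()}

for name, kind in results.items():
    print(name, "->", kind)
```

The Iterator check has to come first, because every iterator also passes the Iterable check.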

generator

definition:

        In Python, a generator is a mechanism that computes values one at a time while iterating, instead of computing them all in advance

        Python has two main ways to create generators: generator expressions and generator functions

         A generator object implements both __iter__() and __next__(), so a generator is a special kind of iterator and has all the characteristics of an iterator
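That a generator is an iterator can be checked directly. The sketch below builds a generator from a generator expression and tests it against the abstract classes in collections.abc; the variable names are illustrative:

```python
from collections.abc import Generator, Iterator

gen = (n * 2 for n in range(3))

is_generator = isinstance(gen, Generator)   # it is a generator ...
is_iterator = isinstance(gen, Iterator)     # ... and therefore also an iterator
returns_self = iter(gen) is gen             # __iter__() returns the generator itself

values = [next(gen), next(gen), next(gen)]  # __next__() produces each value in turn

print(is_generator, is_iterator, returns_self, values)
```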

Generator expression:  

        It looks like a list comprehension, but instead of building the whole result list at once it returns an object that produces the results one at a time

Turning a list comprehension into a generator expression is simple: just replace "[]" with "()"

# Calculate the squares of the numbers 0 to 9
# list
x1 = [x**2 for x in range(10)]
print(x1)  # The entire list is printed

# generator
x2 = (x**2 for x in range(10))
print(x2)  # A generator object is printed
for num in x2:  # The for loop drives the generator through the iterator protocol
    print(num)

Generator function:

         Any function that contains the yield keyword is a generator function. Calling a generator function returns a generator, which is an iterator object

characteristic:

1. Unlike an ordinary function, a generator function returns an iterator, which can only be used for iteration

2. yield is similar to return: when the iteration reaches a yield, the value after yield is returned to the caller

3. When an ordinary function is called by name, it executes its statements, returns a value and exits

    When a generator is advanced with next(), it runs until it reaches a yield, returns that value and is then suspended. In other words, yield pauses the function and keeps its current state (such as the values of its local variables), so that the next call to next() resumes from where execution last stopped
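The pause-and-resume behaviour can be made visible by recording events inside and outside a generator function. The function name steps and the events list are hypothetical and serve only this sketch:

```python
events = []


def steps():
    events.append("start")                      # runs on the first next()
    yield 1
    events.append("resumed after first yield")  # runs on the second next()
    yield 2
    events.append("finishing")                  # runs when a third value is requested


g = steps()               # creates the generator; no code in steps() has run yet
events.append("created")

a = next(g)
events.append("got 1")

b = next(g)
events.append("got 2")

finished = next(g, "no more values")   # the body runs to its end, so the default is returned

for line in events:
    print(line)
```

The recorded order shows that the function body only advances when next() is called, and each time only as far as the following yield.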

Here are two implementations of the Fibonacci sequence. The first is an ordinary function and the second is a generator function

def feiboIterator(n):
    index, num1, num2 = 0, 0, 1
    while index < n:
        num = num1
        num1, num2 = num2, num1 + num2
        index = index + 1
        print(num)
    return 'done'    # return returns the value and closes the function

print(feiboIterator(10))  # Runs the whole function to completion, then prints 'done'

# Generator version
def feiboIterator(n):
    index, num1, num2 = 0, 0, 1
    while index < n:
        num = num1
        yield num   # Execution pauses here and num is handed back to the caller
        num1, num2 = num2, num1 + num2
        index = index + 1
    return 'done'


a = feiboIterator(10)
print(a)

while True:
    try:
        print(next(a))
    except StopIteration as e:
        print("Generator return value:", e.value)
        break

# An ordinary function runs straight through to return; a generator pauses at each yield and resumes on the next call to next()

A generator effectively stores the algorithm rather than the data. Each call to next() computes the next element, until the last element has been produced
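Because only the algorithm is stored, a generator does not even need a fixed length. The sketch below is a variation on the example above, not the author's code: the function fibonacci has no upper bound, and itertools.islice from the standard library takes just the first ten values:

```python
from itertools import islice


def fibonacci():
    """Endless Fibonacci generator; values are computed only on demand."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```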

Why do we need generators

         The main reason is to reduce memory use. A generator does not perform all the calculations at once; it returns each result as it is computed. This suits large data sets, because only the current value needs to be held in memory rather than the whole sequence
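A rough way to see the difference is to compare the size of a list with the size of an equivalent generator. This is only a sketch: sys.getsizeof reports the size of the container object itself, not of the integers it refers to, and the exact numbers depend on the Python version and platform:

```python
import sys

n = 1_000_000

squares_list = [i * i for i in range(n)]   # all values exist in memory at once
squares_gen = (i * i for i in range(n))    # values are produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print("list object:     ", list_size, "bytes")
print("generator object:", gen_size, "bytes")

# The generator still yields every value when it is consumed.
total = sum(squares_gen)
print(total)
```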


Posted by standalone on Sun, 19 Sep 2021 06:26:30 -0700