In this article, you will learn how to easily create iterations using Python generators, how they differ from iterators and regular functions, and why you should use them.
Building an iterator in Python involves a lot of overhead: we must implement a class with __iter__() and __next__() methods, keep track of internal state, raise StopIteration when there are no values left to return, and so on.
This is both lengthy and counterintuitive. Generators can come in handy in this case.
Python generators are a simple way to create iterators. All the overhead mentioned above is automatically handled by Python's generators.
In short, a generator is a function that returns an object (an iterator) that we can iterate over (one value at a time).
Creating a generator in Python is very simple. It is as easy as defining a normal function, but with a yield statement instead of a return statement.
If a function contains at least one yield statement (it may contain other yield or return statements), then it becomes a generator function. Both yield and return return some value from the function.
The difference is that while a return statement terminates a function entirely, a yield statement pauses the function, saving all of its state, and later continues from that point on successive calls.
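The contrast between return and yield can be seen in a minimal sketch. The function names first_with_return and each_with_yield are made up for this illustration.

```python
def first_with_return(items):
    # return ends the function during the first pass through the loop
    for item in items:
        return item


def each_with_yield(items):
    # yield hands back one item, pauses, and resumes here on the next request
    for item in items:
        yield item


print(first_with_return([1, 2, 3]))      # 1
print(list(each_with_yield([1, 2, 3])))  # [1, 2, 3]
```

The loop body is the same in both functions; only the keyword decides whether the caller gets the first value or all of them, one at a time.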
Here is how a generator function differs from a regular function.
Generator functions contain one or more yield statements.
When called, it returns an object (an iterator) but does not start execution immediately.
Methods like __iter__() and __next__() are automatically implemented. Therefore, we can use next() to traverse items.
Once the function yields, it is paused and control is transferred to the caller.
Local variables and their state are remembered between consecutive calls.
Finally, when the function terminates, StopIteration is automatically raised on further calls.
This is an example to illustrate all the above points. We have a generator function named my_gen() with several yield statements.
# A simple generator function
def my_gen():
    n = 1
    print('This is the first print')
    # The generator function contains a yield statement
    yield n
    n += 1
    print('This is the second print')
    yield n
    n += 1
    print('This is the last print')
    yield n
The interactive execution in the interpreter is as shown below. Run these commands in the Python Shell to view the output.
>>> # It returns an object but does not start executing immediately.
>>> a = my_gen()
>>> # We can use next() to traverse these items.
>>> next(a)
This is the first print
1
>>> # Once the function produces a result, the function pauses and control is transferred to the caller.
>>> # Local variables and their state are remembered between consecutive calls.
>>> next(a)
This is the second print
2
>>> next(a)
This is the last print
3
>>> # Finally, when the function terminates, StopIteration will be automatically raised on further calls.
>>> next(a)
Traceback (most recent call last):
...
StopIteration
>>> next(a)
Traceback (most recent call last):
...
StopIteration
One interesting thing to note in the above example is that the value of the variable n is remembered between calls.
Unlike in a normal function, the local variables are not destroyed when the function yields. In addition, a generator object can be iterated over only once.
To restart the process, we need to create another generator object using something like a = my_gen().
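The single-use behaviour can be checked with a short sketch. The my_gen() here is a trimmed-down version of the one above, with the print calls removed so that only the yielded values appear.

```python
def my_gen():
    yield 1
    yield 2
    yield 3


a = my_gen()
print(list(a))  # [1, 2, 3]
print(list(a))  # [] because the generator is already exhausted

a = my_gen()    # a fresh generator object starts over
print(next(a))  # 1
```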
Note: One final thing to note is that we can use generators directly with for loops.
This is because a for loop takes an iterator and iterates over it using the next() function. It ends automatically when StopIteration is raised.
# A simple generator function
def my_gen():
    n = 1
    print('This is the first print')
    # The generator function contains a yield statement
    yield n
    n += 1
    print('This is the second print')
    yield n
    n += 1
    print('This is the last print')
    yield n
# Using for loop
for item in my_gen():
    print(item)
When running the program, the output is:
This is the first print
1
This is the second print
2
This is the last print
3
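What the for loop does with a generator can be approximated by hand. The following is a rough sketch of the equivalent while loop, not the interpreter's actual implementation; my_gen() is trimmed to its yield statements, and the collected list is added only to make the result easy to inspect.

```python
def my_gen():
    yield 1
    yield 2
    yield 3


# Roughly what `for item in my_gen(): print(item)` does behind the scenes
iterator = iter(my_gen())      # obtain the iterator
collected = []
while True:
    try:
        item = next(iterator)  # ask for the next value
    except StopIteration:
        break                  # the for loop ends silently at this point
    collected.append(item)
    print(item)
```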
The above example is of little practical use; we studied it only to get an idea of what happens in the background.
Normally, generator functions are implemented with a loop that has a suitable terminating condition.
Let's take an example of a generator that reverses a string.
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]
# For loop to reverse the string
# Output:
# o
# l
# l
# e
# h
for char in rev_str("hello"):
    print(char)
In this example, we use the range() function and the for loop to get the index in reverse order.
It turns out that this generator function works not only with strings but also with other sequence types, such as list and tuple.
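A quick sketch confirms this. Since rev_str() relies on len() and integer indexing, it should work with any object that supports both, whereas a set, for instance, would not.

```python
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]


print(list(rev_str([1, 2, 3])))          # [3, 2, 1]
print(tuple(rev_str(("a", "b", "c"))))   # ('c', 'b', 'a')
```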
Simple generators can be created on the fly using generator expressions.
Just as lambda functions create anonymous functions, generator expressions create anonymous generator functions.
The syntax of a generator expression is similar to that of a list comprehension in Python, except that the square brackets are replaced with parentheses.
The main difference between list comprehensions and generator expressions is that, while list comprehensions generate the entire list, generator expressions generate one item at a time.
They are a bit lazy and only generate items when needed. For this reason, generator expressions are much more memory efficient than equivalent list comprehensions.
# Initialize list
my_list = [1, 3, 6, 10]
# Use list comprehension to square each item
# Output: [1, 9, 36, 100]
[x**2 for x in my_list]
# The same thing can be done using a generator expression
# Output: <generator object <genexpr> at 0x0000000002EBDAF8>
(x**2 for x in my_list)
As we can see above, the generator expression does not immediately produce the required result. Instead, it returns a generator object that produces items on demand.
# Initialize list
my_list = [1, 3, 6, 10]
a = (x**2 for x in my_list)
# Output: 1
print(next(a))
# Output: 9
print(next(a))
# Output: 36
print(next(a))
# Output: 100
print(next(a))
# Output: StopIteration
next(a)
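The memory difference between the two forms can be checked roughly with sys.getsizeof(). This is only a sketch: getsizeof() reports the size of the container object itself, not of the integers it refers to, and the exact numbers vary between Python versions, so no expected output is shown.

```python
import sys

numbers = range(100_000)

squares_list = [x**2 for x in numbers]   # builds all 100,000 items now
squares_gen = (x**2 for x in numbers)    # builds nothing yet

# The list grows with the number of items; the generator object stays small
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))
```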
Generator expressions can be used as function arguments. When a generator expression is the only argument, its surrounding parentheses can be omitted.
>>> sum(x**2 for x in my_list)
146
>>> max(x**2 for x in my_list)
100
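The parentheses may be dropped only when the generator expression is the sole argument. As a small sketch, passing a second argument to sum() (here a start value of 1000, chosen arbitrarily) means the expression needs its own parentheses again.

```python
my_list = [1, 3, 6, 10]

# Sole argument: the call's own parentheses are enough
total = sum(x**2 for x in my_list)

# With a second argument, the generator expression must be parenthesized
total_from_1000 = sum((x**2 for x in my_list), 1000)

print(total)            # 146
print(total_from_1000)  # 1146
```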
There are several reasons why generators are an attractive implementation.
Generators can be implemented in a clear and concise way compared to their iterator class counterparts. Here is an example that implements a sequence of powers of 2 using an iterator class.
class PowTwo:
    def __init__(self, max=0):
        self.max = max
    def __iter__(self):
        self.n = 0
        return self
    def __next__(self):
        if self.n > self.max:
            raise StopIteration
        result = 2 ** self.n
        self.n += 1
        return result
This code is very long. Now, perform the same operation using the generator function.
def PowTwoGen(max=0):
    n = 0
    # Use <= so the upper bound is inclusive, matching the PowTwo class
    while n <= max:
        yield 2 ** n
        n += 1
Since generators keep track of these details automatically, the implementation is concise and much cleaner.
A normal function that returns a sequence builds the entire sequence in memory before returning the result. This is wasteful if the sequence contains a very large number of items.
A generator implementation of such a sequence is memory-friendly and is therefore preferred, since it produces only one item at a time.
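The two approaches can be put side by side in a minimal sketch. The function names squares_list and squares_gen are hypothetical and serve only this comparison.

```python
def squares_list(n):
    # Builds the whole list in memory before returning it
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_gen(n):
    # Produces one square at a time, only when asked
    for i in range(n):
        yield i * i


print(squares_list(5))      # [0, 1, 4, 9, 16]
print(sum(squares_gen(5)))  # 30
```

For a consumer such as sum(), which needs each value only once, the generator version never has to hold more than one item at a time.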
Generators are an excellent medium for representing infinite data streams. Infinite streams cannot be stored in memory, and since generators produce one item at a time, they can represent infinite data streams.
The following example can generate all even numbers (at least in theory).
def all_even():
    n = 0
    while True:
        yield n
        n += 2
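An infinite generator must be consumed with care, because calling list() on it would never finish. One possible way to take a bounded slice is itertools.islice() from the standard library, sketched here:

```python
from itertools import islice


def all_even():
    n = 0
    while True:
        yield n
        n += 2


# Take only the first five values from the infinite stream
first_five = list(islice(all_even(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```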
Generators can be used for pipelining a series of operations. It's best to illustrate with an example.
Suppose we have a log file from a famous fast-food chain. The log file has a column (the 4th column) that keeps track of the number of pizzas sold every hour, and we want to sum it to find the total number of pizzas sold in 5 years.
Assume that everything is a string and that numbers which are not available are marked as 'N/A'. A generator implementation of this could be as follows.
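with open('sells.log') as file:
    # Assumes whitespace-separated columns; take the 4th column of each line
    pizza_col = (line.split()[3] for line in file)
    # Skip hours where the count is not available
    per_hour = (int(x) for x in pizza_col if x != 'N/A')
    print("Total pizzas sold = ", sum(per_hour))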
This pipeline is efficient and easy to read (yes, it's very cool!).
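To try the pipeline without a real log file, here is a self-contained sketch that runs on in-memory sample data. The three log lines and their layout (whitespace-separated, with the pizza count in the 4th column) are invented for illustration; a real log may be formatted differently.

```python
import io

# Hypothetical sample data standing in for the log file
sample_log = io.StringIO(
    "2020-01-01 10:00 store1 12\n"
    "2020-01-01 11:00 store1 N/A\n"
    "2020-01-01 12:00 store1 30\n"
)

pizza_col = (line.split()[3] for line in sample_log)   # 4th column
per_hour = (int(x) for x in pizza_col if x != 'N/A')   # skip missing values
total = sum(per_hour)
print("Total pizzas sold =", total)  # Total pizzas sold = 42
```

No stage does any work until sum() starts pulling values, so each line is read, split, filtered, and added before the next one is touched.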