SoFunction
Updated on 2024-10-30

Python functional programming: the itertools module in detail

Containers and Iterable Objects

Before we get started, let's cover some basic concepts. In Python, there are containers and iterable objects.

  • Container: A data structure used to store multiple elements, such as lists, tuples, dictionaries, sets, and so on;
  • Iterable objects: An object that implements the __iter__ method is called an iterable object.

Iterators and generators are also derived from iterable objects:

  • Iterators: Objects that implement both the __iter__ and __next__ methods are called iterators;
  • Generators: A function that contains the yield keyword is a generator function; calling it returns a generator, which is itself an iterator.

This makes it clear that iterable objects cover a wider scope than containers. Note that an iterator can only be consumed once: if the values are used up and next is called again, a StopIteration exception is raised.

Beyond that, iterators have some limitations and conventions:

  • The len function cannot be used on an iterator;
  • An iterator is advanced with the next function, and a container can be converted to an iterator with the iter function;
  • The for statement automatically calls iter on the container, so a container can also be traversed directly in a loop.
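
The points above can be illustrated with a minimal sketch. The variable names (numbers, it, rest, and so on) are made up for this example; only the built-in iter, next, and list functions are used.

```python
numbers = [1, 2, 3]      # a container: it can be traversed many times
it = iter(numbers)       # an iterator over the container

first = next(it)         # take one value from the iterator
rest = list(it)          # consume the remaining values

# The iterator is now exhausted, so another next() raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

again = list(numbers)    # the container itself is still fully usable

print(first, rest, exhausted, again)  # 1 [2, 3] True [1, 2, 3]
```

The container can be traversed again because every call to iter creates a fresh iterator, while the iterator itself cannot be rewound.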

count() function

The count function is usually learned by comparing it with the range function. range needs an upper limit (with an optional lower limit and step), whereas count takes only a start value and a step and has no upper limit.

The function prototype is declared as follows:

count(start=0, step=1) --> count object

The test code is as follows. A condition that breaks out of the loop must be added; otherwise the code will keep running forever.

from itertools import count
a = count(5, 10)
for i in a:
    print(i)
    if i > 100:
        break

In addition, the count function also accepts non-integer arguments, so the following code is valid as well. Because the step is added repeatedly, the printed floating-point values may show small rounding errors.

from itertools import count
a = count(0.5, 0.1)
for i in a:
    print(i)
    if i > 100:
        break
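
Instead of breaking out of the loop by hand, an infinite counter can also be bounded from the outside. The sketch below is one possible approach, not the only one; it uses islice (listed later in this article) to take a fixed number of values, and the name first_five is chosen just for illustration.

```python
from itertools import count, islice

# Take only the first five values of the infinite counter.
first_five = list(islice(count(5, 10), 5))
print(first_five)  # [5, 15, 25, 35, 45]
```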

cycle function

The cycle function can be used to cycle through a set of values, as shown in the test code below:

from itertools import cycle
x = cycle('Dream Eraser abcdf')
for i in range(5):
    print(next(x), end=" ")
print("\n")
print("*" * 100)
for i in range(5):
    print(next(x), end=" ")

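The code outputs the following (the second group begins with the space character that follows "Dream" in the string):

D r e a m

****************************************************************************************************
  E r a s

You can see that cycle remembers its position between calls to next, and once the input is used up it starts again from the first element.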

repeat function

The repeat function is used to repeat a value. The official description of the function is shown below:

class repeat(object):
    """
    repeat(object [,times]) -> create an iterator which returns the object
    for the specified number of times.  If not specified, returns the object
    endlessly.
    """

Perform a simple test to see the results:

from itertools import repeat
x = repeat('Eraser')
for i in range(5):
    print(next(x), end=" ")
print("\n")
print("*" * 100)
for i in range(5):
    print(next(x), end=" ")

On its own this function may not look very useful, but it is commonly used to supply a constant argument to map or zip.
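
A typical place where repeat earns its keep is as a constant argument stream. The following is a small sketch under that assumption; the names squares and three are illustrative only.

```python
from itertools import repeat

# repeat(2) supplies the exponent for every call to pow;
# map stops when the shortest input (range(5)) is exhausted.
squares = list(map(pow, range(5), repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]

# With the optional times argument the iterator is finite.
three = list(repeat('x', 3))
print(three)  # ['x', 'x', 'x']
```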

enumerate function: adding an index

This function was briefly introduced in the previous article. It is a built-in function (it lives in the builtins module, not in itertools), so it is not explained in detail here. The basic format is shown below:

enumerate(iterable, start=0)

The start parameter is the value from which the index starts counting.
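
As a quick illustration of the start parameter, here is a minimal sketch; the fruits list is sample data invented for this example.

```python
fruits = ['apple', 'banana', 'cherry']

# Number the items from 1 instead of the default 0.
numbered = list(enumerate(fruits, start=1))
print(numbered)  # [(1, 'apple'), (2, 'banana'), (3, 'cherry')]
```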

accumulate function

The accumulate function returns an iterator of accumulated results computed with the given function. By default the results are cumulative sums, i.e., it behaves as if the second parameter were operator.add. The test code is as follows:

from itertools import accumulate
data = [1, 2, 3, 4, 5]
# Calculate the cumulative sum
print(list(accumulate(data)))  # [1, 3, 6, 10, 15]

The above code can be modified to compute the cumulative product:

from itertools import accumulate
import operator
data = [1, 2, 3, 4, 5]
# Calculate the cumulative product
print(list(accumulate(data, operator.mul)))  # [1, 2, 6, 24, 120]

In addition, the second argument can be a function such as max, min, etc., as in the following code:

from itertools import accumulate
data = [1, 4, 3, 2, 5]
print(list(accumulate(data, max)))

The code outputs the following. Each value is compared with the largest value seen so far, so the result is the running maximum.

[1, 4, 4, 4, 5]

chain and groupby functions

The chain function is used to combine multiple iterators into a single iterator, while groupby can divide an iterator into multiple subiterators.

Let's start by showing the application of the groupby function:

from itertools import groupby
a = list(groupby('aabbcc'))
print(a)

The output is shown below:

[('a', <itertools._grouper object at 0x0000000001DD9438>),
('b', <itertools._grouper object at 0x0000000001DD9278>),
('c', <itertools._grouper object at 0x00000000021FF710>)]

Note that groupby only groups consecutive identical elements: it starts a new subiterator each time the value changes, a bit like slicing. To collect all equal elements into one group, sort the original data first.
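
The effect of sorting can be seen in the following sketch. The words list and the first_letter helper are hypothetical names introduced only for this example; groupby and sorted are used with their standard key parameter.

```python
from itertools import groupby

words = ['apple', 'bob', 'avocado', 'cherry', 'banana']

def first_letter(word):
    # Hypothetical key function: group words by their first character.
    return word[0]

# Without sorting, the same key shows up in several separate groups.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]
print(unsorted_groups)

# After sorting by the same key, each key appears exactly once.
sorted_groups = [
    (k, list(g))
    for k, g in groupby(sorted(words, key=first_letter), key=first_letter)
]
print(sorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['bob', 'banana']), ('c', ['cherry'])]
```

Each group is turned into a list inside the comprehension because a subiterator is only valid until groupby advances to the next group.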

The chain function is used as follows to stitch together multiple iteration objects:

from itertools import chain
a = list(chain('ABC', 'AAA', range(1, 3)))
print(a)  # ['A', 'B', 'C', 'A', 'A', 'A', 1, 2]
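
When the iterables to be joined are themselves stored in a single iterable, the chain.from_iterable class method can be used instead of unpacking them. A minimal sketch, with the nested list invented as sample data:

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5]]

# Flatten one level of nesting.
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]
```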

zip_longest vs. zip

The zip function was explained in the previous blog post. The difference between zip_longest and zip is that zip stops at the shortest sequence, while zip_longest continues to the end of the longest one.

The test code is below; just compare the results yourself.

from itertools import zip_longest
a = list(zip('ABC', range(5), [10, 20, 30, 40]))
print(a)
a = list(zip_longest('ABC', range(5), [10, 20, 30, 40]))
print(a)

When zip_longest encounters sequences of different lengths, the missing values are filled with None.
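
The filler does not have to be None: zip_longest accepts a fillvalue keyword argument. The sketch below uses '-' as an arbitrary placeholder chosen for illustration.

```python
from itertools import zip_longest

# Missing positions are filled with '-' instead of None.
pairs = list(zip_longest('ABC', range(5), fillvalue='-'))
print(pairs)  # [('A', 0), ('B', 1), ('C', 2), ('-', 3), ('-', 4)]
```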

tee function

The tee function clones an iterable object into multiple independent iterators (two by default), each of which yields every element of the input.

from itertools import tee
a = list(tee('Eraser'))
print(a)
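
Printing the result of tee only shows the iterator objects themselves. A small sketch of how the clones are usually consumed follows; the names source, a, and b are illustrative.

```python
from itertools import tee

source = iter([1, 2, 3])

# Split one iterator into two independent ones.
a, b = tee(source)

list_a = list(a)
list_b = list(b)
print(list_a, list_b)  # [1, 2, 3] [1, 2, 3]
```

After calling tee, the original iterator (source here) should no longer be advanced directly, since doing so would move it forward without the clones seeing those values.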

compress function

This function decides whether to keep or discard each element according to a matching sequence of selectors (truthy/falsy values). The simplest code is shown below:

from itertools import compress
a = list(compress('Eraser', (0, 1, 1)))
print(a)
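
To make the pairing of elements and selectors easier to see, here is a sketch with invented sample data (letters and selectors are illustrative names).

```python
from itertools import compress

letters = 'ABCDEF'
selectors = [1, 0, 1, 0, 1, 1]

# Keep only the elements whose selector is truthy.
kept = list(compress(letters, selectors))
print(kept)  # ['A', 'C', 'E', 'F']

# compress stops when the shorter of the two inputs runs out.
short = list(compress(letters, [0, 1, 1]))
print(short)  # ['B', 'C']
```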

islice, dropwhile, takewhile, filterfalse, filter

Each of these functions takes a subset from the input iterable and does not modify the elements themselves.

This section only lists the prototype of each function; for concrete usage, just follow the prototypes and try them out directly.

islice(iterable, stop) --> islice object
islice(iterable, start, stop[, step]) --> islice object
dropwhile(predicate, iterable) --> dropwhile object
takewhile(predicate, iterable) --> takewhile object
filterfalse(function or None, sequence) --> filterfalse object

Apart from islice, each of these functions takes the function as its first argument and the sequence as its second.

The test code is as follows. Note in particular that the first argument is a callable, i.e. a function.

from itertools import islice, dropwhile, takewhile, filterfalse
# Drop the characters for which the function returns True
a = list(filterfalse(lambda x: x in ["a", "s"], 'Eraser'))
print(a)  # ['E', 'r', 'e', 'r']
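
The remaining prototypes can be tried out in the same way. The following is a minimal sketch with made-up sample data and a hypothetical is_odd helper; it is meant as an illustration rather than an exhaustive test.

```python
from itertools import islice, dropwhile, takewhile

data = [1, 3, 5, 6, 7, 2, 9]

def is_odd(x):
    # Hypothetical predicate used for this example.
    return x % 2 == 1

# islice(iterable, start, stop, step): positions 1, 3 and 5.
sliced = list(islice(data, 1, 6, 2))
print(sliced)  # [3, 6, 2]

# takewhile keeps elements until the predicate first fails.
taken = list(takewhile(is_odd, data))
print(taken)  # [1, 3, 5]

# dropwhile skips elements until the predicate first fails,
# then returns everything that follows.
dropped = list(dropwhile(is_odd, data))
print(dropped)  # [6, 7, 2, 9]
```

Note that takewhile and dropwhile only look at the leading run of elements: the odd values 7 and 9 after the first even number are not treated specially.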

Summary

That covers everything in this article. When using the infinite iterators count, cycle, and repeat, always make sure the iteration is stopped in time.

That's all for this post, I hope it helps you and I hope you'll check back for more from me!