What Exactly Is Python's Iterator Protocol

What exactly is Python's iterator protocol?

It's located here in the docs:

One method needs to be defined for container objects to provide iteration support:

container.__iter__()

Return an iterator object. The object is required to support the iterator protocol described below. If a container supports different types of iteration, additional methods can be provided to specifically request iterators for those iteration types. (An example of an object supporting multiple forms of iteration would be a tree structure which supports both breadth-first and depth-first traversal.) This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.

The iterator objects themselves are required to support the following two methods, which together form the iterator protocol:

iterator.__iter__()

Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements. This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.

iterator.__next__()

Return the next item from the container. If there are no further items, raise the StopIteration exception. This method corresponds to the tp_iternext slot of the type structure for Python objects in the Python/C API.
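As a rough illustration of the methods quoted above, the following sketch drives a list iterator by hand. The variable names are made up for this example.

```python
numbers = [10, 20]
it = iter(numbers)            # calls numbers.__iter__()

# An iterator's own __iter__() returns the iterator itself.
same_object = iter(it) is it

first = next(it)              # calls it.__next__()
second = next(it)

# Once the items run out, __next__() raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(same_object, first, second, exhausted)  # True 10 20 True
```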

What are iterator, iterable, and iteration?

Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.

In Python, iterable and iterator have specific meanings.

An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
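The __getitem__ fallback is easy to overlook, so here is a minimal sketch of it. Squares is a hypothetical class invented for this example; it defines no __iter__ method at all.

```python
class Squares:
    """Hypothetical sequence-like class with no __iter__ method."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # iter() falls back to calling this with 0, 1, 2, ...
        # and stops when IndexError is raised.
        if index < 0 or index >= self.n:
            raise IndexError(index)
        return index * index


squares = list(Squares(4))
print(squares)  # [0, 1, 4, 9]
```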

An iterator is an object with a next (Python 2) or __next__ (Python 3) method.

Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
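What a for loop does automatically can be approximated with an explicit while loop. The helper below, for_each, is a hypothetical name used only for this sketch; it is not how the interpreter is actually implemented.

```python
def for_each(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)          # get an iterator from the iterable
    while True:
        try:
            item = next(iterator)      # ask for the next item
        except StopIteration:
            break                      # no more items: leave the loop
        action(item)


collected = []
for_each("abc", collected.append)
print(collected)  # ['a', 'b', 'c']
```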

A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.

How to build a basic iterator?

Iterator objects in Python conform to the iterator protocol, which basically means they provide two methods: __iter__() and __next__().

  • The __iter__() method returns the iterator object and is implicitly called at the start of loops.

  • The __next__() method returns the next value and is implicitly called on each loop iteration. It raises a StopIteration exception when there are no more values to return, which looping constructs catch implicitly to stop iterating.

Here's a simple example of a counter:

class Counter:
    def __init__(self, low, high):
        self.current = low - 1
        self.high = high

    def __iter__(self):
        return self

    def __next__(self):  # Python 2: def next(self)
        self.current += 1
        if self.current < self.high:
            return self.current
        raise StopIteration

for c in Counter(3, 9):
    print(c)

This will print:

3
4
5
6
7
8

This is easier to write using a generator, as covered in a previous answer:

def counter(low, high):
    current = low
    while current < high:
        yield current
        current += 1

for c in counter(3, 9):
    print(c)

The printed output will be the same. Under the hood, the generator object supports the iterator protocol and does something roughly similar to the class Counter.
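One way to check that a generator object really supports the iterator protocol is to call iter() and next() on it directly. This is a small sketch that repeats the counter generator so it runs on its own.

```python
def counter(low, high):
    current = low
    while current < high:
        yield current
        current += 1


gen = counter(3, 5)

# A generator object is its own iterator.
is_own_iterator = iter(gen) is gen

values = [next(gen), next(gen)]   # [3, 4]

# When the function body finishes, next() raises StopIteration.
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True

print(is_own_iterator, values, finished)  # True [3, 4] True
```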

David Mertz's article, Iterators and Simple Generators, is a pretty good introduction.

The Iterator Protocol. Is it Dark Magic?

A list is iterable, but it is not an iterator. Compare and contrast:

>>> type([])
<class 'list'>
>>> type(iter([]))
<class 'list_iterator'>

Calling iter on a list creates and returns a new iterator object for iterating the contents of that list.
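A short sketch of what "new iterator object" means in practice: each call to iter() on the same list yields an independent iterator with its own position. The variable names are illustrative.

```python
items = ["a", "b", "c"]

first_it = iter(items)
second_it = iter(items)

next(first_it)                  # advance only the first iterator

from_first = next(first_it)     # 'b'
from_second = next(second_it)   # 'a', unaffected by first_it

distinct = first_it is not second_it
print(from_first, from_second, distinct)  # b a True
```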

In your object, you just return a list iterator, specifically an iterator over the list [2, 4, 6], so that object knows nothing about yielding elements 1, 2, 3.

def __iter__(self):
    return iter([2, 4, 6])  # <-- you're returning the list iterator, not your own

Here's a more fundamental implementation conforming to the iterator protocol in Python 2, which doesn't confuse matters by relying on list iterators, generators, or anything fancy at all.

class Iter(object):

    def __iter__(self):
        self.val = 0
        return self

    def next(self):
        self.val += 1
        if self.val > 3:
            raise StopIteration
        return self.val
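For Python 3, the same class would need the method to be named __next__ instead of next. A minimal sketch of that port, keeping the original's choice of resetting the state in __iter__:

```python
class Iter:

    def __iter__(self):
        self.val = 0
        return self

    def __next__(self):  # this is next() in Python 2
        self.val += 1
        if self.val > 3:
            raise StopIteration
        return self.val


result = list(Iter())
print(result)  # [1, 2, 3]
```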

What makes something iterable in Python?

To make a class iterable, write an __iter__() method that returns an iterator:

class MyList(object):
    def __init__(self):
        self.list = [42, 3.1415, "Hello World!"]

    def __iter__(self):
        return iter(self.list)

m = MyList()
for x in m:
    print(x)

prints

42
3.1415
Hello World!

The example uses a list iterator, but you could also write your own iterator by either making __iter__() a generator or by returning an instance of an iterator class that defines a __next__() method.
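As a sketch of the first alternative, here is the same MyList class with __iter__() written as a generator. Because each call to __iter__() creates a fresh generator, the object can be looped over more than once.

```python
class MyList:
    def __init__(self):
        self.list = [42, 3.1415, "Hello World!"]

    def __iter__(self):
        # The yield makes this a generator function, so calling it
        # returns a new generator object, which is an iterator.
        for item in self.list:
            yield item


m = MyList()
first_pass = list(m)
second_pass = list(m)
print(first_pass == second_pass)  # True
```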

How does __iter__ work?

An iterator needs to define two methods: __iter__() and __next__() (next() in Python 2). Usually, the object itself defines the __next__() or next() method, so __iter__() just returns the object itself as the iterator. This creates an iterable that is also its own iterator. These methods are used by for and in statements.
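One consequence of an object being its own iterator is that it can only be traversed once. The sketch below uses a hypothetical Countdown class to show this, along with the abstract base classes in collections.abc that recognize the two methods.

```python
from collections.abc import Iterable, Iterator


class Countdown:
    """Hypothetical class that is both an iterable and its own iterator."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


c = Countdown(3)
first_pass = list(c)    # [3, 2, 1]
second_pass = list(c)   # [] because the iterator is already exhausted

is_iterable = isinstance(c, Iterable)   # True: defines __iter__
is_iterator = isinstance(c, Iterator)   # True: defines __iter__ and __next__
print(first_pass, second_pass, is_iterable, is_iterator)
```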

  • Python 3 docs: docs.python.org/3/library/stdtypes.html#iterator-types

  • Python 2 docs: docs.python.org/2/library/stdtypes.html#iterator-types


