What exactly is Python's iterator protocol?
It's located here in the docs:
One method needs to be defined for container objects to provide iteration support:
container.__iter__()
Return an iterator object. The object is required to support the iterator protocol described below. If a container supports different types of iteration, additional methods can be provided to specifically request iterators for those iteration types. (An example of an object supporting multiple forms of iteration would be a tree structure which supports both breadth-first and depth-first traversal.) This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.
The iterator objects themselves are required to support the following two methods, which together form the iterator protocol:
iterator.__iter__()
Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements. This method corresponds to the tp_iter slot of the type structure for Python objects in the Python/C API.
iterator.__next__()
Return the next item from the container. If there are no further items, raise the StopIteration exception. This method corresponds to the tp_iternext slot of the type structure for Python objects in the Python/C API.
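One way to see these methods in action is to drive the protocol by hand with the built-in iter() and next() functions, which call __iter__() and __next__() for you. This is only an illustrative sketch; the variable names are arbitrary and not taken from the docs.

```python
numbers = [10, 20]      # a container (an iterable), not an iterator

it = iter(numbers)      # calls numbers.__iter__() and returns a list iterator
same = iter(it)         # calls it.__iter__(), which returns the iterator itself

first = next(it)        # calls it.__next__()
second = next(it)

try:
    next(it)            # no items left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

print(same is it, first, second, exhausted)
```

This should print True 10 20 True: the iterator returns itself from __iter__(), hands out each item once, and then signals the end with StopIteration.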
What are iterator, iterable, and iteration?
Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.
In Python, iterable and iterator have specific meanings.
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
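The __getitem__ route is the less familiar one, so here is a minimal sketch of it. The class name Squares and its count parameter are made up for illustration; the point is only that the class defines no __iter__ at all and can still be iterated.

```python
class Squares:
    """Iterable through __getitem__ only; no __iter__ is defined."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        # iter() tries indexes 0, 1, 2, ... until IndexError is raised.
        if index >= self.count:
            raise IndexError(index)
        return index * index


squares = Squares(4)
result = list(squares)   # list() obtains an iterator via iter(squares)
print(result)            # [0, 1, 4, 9]
```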
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
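To make that automatic call visible, a for loop can be written out by hand. The following is a rough Python 3 equivalent, not the interpreter's actual implementation; items and collected are placeholder names.

```python
items = ["a", "b", "c"]

# Roughly what `for item in items: collected.append(item)` does:
collected = []
iterator = iter(items)          # get an iterator from the iterable
while True:
    try:
        item = next(iterator)   # fetch the next item
    except StopIteration:
        break                   # the loop ends quietly when items run out
    collected.append(item)

print(collected)                # ['a', 'b', 'c']
```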
A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.
How to build a basic iterator?
Iterator objects in Python conform to the iterator protocol, which basically means they provide two methods: __iter__() and __next__().
The __iter__() method returns the iterator object and is implicitly called at the start of loops. The __next__() method returns the next value and is implicitly called at each loop increment. It raises a StopIteration exception when there are no more values to return, which looping constructs implicitly catch to stop iterating.
Here's a simple example of a counter:
class Counter:
    def __init__(self, low, high):
        self.current = low - 1
        self.high = high

    def __iter__(self):
        return self

    def __next__(self):  # Python 2: def next(self)
        self.current += 1
        if self.current < self.high:
            return self.current
        raise StopIteration

for c in Counter(3, 9):
    print(c)
This will print:
3
4
5
6
7
8
This is easier to write using a generator, as covered in a previous answer:
def counter(low, high):
    current = low
    while current < high:
        yield current
        current += 1

for c in counter(3, 9):
    print(c)
The printed output will be the same. Under the hood, the generator object supports the iterator protocol and does something roughly similar to the class Counter.
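A short check of that claim, as a sketch in Python 3: the generator object returned by counter() behaves like any other iterator when driven with iter() and next(). The counter function is repeated here so the snippet runs on its own.

```python
def counter(low, high):
    current = low
    while current < high:
        yield current
        current += 1


gen = counter(3, 5)
is_own_iterator = iter(gen) is gen   # a generator returns itself from __iter__()
values = [next(gen), next(gen)]      # each next() resumes the function body

try:
    next(gen)                        # the function body has finished
    finished = False
except StopIteration:
    finished = True

print(is_own_iterator, values, finished)   # True [3, 4] True
```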
David Mertz's article, Iterators and Simple Generators, is a pretty good introduction.
The Iterator Protocol. Is it Dark Magic?
A list is iterable, but it is not an iterator. Compare and contrast:
>>> type([])
list
>>> type(iter([]))
list_iterator
Calling iter on a list creates and returns a new iterator object for iterating the contents of that list.
In your object, you just return a list iterator, specifically an iterator over the list [2, 4, 6], so that object knows nothing about yielding elements 1, 2, 3.
def __iter__(self):
    return iter([2, 4, 6])  # <-- you're returning the list iterator, not your own
Here's a more fundamental implementation conforming to the iterator protocol in Python 2, which doesn't confuse matters by relying on list iterators, generators, or anything fancy at all.
class Iter(object):
    def __iter__(self):
        self.val = 0
        return self

    def next(self):
        self.val += 1
        if self.val > 3:
            raise StopIteration
        return self.val
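For readers on Python 3, the same class might look like the sketch below. The only change assumed here is renaming next to __next__ (and dropping the explicit object base class, which Python 3 no longer needs); the logic is unchanged.

```python
class Iter:
    def __iter__(self):
        self.val = 0          # reset the state each time iteration starts
        return self

    def __next__(self):       # Python 3 name for Python 2's next()
        self.val += 1
        if self.val > 3:
            raise StopIteration
        return self.val


values = list(Iter())
print(values)                 # [1, 2, 3]
```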
What makes something iterable in Python
To make a class iterable, write an __iter__() method that returns an iterator:
class MyList(object):
    def __init__(self):
        self.list = [42, 3.1415, "Hello World!"]

    def __iter__(self):
        return iter(self.list)

m = MyList()
for x in m:
    print(x)
prints
42
3.1415
Hello World!
The example uses a list iterator, but you could also write your own iterator by either making __iter__() a generator or by returning an instance of an iterator class that defines a __next__() method.
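Both alternatives can be sketched side by side. The class names MyListGen, MyListExplicit, and MyListIterator are invented for this illustration and are not part of the original answer; Python 3 is assumed.

```python
class MyListGen:
    """Variant 1: __iter__() is a generator function."""

    def __init__(self):
        self.list = [42, 3.1415, "Hello World!"]

    def __iter__(self):
        # Each call returns a fresh generator, which is itself an iterator.
        for item in self.list:
            yield item


class MyListIterator:
    """Hand-written iterator used by variant 2."""

    def __init__(self, data):
        self._data = data
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._data):
            raise StopIteration
        value = self._data[self._index]
        self._index += 1
        return value


class MyListExplicit:
    """Variant 2: __iter__() returns an instance of an iterator class."""

    def __init__(self):
        self.list = [42, 3.1415, "Hello World!"]

    def __iter__(self):
        return MyListIterator(self.list)


from_generator = list(MyListGen())
from_class = list(MyListExplicit())
print(from_generator == from_class)   # True
```

In both variants every call to __iter__() produces a new iterator, so the container can be looped over more than once.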
How does __iter__ work?
An iterator needs to define two methods: __iter__() and __next__() (next() in Python 2). Usually, the object itself defines the __next__() or next() method, so __iter__() just returns the object itself as the iterator. This creates an iterable that is also itself an iterator. These methods are used by for and in statements.
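One consequence of an object being its own iterator is that it can only be consumed once. The sketch below uses a hypothetical Countdown class (Python 3) to show this, along with the in operator falling back to iteration when no __contains__() is defined.

```python
class Countdown:
    """An iterable that is its own iterator."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        return self               # the object is its own iterator

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining + 1


c = Countdown(3)
first_pass = list(c)              # consumes the iterator
second_pass = list(c)             # nothing left: state is not reset
found = 2 in Countdown(3)         # `in` iterates until it finds a match

print(first_pass, second_pass, found)   # [3, 2, 1] [] True
```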
Python 3 docs: docs.python.org/3/library/stdtypes.html#iterator-types
Python 2 docs: docs.python.org/2/library/stdtypes.html#iterator-types