What are iterator, iterable, and iteration?
Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.
In Python, iterable and iterator have specific meanings.
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
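As a rough illustration of the two routes described above, here is a minimal sketch. The class names Letters and Squares are made up for this example and do not come from any library.

```python
class Letters:
    """Iterable via __iter__: hands out a fresh iterator on every call."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        # Delegate to the built-in string iterator.
        return iter(self.text)


class Squares:
    """Iterable via __getitem__ only: indexed 0, 1, 2, ... until IndexError."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)
        return index * index


letters = list(Letters("abc"))  # ['a', 'b', 'c']
squares = list(Squares(4))      # [0, 1, 4, 9]
```

Neither class keeps any "current position" itself; that state lives in the iterator that iter() produces.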
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
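To make this concrete, here is a small hand-written iterator for Python 3. Countdown is a hypothetical name chosen for this sketch; the point is only that for and map drive it through __next__ without you calling anything yourself.

```python
class Countdown:
    """An iterator: it has __next__, and its __iter__ returns itself."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that there are no more items
        value = self.current
        self.current -= 1
        return value


# The for loop fetches each item through __next__ behind the scenes.
seen = []
for n in Countdown(3):
    seen.append(n)

# map() pulls items from the iterator in the same way.
doubled = list(map(lambda n: n * 2, Countdown(3)))
```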
A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.
What is the difference between iterator and iterable and how to use them?
An Iterable is a simple representation of a series of elements that can be iterated over. It does not have any iteration state such as a "current element". Instead, it has one method that produces an Iterator.
An Iterator is the object with iteration state. It lets you check if it has more elements using hasNext() and move to the next element (if any) using next().
Typically, an Iterable should be able to produce any number of valid Iterators.
Iterable and iterator
The csv.reader object is its own iterator. This is a common practice for iterables which are single-pass (i.e. can only be run through once). We can confirm this by inspection.
>>> data
<_csv.reader object at 0x7fe5d4a057b0>
>>> iter(data)
<_csv.reader object at 0x7fe5d4a057b0> # Note: Same as above
>>> id(data)
140625091516336
>>> id(iter(data))
140625091516336 # Note: Same as above
>>> data is iter(data)
True
Compare this to something like a list, which is an iterable but is not itself an iterator.
>>> lst = [1, 2, 3]
>>> lst
[1, 2, 3]
>>> iter(lst)
<list_iterator object at 0x7fe5d59747f0> # Note: NOT the same as before
>>> lst is iter(lst)
False
This allows us to iterate over a list several times by calling iter(lst) multiple times, since each call gives us a fresh iterator. But your csv.reader object is single-pass, so we only have the one iterator to it.
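The practical consequence is easy to see by going through each object twice. This is only a sketch: an in-memory io.StringIO stands in for a real CSV file, and the variable names are made up for the example.

```python
import csv
import io

# An in-memory buffer stands in for an open CSV file here.
data = csv.reader(io.StringIO("a,b\n1,2\n"))

first_pass = list(data)    # consumes the reader
second_pass = list(data)   # same exhausted iterator, so nothing is left

lst = [1, 2, 3]
list_first = list(lst)     # each pass gets a fresh iterator from iter(lst)
list_second = list(lst)
```

If you need the rows more than once, one option is to keep the result of the first list(data) call around.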
In Python, every iterator is an iterable, but not every iterable is an iterator. From the glossary:
Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted.
Iterator vs Iterable?
An iterator is an iterable, but an iterable is not necessarily an iterator.
An iterable is anything that has an __iter__ method defined - e.g. lists and tuples, as well as iterators.
Iterators are a subset of iterables that hand out their values one at a time through next(). Many of them compute each value lazily, so the values are never all stored in memory at once. Iterators can be created using functions like map, filter and iter, as well as functions using yield.
In your example, map returns an iterator, which is also an iterable, which is why both functions work with it. However, if we take a list for instance:
>>> lst = [1, 2, 3]
>>> list(lst)
[1, 2, 3]
>>> next(lst)
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    next(lst)
TypeError: 'list' object is not an iterator
we can see that next complains, because the list, an iterable, is not an iterator.
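To get next to work with a list, ask the list for an iterator first. A short sketch (variable names are arbitrary):

```python
lst = [1, 2, 3]

it = iter(lst)        # ask the iterable for an iterator
first = next(it)      # 1
second = next(it)     # 2

mapped = map(str, lst)                    # map() already returns an iterator
is_own_iterator = iter(mapped) is mapped  # True: an iterator returns itself
first_mapped = next(mapped)               # '1'
```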
Is an iterator also an iterable?
An iterable needs to implement an __iter__ method or a __getitem__ method:
An object can be iterated over with for if it implements __iter__() or __getitem__().
An iterator needs an __iter__ method (that returns self) and a __next__ method (next in Python 2).
Is it true that an iterator always has an __iter__ method?
Yes!
This is also documented in the Data model:
object.__iter__(self)
This method is called when an iterator is required for a container. This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container.
Iterator objects also need to implement this method; they are required to return themselves. For more information on iterator objects, see Iterator Types.
(Emphasis mine)
As to your second question:
Is an iterator also an iterable?
Yes, because it has an __iter__ method.
Additional notes
Besides the formal implementation it's easy to check if something is iterable by just checking if iter() can be called on it:
def is_iterable(something):
    try:
        iter(something)
    except TypeError:
        return False
    else:
        return True
Likewise it's possible to check if something is an iterator by checking if iter() called on something returns itself:
def is_iterator(something):
    try:
        return iter(something) is something  # it needs to return itself to be an iterator
    except TypeError:
        return False
But don't use them in development code; these are just for "visualization". Mostly you just iterate over something using for ... in ..., or if you need an iterator you use iterator = iter(...) and then process the iterator by calling next(iterator) until it raises StopIteration.
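For completeness, this is roughly what the manual version looks like. consume is a made-up helper name for this sketch; in everyday code a plain for loop does the same job.

```python
def consume(iterable):
    """Collect items by hand using iter() and next()."""
    iterator = iter(iterable)
    items = []
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # the iterator is exhausted
        items.append(item)
    return items


result = consume("abc")  # ['a', 'b', 'c']

# next() also accepts a default, which is returned instead of raising.
nothing = next(iter([]), None)
```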
Why is Java's Iterator not an Iterable?
Because an iterator generally points to a single instance in a collection. Iterable implies that one may obtain an iterator from an object to traverse over its elements - and there's no need to iterate over a single instance, which is what an iterator represents.
Confusion about iterators and iterables in Python
The documentation is creating some confusion here, by re-using the term 'iterator'.
There are three components to the iterator protocol:
1. Iterables; things you can potentially iterate over and get their elements, one by one.
2. Iterators; things that do the iteration. Every time you want to step through all items of an iterable, you need one of these to keep track of where you are in the process. These are not re-usable; once you reach the end, that's it. For most iterables, you can create multiple independent iterators, each tracking position independently.
3. Consumers of iterators; those things that want to do something with the items.
A for loop is an example of the latter, so #3. A for loop uses the iter() function to produce an iterator (#2 above) for whatever you want to loop over, so that "whatever" must be an iterable (#1 above).
range() is an example of #1; it is an iterable object. You can iterate over it multiple times, independently:
>>> r = range(5)
>>> r_iter_1 = iter(r)
>>> next(r_iter_1)
0
>>> next(r_iter_1)
1
>>> r_iter_2 = iter(r)
>>> next(r_iter_2)
0
>>> next(r_iter_1)
2
Here r_iter_1 and r_iter_2 are two separate iterators, and each time you ask for a next item they do so based on their own internal bookkeeping.
list() is an example of both an iterable (#1) and an iteration consumer (#3). If you pass another iterable (#1) to the list() call, a list object is produced containing all elements from that iterable. But list objects themselves are also iterables.
zip(), in Python 3, takes in multiple iterables (#1), and is itself an iterator (#2). zip() stores a new iterator (#2) for each of the iterables you gave it. Each time you ask zip() for the next element, zip() builds a new tuple with the next elements from each of the contained iterables:
>>> lst1, lst2 = ['foo', 'bar'], [42, 81]
>>> zipit = zip(lst1, lst2)
>>> next(zipit)
('foo', 42)
>>> next(zipit)
('bar', 81)
So in the end, list(zip(lst1, lst2)) uses both lst1 and lst2 as iterables (#1); zip() consumes those (#3) when it itself is being consumed by the outer list() call.
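A quick way to see that the zip object is an iterator while the lists are reusable iterables is to consume it twice. A small sketch for Python 3:

```python
lst1, lst2 = ['foo', 'bar'], [42, 81]

zipit = zip(lst1, lst2)
pairs = list(zipit)      # list() pulls tuples out of zip until it is exhausted
leftover = list(zipit)   # zip is an iterator, so a second pass yields nothing

# The lists themselves are untouched and can be zipped again.
pairs_again = list(zip(lst1, lst2))
```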
What exactly does iterable mean in Python? Why isn't my object which implements `__getitem__()` an iterable?
I think the point of confusion here is that, although implementing __getitem__ does allow you to iterate over an object, it isn't part of the interface defined by Iterable.
The abstract base classes allow a form of virtual subclassing, where classes that implement the specified methods (in the case of Iterable, only __iter__) are considered by isinstance and issubclass to be subclasses of the ABCs even if they don't explicitly inherit from them. It doesn't check whether the method implementation actually works, though, just whether or not it's provided.
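Both halves of that can be checked in a few lines. This is a sketch with two made-up classes, OnlyGetitem and BrokenIter; note that in current Python 3 the ABC is imported from collections.abc.

```python
from collections.abc import Iterable


class OnlyGetitem:
    """Iterable in practice, but only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


class BrokenIter:
    """Provides __iter__, but it does not return an iterator."""

    def __iter__(self):
        return None


getitem_is_iterable = isinstance(OnlyGetitem(), Iterable)  # False
getitem_items = list(OnlyGetitem())                        # [0, 1, 2]

broken_is_iterable = isinstance(BrokenIter(), Iterable)    # True
try:
    iter(BrokenIter())
    broken_works = True
except TypeError:
    broken_works = False  # iter() rejects the non-iterator return value
```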
For more information, see PEP-3119, which introduced ABCs.
using isinstance(e, collections.Iterable) is the most pythonic way to check if an object is iterable
I disagree; I would use duck-typing and just attempt to iterate over the object. If the object isn't iterable a TypeError will be raised, which you can catch in your function if you want to deal with non-iterable inputs, or allow to percolate up to the caller if not. This completely side-steps how the object has decided to implement iteration, and just finds out whether or not it does at the most appropriate time.
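As a sketch of what that can look like, here is a hypothetical helper, total, that accepts either a single number or any iterable of numbers. The name and the fallback behaviour are assumptions made for the example.

```python
def total(values):
    """Sum the items of values, treating a non-iterable as a single item."""
    try:
        iterator = iter(values)
    except TypeError:
        return values  # not iterable: treat it as one value
    return sum(iterator)


from_list = total([1, 2, 3])   # 6
from_scalar = total(5)         # 5
```

Calling iter() inside the try, rather than wrapping a whole for loop, keeps the except clause from swallowing a TypeError raised by unrelated code in the loop body.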
To add a little more, I think the docs you've quoted are slightly misleading. To quote the iter docs, which perhaps clear this up:
object must be a collection object which supports the iteration protocol (the __iter__() method), or it must support the sequence protocol (the __getitem__() method with integer arguments starting at 0).
This makes it clear that, although both protocols make the object iterable, only one is the actual "iteration protocol", and it is this that isinstance(thing, Iterable) tests for. Therefore we could conclude that one way to check for "things you can iterate over" in the most general case would be:
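isinstance(thing, (Iterable, Sequence))
although Sequence, unlike Iterable, is not detected from the methods alone: besides implementing __len__ along with __getitem__, the class has to inherit from Sequence or be registered with it to count as a "virtual sub-class".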