When should iteritems() be used instead of items()?
In Python 2.x, .items() returned a list of (key, value) pairs. In Python 3.x, .items() returns a view object (dict_items), which behaves differently: it has to be iterated over or materialised. So list(dict.items()) is required to get what dict.items() returned in Python 2.x.
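A minimal Python 3 sketch of the difference between the view and a materialised list. The dictionary contents are made up for illustration:

```python
# Python 3: items() returns a live view, not a list.
d = {"a": 1, "b": 2}

view = d.items()            # dict_items view, tied to the dict
snapshot = list(d.items())  # independent list of (key, value) tuples

d["c"] = 3  # mutate the dict after creating both

print(len(view))      # the view reflects the new entry
print(len(snapshot))  # the list still holds the old contents
```

The list is a snapshot taken at call time, which is what Python 2's items() gave you.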
Python 2.7 also has a partial backport of this behaviour in the viewkeys, viewitems and viewvalues methods. The most useful is viewkeys, which behaves more like a set (which you'd expect from the keys of a dict).
Simple example:
common_keys = list(dict_a.viewkeys() & dict_b.viewkeys())
This gives you a list of the common keys. In Python 3.x, just use .keys() instead.
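In Python 3 the same idea might look like the sketch below. The names dict_a and dict_b follow the example above; their contents are hypothetical, and sorted() is used only to get a deterministic order:

```python
dict_a = {"x": 1, "y": 2, "z": 3}
dict_b = {"y": 20, "z": 30, "w": 40}

# keys() views support set operations such as & (intersection).
common_keys = sorted(dict_a.keys() & dict_b.keys())
print(common_keys)
```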
Python 3.x has generally been made more "lazy": map is now effectively itertools.imap, zip is itertools.izip, etc.
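A small Python 3 sketch of what that laziness means in practice. The inputs are arbitrary; the point is that map and zip return one-shot iterators rather than lists:

```python
squares = map(lambda n: n * n, [1, 2, 3])
pairs = zip("ab", [1, 2])

# Nothing has been computed yet; list() drives the iteration.
first_pass = list(squares)
second_pass = list(squares)  # the iterator is already exhausted

pair_list = list(pairs)
print(first_pass, second_pass, pair_list)
```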
What is the difference between dict.items() and dict.iteritems() in Python2?
It's part of an evolution.
Originally, Python's items() built a real list of tuples and returned it. That could take a lot of extra memory.
Then generators were introduced to the language, and the method was reimplemented as an iterator-generator method named iteritems(). The original remains for backwards compatibility.
One of Python 3's changes is that items() now returns a view, and a list is never fully built. The iteritems() method is also gone, since items() in Python 3 works like viewitems() in Python 2.7.
Error: 'dict' object has no attribute 'iteritems'
As you are on Python 3, use dict.items() instead of dict.iteritems().
iteritems() was removed in Python 3, so you can't use this method anymore.
Take a look at Python 3.0 Wiki Built-in Changes section, where it is stated:
Removed dict.iteritems(), dict.iterkeys(), and dict.itervalues().
Instead: use dict.items(), dict.keys(), and dict.values() respectively.
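A short Python 3 sketch that reproduces the error and shows the replacement. The dictionary is a made-up example:

```python
d = {"a": 1, "b": 2}

# Python 3 dicts no longer have iteritems().
has_iteritems = hasattr(d, "iteritems")

try:
    d.iteritems()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# The replacement: items() iterates lazily over (key, value) pairs.
pairs = [(k, v) for k, v in d.items()]
print(has_iteritems, error_message, pairs)
```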
What is the advantage of iteritems?
To answer your question, we should first look at how and when iteritems() was added to the API.
The iteritems() method was added in Python 2.2, following the introduction of iterators and generators in the language (see also: What is the difference between dict.items() and dict.iteritems()?). In fact, the method is explicitly mentioned in PEP 234. So it was introduced as a lazy alternative to the already present items().
This followed the same pattern as file.xreadlines() versus file.readlines(); xreadlines() was introduced in Python 2.1 (and already deprecated in Python 2.3, by the way).
In Python 2.3 the itertools module was added, which introduced lazy counterparts to map, filter, etc.
In other words, at the time there was (and still is) a strong trend towards laziness of operations. One of the reasons is to improve memory efficiency. Another is to avoid unneeded computation.
I cannot find any reference that says it was introduced to improve the speed of looping over the dictionary. It was simply used to replace calls to items() that didn't actually have to return a list. Note that this includes more use cases than just a simple for loop.
For example in the code:
function(dictionary.iteritems())
you cannot simply use a for loop to replace iteritems() as in your example. You'd have to write a function (or use a genexp, even though they weren't available when iteritems() was introduced, and they wouldn't be DRY...).
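To illustrate passing the pairs straight to a function, here is a sketch written for Python 3, where items() plays the role that iteritems() had in Python 2. The prices dictionary is hypothetical:

```python
prices = {"apple": 3, "pear": 5, "fig": 4}

# Any function that accepts an iterable can consume the pairs directly,
# without an explicit for loop and without building an intermediate list.
most_expensive = max(prices.items(), key=lambda item: item[1])
by_price = sorted(prices.items(), key=lambda item: item[1])
inverted = dict((value, key) for key, value in prices.items())

print(most_expensive, by_price, inverted)
```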
Retrieving the items from a dict is done pretty often, so it does make sense to provide a built-in method and, in fact, there was one: items(). The problem with items() is that:
- It isn't lazy, meaning that calling it on a big dict can take quite some time.
- It takes a lot of memory. It can almost double the memory usage of a program if called on a very big dict that contains most of the objects being manipulated.
- Most of the time it is iterated only once.
So, when introducing iterators and generators, it was obvious to just add a lazy counterpart. If you need a list of items because you want to index it or iterate more than once, use items(); otherwise you can just use iteritems() and avoid the problems cited above.
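The memory point can be sketched in Python 3 by comparing the lazy view with a materialised list, which is roughly what Python 2's items() built. This assumes CPython, where sys.getsizeof is available; it measures only the container itself, not the tuples inside it, and the dictionary is synthetic:

```python
import sys

big = {i: str(i) for i in range(100_000)}

# The view is a small fixed-size object that refers back to the dict.
view_size = sys.getsizeof(big.items())

# The list needs one slot per item (plus the tuples themselves, not counted here).
list_size = sys.getsizeof(list(big.items()))

print(view_size, list_size)
```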
The advantages of using iteritems() are the same as those of using items() versus manually getting the value:
- You write less code, which makes it more DRY and reduces the chances of errors
- Code is more readable.
Plus the advantages of laziness.
As I already stated, I cannot reproduce your performance results. On my machine iteritems() is always faster than iterating and looking up by key. The difference is negligible anyway, and it's probably due to how the OS handles caching and memory in general. In other words, efficiency isn't a strong argument for or against either alternative.
Given equal performance on average, use the most readable and concise alternative: iteritems(). This discussion would be similar to asking "why use a foreach when you can just loop by index with the same performance?". The importance of foreach isn't that you iterate faster but that you avoid writing boilerplate code and improve readability.
I'd like to point out that iteritems() was in fact removed in Python 3. This was part of the "cleanup" of that version. Python 3's items() method is (mostly) equivalent to Python 2's viewitems() method (which is actually a backport, if I'm not mistaken...).
This version is lazy (and thus provides a replacement for iteritems()) and also has further functionality, such as "set-like" operations (for example, finding common items between dicts in an efficient way). So in Python 3 the reasons to use items() instead of manually retrieving the values are even more compelling.
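A sketch of those set-like operations in Python 3. The two configuration dicts are invented for the example, and the values must be hashable for this to work:

```python
old = {"host": "localhost", "port": 8080, "debug": True}
new = {"host": "localhost", "port": 9090, "debug": True}

# Intersection: (key, value) pairs present in both dicts.
unchanged = dict(old.items() & new.items())

# Difference: pairs in new that are not in old.
changed = dict(new.items() - old.items())

print(unchanged, changed)
```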
Python 2 and 3 compatible way of iterating through dict with key and value
You can simply use dict.items() in both Python 2 and 3:
foo = [key for key, value in some_dict.items() if value['marked']]
Or you can simply roll your own items generator, like this:
def get_items(dict_object):
    # Lazily yield (key, value) pairs; works on Python 2 and 3.
    for key in dict_object:
        yield key, dict_object[key]
And then use it like this
for key, value in get_items({1: 2, 3: 4}):
    print(key, value)  # parentheses keep this line valid on Python 3 too
Difference between normal list and dict.items()
In Python 3, dict.items() (and also .keys() and .values()) returns a special dictionary view object. It is iterable, but it isn't a list, so it cannot be indexed.
#!/usr/bin/env python3
d = {}
d['a'] = 1
d['b'] = 2
# You can pack items() into a list and then it's a "real" list
l = list(d.items())
print(repr(l[1]))
# Or you can use itertools or otherwise use it as a plain iterator
import itertools
for p in itertools.islice(d.items(), 1, 2):
    print(repr(p))
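As a further sketch of how the view differs from a list in Python 3 (the dictionary is again a toy example): indexing fails, but looping over the same view more than once is fine.

```python
d = {"a": 1, "b": 2}
items = d.items()

# A view does not support indexing.
try:
    items[0]
    indexable = True
except TypeError:
    indexable = False

# Unlike a one-shot iterator, the view can be looped over repeatedly.
first_loop = [key for key, _ in items]
second_loop = [key for key, _ in items]

print(indexable, first_loop, second_loop)
```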
Using .iteritems() to iterate over key, value in Python dictionary
The other answer explains it well. But here are some further illustrations for how it behaves, by showing cases where it actually works without error (so you can see something):
>>> d = {(1,2): 3, (4,5): 6}
>>> for k, v in d:
        print k, v
1 2
4 5
The loop goes through the keys (1,2) and (4,5), and since those "happen to be" tuples of size 2, they can be assigned to k and v.
Works with strings as well, as long as they have exactly two characters:
>>> d = {"AB":3, "CD":6}
>>> for k, v in d:
        print k, v
A B
C D
I assume in your case it was something like this?
>>> d = {"ABC":3, "CD":6}
>>> for k, v in d:
        print k, v
Traceback (most recent call last):
  File "<pyshell#42>", line 1, in <module>
    for k, v in d:
ValueError: too many values to unpack
Here, the key "ABC" has three characters, so Python complains about trying to unpack it into just two variables.
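The same failure and its fix can be sketched in Python 3 syntax, reusing the dictionary from the session above. Iterating the dict directly yields keys; items() yields the (key, value) pairs the loop expects:

```python
d = {"ABC": 3, "CD": 6}

# Iterating the dict itself tries to unpack each key into k and v.
try:
    for k, v in d:
        pass
    failed = False
except ValueError:
    failed = True

# Iterating items() unpacks each (key, value) pair instead.
pairs = [(k, v) for k, v in d.items()]

print(failed, pairs)
```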
Python 2.7 intentional use of .items over .iteritems
My guess is that this was simply a bug that went unnoticed. The Counter class seems to have been hastily created.
For example, Counter doesn't contain many in-place methods, such as __iadd__ and __isub__, which makes it inefficient to repeatedly use:
c = Counter()
for other in list_of_counters:
    c += other
Nonetheless, these were added in Python 3. There doesn't seem to be any reason why they shouldn't have been implemented originally; the omission is probably due to the same oversight.
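One way to accumulate counters in place, sketched here with Counter.update, which adds counts into the existing object. The sample counters are made up; note that update, unlike +=, keeps zero and negative counts, so the two are only interchangeable when all counts are positive:

```python
from collections import Counter

list_of_counters = [Counter("aab"), Counter("bcc"), Counter("a")]

total = Counter()
for other in list_of_counters:
    total.update(other)  # adds counts into total without creating a new Counter

print(total)
```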
What is the difference between pandas.Series.items() and pandas.Series.iteritems()?
Series.iteritems() just calls Series.items() under the hood; see the source code below:
def iteritems(self) -> Iterable[tuple[Hashable, Any]]:
    return self.items()
Pandas Source
As a result, either works in pandas versions that still provide iteritems(), although Series.items() is preferred: iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0.
Why does Python 3 need dict.items to be wrapped with list()?
You can safely ignore this "extra precautions" warning: your code will work the same without list in both versions of Python. It would behave differently only if you needed a list (which is not the case here): features.items() is a list in Python 2, but a view in Python 3. Both work the same when used as an iterable, as in your example.
Now, the Python 2 to Python 3 conversion tool 2to3 errs on the side of safety and assumes that you really wanted a list when you use dict.items(). This may not be the case (as in the question), in which case dict.items() in Python 3 (with no wrapping list) is better: faster and less memory-consuming, since no list is built.
Concretely, this means that Python 2 code can explicitly iterate over the view: for k, v in features.viewitems() (which 2to3 will convert to features.items() in Python 3). It looks like your IDE thinks that the code is Python 2, because your for statement is perfectly fine in Python 3, so there should be no warning about Python 3 support.
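For contrast, here is a Python 3 sketch of a case where the wrapping list does matter: changing the dict's size while looping over it. The name features follows the question; its contents are hypothetical.

```python
features = {"a": 1, "b": 0, "c": 2}

# Deleting entries while iterating the live view raises RuntimeError.
try:
    for k, v in features.items():
        if v == 0:
            del features[k]
    raised = False
except RuntimeError:
    raised = True

# Iterating over a snapshot list makes the same loop safe.
features = {"a": 1, "b": 0, "c": 2}
for k, v in list(features.items()):
    if v == 0:
        del features[k]

print(raised, features)
```

When the loop only reads from the dict, as in the question, the plain view is enough.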