What Does "List Comprehension" and Similar Mean? How Does It Work and How to Use It

What does list comprehension and similar mean? How does it work and how can I use it?

From the documentation:

List comprehensions provide a concise way to create lists. Common applications are to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.


About your question, the list comprehension does the same thing as the following "plain" Python code:

>>> l = []
>>> for x in range(10):
...     l.append(x**2)
...
>>> l
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

How do you write it in one line? Hmm...we can...probably...use map() with lambda:

>>> list(map(lambda x: x**2, range(10)))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

But isn't it clearer and simpler to just use a list comprehension?

>>> [x**2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Basically, we can do anything with x. Not only x**2. For example, run a method of x:

>>> [x.strip() for x in ('foo\n', 'bar\n', 'baz\n')]
['foo', 'bar', 'baz']

Or use x as another function's argument:

>>> [int(x) for x in ('1', '2', '3')]
[1, 2, 3]

We can also, for example, use x as the key of a dict object. Let's see:

>>> d = {'foo': '10', 'bar': '20', 'baz': '30'}
>>> [d[x] for x in ['foo', 'baz']]
['10', '30']

How about a combination?

>>> d = {'foo': '10', 'bar': '20', 'baz': '30'}
>>> [int(d[x].rstrip('0')) for x in ['foo', 'baz']]
[1, 3]

And so on.


You can also use if or if...else in a list comprehension. For example, you only want odd numbers in range(10). You can do:

>>> l = []
>>> for x in range(10):
...     if x%2:
...         l.append(x)
...
>>> l
[1, 3, 5, 7, 9]

Ah that's too complex. What about the following version?

>>> [x for x in range(10) if x%2]
[1, 3, 5, 7, 9]

To use an if...else ternary expression, you need to put the if ... else ... after x (before the for), not after range(10):

>>> [i if i%2 != 0 else None for i in range(10)]
[None, 1, None, 3, None, 5, None, 7, None, 9]
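The two forms do different jobs and can be combined: a trailing if decides which items are kept, while a leading if ... else decides what value each kept item produces. A small sketch of the difference (the variable names are illustrative only):

```python
numbers = range(10)

# Trailing `if` filters: only odd numbers survive.
odds = [n for n in numbers if n % 2]

# Leading `if ... else` maps: every number yields exactly one result.
labels = ["odd" if n % 2 else "even" for n in numbers]

# Both together: drop 0 first, then label whatever remains.
labelled = [(n, "odd" if n % 2 else "even") for n in numbers if n > 0]

print(odds)          # [1, 3, 5, 7, 9]
print(labels[:4])    # ['even', 'odd', 'even', 'odd']
print(labelled[:3])  # [(1, 'odd'), (2, 'even'), (3, 'odd')]
```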

Have you heard about nested list comprehension? You can put two or more fors in one list comprehension. For example:

>>> [i for x in [[1, 2, 3], [4, 5, 6]] for i in x]
[1, 2, 3, 4, 5, 6]

>>> [j for x in [[[1, 2], [3]], [[4, 5], [6]]] for i in x for j in i]
[1, 2, 3, 4, 5, 6]

Let's talk about the first part: for x in [[1, 2, 3], [4, 5, 6]] yields the sublists [1, 2, 3] and [4, 5, 6] in turn. Then for i in x yields 1, 2, 3 from the first sublist and 4, 5, 6 from the second.

Warning: You always need to put for x in [[1, 2, 3], [4, 5, 6]] before for i in x:

>>> [j for j in x for x in [[1, 2, 3], [4, 5, 6]]]
Traceback (most recent call last):
  File "<input>", line 1, in <module>
NameError: name 'x' is not defined
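One way to remember the required order is that the for clauses read left to right exactly like nested for statements read top to bottom. A minimal sketch comparing the two (nested, flat_loop and flat_comp are names chosen for this example):

```python
nested = [[1, 2, 3], [4, 5, 6]]

# Explicit nested loops: the outer loop comes first.
flat_loop = []
for x in nested:          # corresponds to the first `for` clause
    for i in x:           # corresponds to the second `for` clause
        flat_loop.append(i)

# The comprehension keeps the same outer-to-inner order.
flat_comp = [i for x in nested for i in x]

print(flat_loop)               # [1, 2, 3, 4, 5, 6]
print(flat_loop == flat_comp)  # True
```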

We also have set comprehensions, dict comprehensions, and generator expressions.

Set comprehensions and list comprehensions are basically the same, but the former produces a set instead of a list:

>>> {x for x in [1, 1, 2, 3, 3, 1]}
{1, 2, 3}

It's the same as:

>>> set([i for i in [1, 1, 2, 3, 3, 1]])
{1, 2, 3}
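As with list comprehensions, the expression on the left can transform each item before it goes into the set. A short sketch, using a made-up list of words, that collects the distinct lowercase forms:

```python
words = ["Apple", "apple", "BANANA", "banana", "Cherry"]

# Normalise each word, then let the set discard the duplicates.
unique_lower = {w.lower() for w in words}

# Sets are unordered, so sort before printing for a stable result.
print(sorted(unique_lower))  # ['apple', 'banana', 'cherry']
```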

A dict comprehension looks like a set comprehension, but it uses {key: value for key, value in ...} or {i: i for i in ...} instead of {i for i in ...}.

For example:

>>> {i: i**2 for i in range(5)}
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

And it equals:

>>> d = {}
>>> for i in range(5):
...     d[i] = i**2
...
>>> d
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Does (i for i in range(5)) give a tuple? No! It's a generator expression, which returns a generator:

>>> (i for i in range(5))
<generator object <genexpr> at 0x7f52703fbca8>

It's the same as:

>>> def gen():
...     for i in range(5):
...         yield i
...
>>> gen()
<generator object gen at 0x7f5270380db0>

And you can use it as a generator:

>>> gen = (i for i in range(5))
>>> next(gen)
0
>>> next(gen)
1
>>> list(gen)
[2, 3, 4]
>>> next(gen)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration

Note: When a generator expression is the only argument to a function that accepts any iterable, you can drop the list brackets (and the extra parentheses). For example, sum():

>>> sum(i**2 for i in range(5))
30
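Dropping the brackets is more than a cosmetic change: a generator expression produces items on demand, so a consumer that stops early never computes the rest. The sketch below illustrates this with any(); noisy_squares is a hypothetical helper written only to record which items were actually computed.

```python
def noisy_squares(n, log):
    """Yield i**2 for i in range(n), recording each i that is computed."""
    for i in range(n):
        log.append(i)
        yield i ** 2

# List comprehension: all 100 squares are computed before any() runs.
log_list = []
found_list = any([sq > 10 for sq in noisy_squares(100, log_list)])

# Generator expression: any() stops at the first square greater than 10.
log_gen = []
found_gen = any(sq > 10 for sq in noisy_squares(100, log_gen))

print(found_list, len(log_list))  # True 100
print(found_gen, len(log_gen))    # True 5
```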

Related (about generators): Understanding Generators in Python.

How does List Comprehension exactly work in Python?

Look at the actual bytecode that is produced. I've put the two fragments of code into functions called f1 and f2.

The comprehension does this:

  3          15 LOAD_CONST               3 (<code object <listcomp> at 0x7fbf6c1b59c0, file "<stdin>", line 3>)
             18 LOAD_CONST               4 ('f1.<locals>.<listcomp>')
             21 MAKE_FUNCTION            0
             24 LOAD_FAST                0 (L)
             27 GET_ITER
             28 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             31 STORE_FAST               0 (L)

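Notice there is no loop in this bytecode. The loop has not disappeared: it lives in the separate <listcomp> code object loaded on the first line, which f1 calls like a function. Inside that code object the loop is still driven by FOR_ITER, but each result is added with the dedicated LIST_APPEND opcode.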

Now the for loop does this:

  4          21 SETUP_LOOP              31 (to 55)
             24 LOAD_FAST                0 (L)
             27 GET_ITER
        >>   28 FOR_ITER                23 (to 54)
             31 STORE_FAST               2 (x)
             34 LOAD_FAST                1 (res)
             37 LOAD_ATTR                1 (append)
             40 LOAD_FAST                2 (x)
             43 LOAD_CONST               3 (2)
             46 BINARY_POWER
             47 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             50 POP_TOP
             51 JUMP_ABSOLUTE           28
        >>   54 POP_BLOCK

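In contrast to the comprehension, the loop appears directly in the function's own bytecode, and every iteration looks up res.append (LOAD_ATTR) and then calls it (CALL_FUNCTION).

The bytecodes are different, and the first is usually faster, mainly because it avoids that attribute lookup and method call on every iteration.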
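To reproduce the comparison yourself, a sketch along the following lines can be used. The bodies of f1 and f2 are reconstructions, since the original definitions are not shown here, and the exact opcodes and timings depend on the CPython version, so no expected output is given.

```python
import dis
import timeit

def f1(L):
    # Reconstructed comprehension version (assumed, not the original source).
    return [x ** 2 for x in L]

def f2(L):
    # Reconstructed explicit-loop version (assumed, not the original source).
    res = []
    for x in L:
        res.append(x ** 2)
    return res

data = list(range(1000))

# Both versions must build the same list.
same_result = f1(data) == f2(data)
print(same_result)

# Show the bytecode of each function; nested code objects are included.
dis.dis(f1)
dis.dis(f2)

# Rough timing comparison; the numbers vary by machine and Python version.
t1 = timeit.timeit(lambda: f1(data), number=200)
t2 = timeit.timeit(lambda: f2(data), number=200)
print(f"comprehension: {t1:.4f}s  loop: {t2:.4f}s")
```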

Python Loops and List Comprehension Connection

Here is a short explanation of how loops and list comprehensions relate.

Loops are used for iterating through Lists, Tuples and other Iterables

items = [1, 3, 6]
for item in items:
    print(item)
> 1
> 3
> 6

List Comprehensions are used for creating new lists from another Iterable

items = [1, 3, 6]
double_items = [item * 2 for item in items]
print(double_items)
> [2, 6, 12]

You can also filter items with List Comprehensions like this

items = [1, 3, 6, 8]
even_items = [item for item in items if item % 2 == 0]
print(even_items)
> [6, 8]

Why results of map() and list comprehension are different?

They are different because the value of i in both the generator expression and the list comprehension is evaluated lazily, i.e. when the anonymous functions are invoked in f.

By that time, i is bound to the last value of t, which is -1.

So basically, this is what the list comprehension does (likewise for the genexp):

x = []
i = 1 # 1. from t
x.append(lambda: i)
i = -1 # 2. from t
x.append(lambda: i)

Now the lambdas carry around a closure that references i, but i is bound to -1 in both cases, because that is the last value it was assigned to.

If you want to make sure that the lambda receives the current value of i, do

f(*[lambda u=i: u for i in t])

This way, you force the evaluation of i at the time the closure is created.
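The effect is easy to reproduce in isolation. In this sketch t is assumed to be [1, -1], matching the values in the walkthrough above; the lambdas are called directly instead of being passed to f, which is not shown here.

```python
t = [1, -1]

# Late binding: every lambda looks up i only when it is called,
# and by then the loop has finished with i == -1.
late = [lambda: i for i in t]
print([fn() for fn in late])   # [-1, -1]

# Early binding: the default argument u=i is evaluated when each
# lambda is created, so each one keeps its own value.
early = [lambda u=i: u for i in t]
print([fn() for fn in early])  # [1, -1]
```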

Edit: In Python 2 there is one difference between generator expressions and list comprehensions: the latter leak the loop variable into the surrounding scope. In Python 3, list comprehensions have their own scope and no longer leak it.
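The scoping behaviour can be checked directly. A minimal sketch, assuming Python 3 (comp_var and loop_var are illustrative names): the comprehension's variable is not visible afterwards, whereas an ordinary for loop leaves its variable bound to the last item.

```python
squares = [comp_var * comp_var for comp_var in range(3)]

# On Python 3 the comprehension variable does not exist out here.
try:
    comp_var
except NameError:
    comp_leaked = False
else:
    comp_leaked = True

# A plain for loop, by contrast, keeps its variable after the loop ends.
for loop_var in range(3):
    pass

print(comp_leaked)  # False on Python 3
print(loop_var)     # 2
```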

Is list comprehension implemented via map and lambda function?

No, list comprehensions are not implemented by map and lambda under the hood, not in CPython and not in PyPy3 either.

CPython (3.9.13 here) compiles the list comprehension into a special code object that outputs a list and calls it as a function:

~ $ echo 'x = [a + 1 for a in [1, 2, 3, 4]]' | python3 -m dis
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x107446f50, file "<stdin>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_CONST               2 ((1, 2, 3, 4))
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 STORE_NAME               0 (x)
             14 LOAD_CONST               3 (None)
             16 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x107446f50, file "<stdin>", line 1>:
  1           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                12 (to 18)
              6 STORE_FAST               1 (a)
              8 LOAD_FAST                1 (a)
             10 LOAD_CONST               0 (1)
             12 BINARY_ADD
             14 LIST_APPEND              2
             16 JUMP_ABSOLUTE            4
        >>   18 RETURN_VALUE

Whereas the equivalent list(map(lambda: ...)) thing is just function calls:

~ $ echo 'x = list(map(lambda a: a + 1, [1, 2, 3, 4]))' | python3 -m dis
  1           0 LOAD_NAME                0 (list)
              2 LOAD_NAME                1 (map)
              4 LOAD_CONST               0 (<code object <lambda> at 0x102701f50, file "<stdin>", line 1>)
              6 LOAD_CONST               1 ('<lambda>')
              8 MAKE_FUNCTION            0
             10 BUILD_LIST               0
             12 LOAD_CONST               2 ((1, 2, 3, 4))
             14 LIST_EXTEND              1
             16 CALL_FUNCTION            2
             18 CALL_FUNCTION            1
             20 STORE_NAME               2 (x)
             22 LOAD_CONST               3 (None)
             24 RETURN_VALUE

Disassembly of <code object <lambda> at 0x102701f50, file "<stdin>", line 1>:
  1           0 LOAD_FAST                0 (a)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 RETURN_VALUE

Breaking down list comprehension in Python

Original expression with list comprehension:

units = dict((s, [u for u in unitlist if s in u]) for s in boxes)

Classical reproduction:

units = {}

for s in boxes:
    values = []
    for u in unitlist:
        if s in u:
            values.append(u)

    units[s] = values

Your original expression says "make a dict() composed of key, value pairs with s as key and a sublist of unitlist as value"

Your expression also contains a condition: every s in boxes becomes a key in your units dict, and its value is the list of those entries u of unitlist that contain s (possibly all of unitlist, possibly an empty list).
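A tiny worked example may make this concrete. The contents of boxes and unitlist below are made up for illustration, since the real data is not shown; the sketch also writes the same expression as a dict comprehension, which is an alternative spelling rather than the original code.

```python
# Hypothetical sample data, chosen only to keep the output small.
boxes = ["A1", "A2", "B1"]
unitlist = [["A1", "A2"], ["A1", "B1"], ["A2"]]

# The original form: dict() fed by a generator of (key, value) pairs.
units_ctor = dict((s, [u for u in unitlist if s in u]) for s in boxes)

# The same mapping written as a dict comprehension.
units_comp = {s: [u for u in unitlist if s in u] for s in boxes}

print(units_comp["A1"])          # [['A1', 'A2'], ['A1', 'B1']]
print(units_comp["B1"])          # [['A1', 'B1']]
print(units_ctor == units_comp)  # True
```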

How do 'for' loops in lists work in Python?

It's called a list comprehension, and it is basically a quick way to build a list. The code you show basically means:

for each x in my_list, perform x.split(","), and then put all the results in a new list, which is then assigned back to my_list.

It is equivalent to:

new_list = []

for x in my_list:
    y = x.split(",")
    new_list.append(y)

my_list = new_list

So you can see with list comprehensions it is a lot simpler.
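For comparison, here is the one-line form run on some sample input. The strings in my_list are invented for this sketch, and the comprehension is inferred from the description above rather than copied from the question.

```python
my_list = ["a,b", "c,d,e", "f"]

# Split every string on commas and rebind my_list to the new list of lists.
my_list = [x.split(",") for x in my_list]

print(my_list)  # [['a', 'b'], ['c', 'd', 'e'], ['f']]
```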

Create a dictionary with comprehension

Use a dict comprehension (Python 2.7 and later):

{key: value for (key, value) in iterable}

Alternatively, for simpler cases or earlier versions of Python, use the dict constructor, e.g.:

pairs = [('a', 1), ('b', 2)]
dict(pairs) #=> {'a': 1, 'b': 2}
dict([(k, v+1) for k, v in pairs]) #=> {'a': 2, 'b': 3}

Given separate arrays of keys and values, use the dict constructor with zip:

keys = ['a', 'b']
values = [1, 2]
dict(zip(keys, values)) #=> {'a': 1, 'b': 2}
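When the keys or values need to be transformed or filtered while pairing them up, zip also works inside a dict comprehension. A short sketch with illustrative data:

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

# Transform each value while building the dict.
incremented = {k: v + 1 for k, v in zip(keys, values)}

# Keep only the pairs whose value is odd.
odd_only = {k: v for k, v in zip(keys, values) if v % 2}

print(incremented)  # {'a': 2, 'b': 3, 'c': 4}
print(odd_only)     # {'a': 1, 'c': 3}
```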

