Why Is There No Tuple Comprehension in Python

Why is there no tuple comprehension in Python?

You can use a generator expression:

tuple(i for i in (1, 2, 3))

but parentheses were already taken for … generator expressions.
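A minimal sketch of what the parentheses actually produce. The variable names are illustrative only.

```python
import types

# Parentheses around the expression create a generator, not a tuple.
gen = (i for i in (1, 2, 3))
print(isinstance(gen, types.GeneratorType))  # True

# Passing the generator to tuple() consumes it and builds the tuple.
tup = tuple(gen)
print(tup)  # (1, 2, 3)

# The generator is now exhausted, so a second pass yields nothing.
leftover = tuple(gen)
print(leftover)  # ()
```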

Tuple comprehension is generating unexpected output

Tuple comprehension is not a thing. Check this SO post to understand why.

Do this instead:

x = tuple(y for y in range(100) if y%2==0 and y%5==0)

This uses the tuple() constructor to make a tuple from the generator object. You can use this constructor to make a tuple from elements yielded by an iterator or iterable object. So, this code works because generator objects can be used as iterators.
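As a small illustration of that point, tuple() accepts a generator, an explicit iterator, or any other iterable. The variable names below are made up for the example.

```python
# tuple() accepts any iterable: a generator, an iterator, or a container.
from_generator = tuple(y for y in range(100) if y % 2 == 0 and y % 5 == 0)
from_iterator = tuple(iter([1, 2, 3]))
from_string = tuple("abc")

print(from_generator)  # (0, 10, 20, 30, 40, 50, 60, 70, 80, 90)
print(from_iterator)   # (1, 2, 3)
print(from_string)     # ('a', 'b', 'c')
```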

Are Python3.5 tuple comprehension really this limited?

TLDR: If you want a tuple, pass a generator expression to tuple:

{idx: tuple(x for x in range(5)) for idx in range(5)}

There are no "tuple comprehensions" in Python. This:

x for x in range(5)

is a generator expression. Adding parentheses around it is merely used to separate it from other elements. This is the same as in (a + b) * c, which does not involve a tuple either.
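A short sketch of the grouping role of parentheses, using illustrative names: in none of these cases do the parentheses themselves create a tuple.

```python
import types

grouped_number = (1 + 2) * 3        # parentheses only group; the result is an int
generator = (x for x in range(5))   # parentheses only delimit; the result is a generator
one_tuple = (1,)                    # the comma makes the tuple, not the parentheses

print(type(grouped_number).__name__)                # int
print(isinstance(generator, types.GeneratorType))   # True
print(type(one_tuple).__name__)                     # tuple
```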

The * symbol is for iterable packing/unpacking. A generator expression happens to be an iterable, so it can be unpacked. However, there must be something to unpack the iterable into. For example, one can also unpack a list into the elements of an assignment:

*[1, 2]                     # illegal - nothing to unpack into
a, b, c, d = *[1, 2], 3, 4  # legal - unpack into assignment tuple
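The two cases can be checked without crashing the script by compiling the illegal form from a string. This is only a sketch; the "<example>" filename is a placeholder.

```python
# Unpacking needs a target: here, the surrounding comma-separated tuple.
a, b, c, d = *[1, 2], 3, 4
print(a, b, c, d)  # 1 2 3 4

# A bare starred expression has nothing to unpack into, so it is rejected.
try:
    compile("*[1, 2]", "<example>", "exec")
    bare_star_compiles = True
except SyntaxError:
    bare_star_compiles = False
print(bare_star_compiles)  # False
```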

Now, writing *<iterable>, combines * unpacking with a , tuple literal. This is not usable in all situations, though: separating elements may take precedence over creating a tuple. For example, the last , in [*(1, 2), 3] separates list elements, whereas in [(*(1, 2), 3)] it creates a tuple.
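Evaluating both forms makes the difference visible. The names flat and nested are illustrative.

```python
flat = [*(1, 2), 3]      # the last comma separates list elements
nested = [(*(1, 2), 3)]  # the inner parentheses make the comma build a tuple

print(flat)    # [1, 2, 3]
print(nested)  # [(1, 2, 3)]
```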

In a dictionary the , is ambiguous since it is used to separate elements. Compare {1: 1, 2: 2} and note that {1: 2,3} is illegal. For a return statement, unparenthesized unpacking was not allowed in Python 3.5; it is accepted from Python 3.8 onwards.
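A rough way to confirm which dictionary forms parse, using a hypothetical helper named compiles that is not part of any library:

```python
def compiles(source):
    """Return True if source is a valid Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(compiles("{1: 1, 2: 2}"))  # True
print(compiles("{1: 2,3}"))      # False
print(compiles("{1: (2, 3)}"))   # True
```

Adding the parentheses removes the ambiguity, which is why the tuple value must be written as (2, 3) inside a dict display.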

If you want a tuple, you should use () whenever there might be ambiguity - even if Python can handle it, it is difficult to parse for humans otherwise.

When your source is a large expression such as a generator expression, I suggest converting to a tuple explicitly. Compare the following two valid versions of your code for readability:

{idx: tuple(x for x in range(5)) for idx in range(5)}
{idx: (*(x for x in range(5)),) for idx in range(5)}
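Both spellings build the same dictionary, as this quick check suggests. The names explicit and unpacked are illustrative.

```python
explicit = {idx: tuple(x for x in range(5)) for idx in range(5)}
unpacked = {idx: (*(x for x in range(5)),) for idx in range(5)}

print(explicit == unpacked)  # True
print(explicit[0])           # (0, 1, 2, 3, 4)
```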

Note that list, set and dict comprehensions work similarly: they are practically like passing a generator expression to list, set or dict. They mostly serve to avoid looking up list, set or dict in the global namespace.
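A small sketch of that equivalence in terms of results (not of bytecode), with made-up variable names:

```python
squares_list = [n * n for n in range(4)]
squares_set = {n * n for n in range(4)}
squares_dict = {n: n * n for n in range(4)}

# Each comprehension gives the same result as the constructor fed a generator.
print(squares_list == list(n * n for n in range(4)))       # True
print(squares_set == set(n * n for n in range(4)))         # True
print(squares_dict == dict((n, n * n) for n in range(4)))  # True
```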


I feel like this is a bit more of a problem since comprehensions can be important for performance in some situations.

Under the covers, both generator expressions and list/dict/set comprehensions create a short-lived function (at least in the CPython version disassembled here; CPython 3.12 and later inline list, dict and set comprehensions). You should not rely on comprehensions for performance optimisation unless you have profiled and tested them. By default, use whatever is most readable for your use case.

>>> import dis
>>> dis.dis("""[a for a in (1, 2, 3)]""")
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x10f730ed0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_CONST               5 ((1, 2, 3))
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE
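One way to follow the profiling advice is a quick timeit comparison. This is a rough sketch: the function names and sizes are arbitrary choices, and the numbers will vary by machine and Python version, so no expected timings are shown.

```python
from timeit import timeit

def with_comprehension():
    return [a for a in range(1000)]

def with_generator():
    return list(a for a in range(1000))

# Both produce the same list; only the measured time may differ.
comprehension_time = timeit(with_comprehension, number=1000)
generator_time = timeit(with_generator, number=1000)
print(f"comprehension: {comprehension_time:.4f}s")
print(f"generator:     {generator_time:.4f}s")
```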

Can I turn a generator object into a tuple without using tuple() ?

In Python 3.5 and later, you can unpack a generator into a tuple using * with a trailing comma.

Here is an example:

>>> *(i+1 for i in (1,2,3)),
(2, 3, 4)
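In script form, the same idea looks like the sketch below. It also shows a related pitfall: a starred assignment target collects into a list, not a tuple. Variable names are illustrative.

```python
gen = (i + 1 for i in (1, 2, 3))
as_tuple = (*gen,)   # starred expression inside a tuple display builds a tuple
print(as_tuple)      # (2, 3, 4)

gen = (i + 1 for i in (1, 2, 3))
*as_list, = gen      # starred assignment target always collects into a list
print(as_list)       # [2, 3, 4]
```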

Tuple comprehensions and the star splat/unpack operator *

To me, it seems like the second example is also one where a generator
object is created first. Is this correct?

Yes, you're correct; check out the CPython bytecode:

>>> import dis
>>> dis.dis("*(thing for thing in thing),")
  1           0 LOAD_CONST               0 (<code object <genexpr> at 0x7f56e9347ed0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<genexpr>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (thing)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 BUILD_TUPLE_UNPACK       1
             14 POP_TOP
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE
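Opcode names differ between CPython versions, so a less version-dependent check is to look for the nested generator code object among the compiled constants. This is a sketch that assumes CPython; the "<example>" filename is a placeholder.

```python
import types

code = compile("*(thing for thing in thing),", "<example>", "exec")

# Collect the names of nested code objects stored as constants.
nested_names = [
    const.co_name for const in code.co_consts if isinstance(const, types.CodeType)
]
print(nested_names)
```

If the list contains '<genexpr>', the starred form compiled a generator first, just as the tuple() form would.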

Is there any difference between these expressions in terms of what
goes on behind the scenes? In terms of performance? I assume the first
and third could have latency issues while the second could have memory
issues (as is discussed in the linked comments).

My timings suggest the first one is slightly faster, presumably because unpacking via BUILD_TUPLE_UNPACK is more expensive than the tuple() call:

>>> from timeit import timeit
>>> def f1(): tuple(thing for thing in range(100000))
...
>>> def f2(): *(thing for thing in range(100000)),
...
>>> timeit(lambda: f1(), number=100)
0.5535585517063737
>>> timeit(lambda: f2(), number=100)
0.6043887557461858

Comparing the first one and the last, which one is more pythonic?

The first one seems far more readable to me, and it also works across different Python versions.

Why is this if/else list comprehension not working?

Following the generalization you wrote in the question, your list comprehension should be:

c = [bool(z) if z is True or z is False or z == 'TRUE' or z == 'FALSE' else z for z in x]

with output

[True, False, True, True, True, 'VOID']
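The question's input list is not shown here, so the sketch below uses a hypothetical x chosen to reproduce that output. It also highlights why 'FALSE' comes out as True: bool() of any non-empty string is True.

```python
# Hypothetical input; the original question's list is not shown here.
x = [True, False, 'TRUE', 'FALSE', 'TRUE', 'VOID']

c = [bool(z) if z is True or z is False or z == 'TRUE' or z == 'FALSE' else z for z in x]
print(c)  # [True, False, True, True, True, 'VOID']

# bool() of any non-empty string is True, including 'FALSE'.
print(bool('FALSE'))  # True
```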

