Are Tuples More Efficient Than Lists in Python?

Are tuples more efficient than lists in Python?

The dis module disassembles the bytecode for a function and is useful for seeing the difference between tuples and lists.

In this case, you can see that accessing an element generates identical code, but that constructing a tuple of constants is much faster than constructing the equivalent list: the tuple is loaded as a single prebuilt constant, while the list has to be assembled element by element at run time.

>>> def a():
...     x=[1,2,3,4,5]
...     y=x[2]
...
>>> def b():
...     x=(1,2,3,4,5)
...     y=x[2]
...
>>> import dis
>>> dis.dis(a)
  2           0 LOAD_CONST               1 (1)
              3 LOAD_CONST               2 (2)
              6 LOAD_CONST               3 (3)
              9 LOAD_CONST               4 (4)
             12 LOAD_CONST               5 (5)
             15 BUILD_LIST               5
             18 STORE_FAST               0 (x)

  3          21 LOAD_FAST                0 (x)
             24 LOAD_CONST               2 (2)
             27 BINARY_SUBSCR
             28 STORE_FAST               1 (y)
             31 LOAD_CONST               0 (None)
             34 RETURN_VALUE
>>> dis.dis(b)
  2           0 LOAD_CONST               6 ((1, 2, 3, 4, 5))
              3 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)
              9 LOAD_CONST               2 (2)
             12 BINARY_SUBSCR
             13 STORE_FAST               1 (y)
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
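
On recent CPython versions the opcode names and offsets differ, but the pattern is the same: the tuple still collapses to a single LOAD_CONST, while the list still has to be built as a fresh object at run time. A quick check you can run on any modern version (using lambdas in place of the functions above):

import dis

# The list must be constructed anew on every call:
dis.dis(lambda: [1, 2, 3, 4, 5])

# The tuple is loaded as a single prebuilt constant:
dis.dis(lambda: (1, 2, 3, 4, 5))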

Why are tuples faster than lists in Python?

The reported "speed of construction" ratio only holds for constant tuples (ones whose items are expressed by literals). Observe carefully (and repeat on your machine -- you just need to type the commands at a shell/command window!)...:

$ python3.1 -mtimeit -s'x,y,z=1,2,3' '[x,y,z]'
1000000 loops, best of 3: 0.379 usec per loop
$ python3.1 -mtimeit '[1,2,3]'
1000000 loops, best of 3: 0.413 usec per loop

$ python3.1 -mtimeit -s'x,y,z=1,2,3' '(x,y,z)'
10000000 loops, best of 3: 0.174 usec per loop
$ python3.1 -mtimeit '(1,2,3)'
10000000 loops, best of 3: 0.0602 usec per loop

$ python2.6 -mtimeit -s'x,y,z=1,2,3' '[x,y,z]'
1000000 loops, best of 3: 0.352 usec per loop
$ python2.6 -mtimeit '[1,2,3]'
1000000 loops, best of 3: 0.358 usec per loop

$ python2.6 -mtimeit -s'x,y,z=1,2,3' '(x,y,z)'
10000000 loops, best of 3: 0.157 usec per loop
$ python2.6 -mtimeit '(1,2,3)'
10000000 loops, best of 3: 0.0527 usec per loop

I didn't do the measurements on 3.0 because, of course, I don't have it around -- it's totally obsolete, and there is absolutely no reason to keep it around, since 3.1 is superior to it in every way. (Python 2.7, if you can upgrade to it, measures as almost 20% faster than 2.6 on each task -- and 2.6, as you see, is faster than 3.1 -- so, if you care seriously about performance, Python 2.7 is really the only release you should be going for!)

Anyway, the key point here is that, in each Python release, building a list out of constant literals is about the same speed as, or slightly slower than, building it out of values referenced by variables; but tuples behave very differently: building a tuple out of constant literals is typically three times as fast as building it out of values referenced by variables! You may wonder how this can be, right?-)

Answer: a tuple made out of constant literals can easily be identified by the Python compiler as being one, immutable constant literal itself: so it's essentially built just once, when the compiler turns the source into bytecodes, and stashed away in the "constants table" of the relevant function or module. When those bytecodes execute, they just need to recover the pre-built constant tuple -- hey presto!-)

This easy optimization cannot be applied to lists, because a list is a mutable object, so it's crucial that, if the same expression such as [1, 2, 3] executes twice (in a loop -- the timeit module makes the loop on your behalf;-), a fresh new list object is constructed anew each time -- and that construction (like the construction of a tuple when the compiler cannot trivially identify it as a compile-time constant and immutable object) does take a little while.
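
You can watch this stashing-away in action in CPython: a function that returns a constant tuple hands back the very same object on every call, while a list literal yields a fresh object each time. A minimal sketch:

def f():
    return (1, 2, 3)   # folded into f's constants table at compile time

def g():
    return [1, 2, 3]   # rebuilt from scratch on every call

print(f() is f())   # True on CPython: the same cached tuple object
print(g() is g())   # False: two distinct list objects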

That being said, tuple construction (when both constructions actually have to occur) is still about twice as fast as list construction -- and that discrepancy can be explained by the tuple's sheer simplicity, which other answers have mentioned repeatedly. But that simplicity does not account for a speedup of six times or more, which is what you observe if you only compare the construction of lists and tuples with simple constant literals as their items!-)

Why is extending with a single-value tuple more efficient than appending a number?

The creation of the tuple (1,) is done once, at compile time, by the compiler; the list, on the other hand, is always created at run time. Look at the dis.dis output:

>>> import dis
>>> dis.dis('a.extend((1,))')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 ((1,))
              6 CALL_METHOD              1
              8 RETURN_VALUE
>>> dis.dis('a.extend([1])')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 (1)
              6 BUILD_LIST               1
              8 CALL_METHOD              1
             10 RETURN_VALUE

Notice that the tuple version takes fewer bytecode instructions and merely does a LOAD_CONST of the prebuilt (1,). For the list, on the other hand, BUILD_LIST is executed (after a LOAD_CONST of 1).

Note, you can access these constants on the code object:

>>> code = compile('a.extend((1,))', '', 'eval')
>>> code
<code object <module> at 0x10e91e0e0, file "", line 1>
>>> code.co_consts
((1,),)

Finally, as to why += is faster than .extend, well, again if you look at the bytecode:

>>> dis.dis('a += b')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 INPLACE_ADD
              6 STORE_NAME               0 (a)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> dis.dis('a.extend(b)')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_NAME                2 (b)
              6 CALL_METHOD              1
              8 RETURN_VALUE

You'll notice that .extend requires first resolving the method (an attribute lookup, which takes extra time). Using the operator, on the other hand, has its own bytecode, INPLACE_ADD, so everything is pushed down into the C layer (plus, magic methods skip instance namespaces and a bunch of other hoopla, being looked up directly on the class).
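
A rough timing sketch of the three spellings (absolute numbers vary by machine and Python version; the point is only the relative ordering you'd expect from the bytecode above):

from timeit import timeit

print(timeit('a += (1,)', setup='a = []'))       # constant tuple + INPLACE_ADD
print(timeit('a.extend((1,))', setup='a = []'))  # extra method lookup per call
print(timeit('a.extend([1])', setup='a = []'))   # and a fresh list built per call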

List vs tuple, when to use each?

There's a strong culture of tuples being for heterogeneous collections, similar to what you'd use structs for in C, and lists being for homogeneous collections, similar to what you'd use arrays for. But I've never quite squared this with the mutability issue mentioned in the other answers. Mutability has teeth to it (you actually can't change a tuple), while homogeneity is not enforced, and so seems to be a much less interesting distinction.

Are tuples faster than lists because they are hashable?

No, being hashable has nothing to do with being faster.

As for "in order to access an element from a collection that is hashable it requires constant time":

You're getting things backwards. The time to look up a hashable element in a collection that uses a hash table (like a set) is constant. But that's about the elements being hashable, not the collection; and it's about the collection using a hash table instead of an array; and it's about looking elements up by value instead of by index.

Looking up a value in an array by index—whether the value or the array is hashable or not—takes constant time. Searching an array by value takes linear time. (Unless, e.g., it's sorted and you search by bisecting.)
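
To make the three conflated operations concrete (indexing, searching by value, and hash-based membership):

data = list(range(1000))
data_set = set(data)

x = data[500]            # index into an array: O(1), hashability irrelevant
found = 999 in data      # search an array by value: O(n) linear scan
found = 999 in data_set  # membership in a hash table: O(1) on average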


Your teacher is only partly right—but then they may have been simplifying things to avoid getting into gory details.

There are three reasons why tuples are faster than lists for some operations.

But it's worth noting that these are usually pretty small differences, and usually hard to predict.[1] Almost always, you just want to use whichever one makes more sense; and if you occasionally do find a bottleneck where a few percent would make a difference, pull it out, timeit both versions, and see.


First, there are some operations that are optimized differently for the two types. Of course this is different for different implementations and even different versions of the same implementation, but a few examples from CPython 3.7:

  • When sorting a list of tuples, there's a special unsafe_tuple_compare that isn't applied to lists.
  • When comparing two lists for == or !=, there's a special is test to short-circuit the comparison, which sometimes speeds things up a lot, but otherwise slows things down a little. Benchmarking a whole mess of code showed that this was worth doing for lists, but not for tuples.

Mutability generally doesn't enter into it for these choices; it's more about how the two types are typically used (lists are often homogeneously typed but arbitrary-length, while tuples are often heterogeneously typed and of consistent length). However, it's not quite irrelevant—e.g., the fact that a list can be made to contain itself (because lists are mutable) and a tuple can't (because tuples aren't) prevents at least one minor optimization from being applied to lists.[2]


Second, two equal tuple constants in the same compilation unit can be merged into the same value, and at least CPython and PyPy usually do so. This can speed some things up (if nothing else, you get better cache locality when there's less data to cache, but sometimes it means bigger savings, like being able to use is tests).

And this one is about mutability: the compiler is only allowed to merge equal values if it knows they're immutable.
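
On CPython this merging is easy to observe within a single function (a sketch; constant merging is an implementation detail, not a language guarantee):

def pairs():
    a = (1, 2, 3)
    b = (1, 2, 3)   # merged with a: one shared entry in pairs.__code__.co_consts
    c = [1, 2, 3]
    d = [1, 2, 3]   # lists are mutable, so each must be a distinct object
    return (a is b), (c is d)

print(pairs())  # (True, False) on current CPython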


Third, for the same number of elements, lists are bigger. Allocating more memory, using more cache lines, etc., slows things down a little.

And this one is also about mutability: a list has to have room to grow at the end; otherwise, calling append N times would take N**2 total time. Tuples never grow, so they don't need the slack.


[1] There are a handful of cases that come up often enough in certain kinds of problems that some people who deal with those problems all the time learn them and remember them. Occasionally, you'll see an answer on an optimization question on Stack Overflow where someone chimes in, "this would probably be about 3% faster with a tuple instead of a list", and they're usually right.

[2] Also, I could imagine a case where a JIT compiler like the one in PyPy could speed things up better with a tuple. If you run the same code a million times in a row with the same values, you're going to get a million copies of the same answer—unless a value changes. If the value is a tuple of two objects, PyPy can add guards to see if either of those objects changes, and otherwise just reuse the last value. If it's a list of two objects, PyPy would have to add guards to the two objects and the list, which is 50% more checking. Whether this actually happens, I have no idea; every time I try to trace through how a PyPy optimization works and generalize from there, I turn out to be wrong, and I just end up concluding that Armin Rigo is a wizard.

Why do tuples take less space in memory than lists?

I assume you're using a 64-bit build of CPython (I got the same results on my 64-bit CPython 2.7). There could be differences in other Python implementations, or with a 32-bit Python.

Regardless of the implementation, lists are variable-sized while tuples are fixed-size.

So tuples can store their item pointers directly inside the tuple struct; lists, on the other hand, need a layer of indirection (the list struct stores a pointer to a separately allocated array of item pointers). That extra layer costs one pointer, which on 64-bit systems is 64 bits, hence 8 bytes.

But there's another thing that lists do: they over-allocate. Otherwise list.append would always be an O(n) operation; to make it amortized O(1) (much faster!), lists over-allocate. But then they have to keep track of both the allocated size and the filled size (tuples only need to store one size, because allocated and filled size are always identical). That means each list has to store another "size" field, which on 64-bit systems is a 64-bit integer, again 8 bytes.

So lists need at least 16 bytes more memory than tuples. Why did I say "at least"? Because of the over-allocation. Over-allocation means it allocates more space than needed. However, the amount of over-allocation depends on "how" you create the list and the append/deletion history:

>>> l = [1,2,3]
>>> l.__sizeof__()
64
>>> l.append(4) # triggers re-allocation (with over-allocation), because the original list is full
>>> l.__sizeof__()
96

>>> l = []
>>> l.__sizeof__()
40
>>> l.append(1) # re-allocation with over-allocation
>>> l.__sizeof__()
72
>>> l.append(2) # no re-alloc
>>> l.append(3) # no re-alloc
>>> l.__sizeof__()
72
>>> l.append(4) # still has room, so no over-allocation needed (yet)
>>> l.__sizeof__()
72
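
You can watch the over-allocation happen by appending in a loop and printing whenever the reported size jumps (the exact growth points and sizes are CPython implementation details, so your numbers may differ):

l = []
last = l.__sizeof__()
print(0, last)
for i in range(32):
    l.append(i)
    size = l.__sizeof__()
    if size != last:  # a re-allocation (with over-allocation) just happened
        print(len(l), size)
        last = size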

Images

I decided to create some images to accompany the explanation above. Maybe they are helpful:

This is how your example is (schematically) stored in memory. I highlighted the differences with red (free-hand) circles:

[Image: schematic memory layout of the list and the tuple, with the differences circled in red]

That's actually just an approximation, because int objects are also Python objects and CPython even reuses small integers, so a more accurate (although less readable) representation of the objects in memory would be:

[Image: a more accurate layout, showing that the small int objects are shared between the containers]

Useful links:

  • tuple struct in CPython repository for Python 2.7
  • list struct in CPython repository for Python 2.7
  • int struct in CPython repository for Python 2.7

Note that __sizeof__ doesn't really return the "correct" size! It only returns the size of the object itself, without any garbage-collector overhead. However, when you use sys.getsizeof the result is different:

>>> import sys
>>> l = [1,2,3]
>>> t = (1, 2, 3)
>>> sys.getsizeof(l)
88
>>> sys.getsizeof(t)
72

There are 24 "extra" bytes. These are real: that's the garbage-collector overhead that isn't accounted for in the __sizeof__ method. You're generally not supposed to call magic methods directly anyway - use the functions that know how to handle them, in this case sys.getsizeof (which adds the GC overhead to the value returned from __sizeof__).

What's the difference between lists and tuples?

Apart from tuples being immutable there is also a semantic distinction that should guide their usage. Tuples are heterogeneous data structures (i.e., their entries have different meanings), while lists are homogeneous sequences. Tuples have structure, lists have order.

Using this distinction makes code more explicit and understandable.

One example would be pairs of page and line number to reference locations in a book, e.g.:

my_location = (42, 11)  # page number, line number

You can then use this as a key in a dictionary to store notes on locations. A list on the other hand could be used to store multiple locations. Naturally one might want to add or remove locations from the list, so it makes sense that lists are mutable. On the other hand it doesn't make sense to add or remove items from an existing location - hence tuples are immutable.
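
For instance, the immutable (and therefore hashable) location tuple works as a dictionary key, while a mutable list naturally collects locations:

notes = {}
my_location = (42, 11)            # (page, line): fixed structure, hashable
notes[my_location] = "remember this passage"

locations = [(42, 11), (101, 7)]  # a growable list of homogeneous items
locations.append((150, 3))        # adding a location makes sense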

There might be situations where you want to change items within an existing location tuple, for example when iterating through the lines of a page. But tuple immutability forces you to create a new location tuple for each new value. This seems inconvenient on the face of it, but using immutable data like this is a cornerstone of value types and functional programming techniques, which can have substantial advantages.

There are some interesting articles on this issue, e.g. "Python Tuples are Not Just Constant Lists" or "Understanding tuples vs. lists in Python". The official Python documentation also mentions this:

"Tuples are immutable, and usually contain a heterogeneous sequence ...".

In a statically typed language like Haskell the values in a tuple generally have different types and the length of the tuple must be fixed. In a list the values all have the same type and the length is not fixed. So the difference is very obvious.

Finally there is the namedtuple in Python, which makes sense because a tuple is already supposed to have structure. This underlines the idea that tuples are a light-weight alternative to classes and instances.
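
A short sketch of that idea (the Location name here is just for illustration):

from collections import namedtuple

Location = namedtuple('Location', ['page', 'line'])
loc = Location(page=42, line=11)
print(loc.page, loc.line)  # fields accessed by name instead of bare indices
print(loc == (42, 11))     # True: it's still an ordinary tuple underneath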

List lookup faster than tuple?

Tuples are primarily faster for construction, not for access.

Tuples should be slightly faster to access: they require one less indirection. However, I believe the main benefit is that they don't require a second allocation when constructing the list.
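
If you want to measure access on your own build, a minimal timeit sketch (expect the difference, in either direction, to be tiny):

from timeit import timeit

print(timeit('x[500]', setup='x = list(range(1000))'))
print(timeit('x[500]', setup='x = tuple(range(1000))'))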

The reason lists are slightly faster for lookups is that CPython's evaluation loop has a special optimization for them (this excerpt is from the BINARY_SUBSCR case in Python 2's ceval.c):

case BINARY_SUBSCR:
    w = POP();
    v = TOP();
    if (PyList_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: list[int] */
        Py_ssize_t i = PyInt_AsSsize_t(w);
        if (i < 0)
            i += PyList_GET_SIZE(v);
        if (i >= 0 && i < PyList_GET_SIZE(v)) {
            x = PyList_GET_ITEM(v, i);
            Py_INCREF(x);
        }

With this optimization commented out, tuples are very slightly faster than lists (by about 4%).

Note that adding a separate special-case optimization for tuples here isn't necessarily a good idea. Every special case like this in the main body of the VM loop increases the code size, which hurts cache locality, and it means every other type of lookup requires an extra branch.

Speed test among set, list and tuple in Python gives surprising results

There are many problems with this test suite.

[range(1000)] isn't equivalent to the other two declarations. It makes a list with one element, and that single element points to the range. getsizeof is not recursive, so it only gives the size of the outer object. This can be illustrated by creating single-element lists wrapping ranges of different sizes and noticing that the getsizeof call always gives the same result.

If you use list(range(1000)), you'll get a reasonable result that's about the same size as the tuple. I'm not sure what information is gained by making these containers hold 1000 items, though -- why not compare the sizes of the three empty containers? Even then, this is basically an implementation-dependent test that doesn't really have much bearing on how you might write a Python program, as far as I can tell. Sure, set is a bit larger, as you'd expect (hash-based structures generally have more overhead than lists), and that varies from version to version.
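
Both points are easy to demonstrate (exact byte counts vary by Python version and platform):

import sys

# One-element lists wrapping ranges of very different sizes report the same
# shallow size, because getsizeof doesn't follow the inner object:
print(sys.getsizeof([range(10)]))
print(sys.getsizeof([range(10**6)]))  # same number as the line above

# Materializing the elements gives a comparable measurement:
print(sys.getsizeof(list(range(1000))))
print(sys.getsizeof(tuple(range(1000))))
print(sys.getsizeof(set(range(1000))))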

As for the "create" test, consider set(i for i in range(1000)). In no way does this test the time it takes to create a set, because most of the time is spent creating and running the generator that's being passed as a parameter. As with the last test, I'm not sure what this proves even if the test were made fair (i.e. by passing range(1000) to the initializers directly instead of via a generator or comprehension). As with the "size" tests, you could just call the initializers with no arguments, but all that shows is that creation times are practically the same: there's some function-call and object-allocation overhead, which is implementation-specific.
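
To see how much of the "create" test is really generator overhead, compare passing a generator against feeding range to the constructor directly (a sketch; numbers vary):

from timeit import timeit

print(timeit('set(i for i in range(1000))', number=10_000))  # generator dominates
print(timeit('set(range(1000))', number=10_000))             # fairer construction test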

Lastly, the speed tests for lookup operations:

for i in Set_Test:
    if i == 6:
        break

This hardcodes a best-case scenario. List and tuple lookups perform a linear search, so the target is always found within 7 comparisons. That is going to be nearly identical to a set lookup, which is O(1) but requires some complicated hashing operations that add overhead. The test should instead use random targets, so that the distribution regularly hits the lists and tuples with average- and worst-case scenarios. Done correctly, the set should outperform the lists and tuples.
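
A fairer version of the lookup test would look something like this sketch, with random targets (including misses) so the linear scans pay their average cost:

from timeit import timeit

setup = '''
import random
random.seed(0)
data_list = list(range(1000))
data_tuple = tuple(data_list)
data_set = set(data_list)
targets = [random.randrange(2000) for _ in range(1000)]  # roughly half are misses
'''
print(timeit('for t in targets: t in data_list', setup=setup, number=100))
print(timeit('for t in targets: t in data_tuple', setup=setup, number=100))
print(timeit('for t in targets: t in data_set', setup=setup, number=100))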

Furthermore, a statement like "a set is faster than a list" has almost no meaning--data structures can't be compared like this, and a word like "faster" speaks to specific run-time conditions that are highly variable and case-specific. Comparing the data structures requires comparison of their operations' time complexities which help describe their fitness for some purpose.

To summarize: the first two tests aren't worth performing even if written correctly and the last test will show that set lookups are O(1) while list/tuple lookups are O(n) if they're made fair using random target items.


