Why Are Python Strings and Tuples Made Immutable

Why are Python strings and tuples made immutable?

One is performance: knowing that a
string is immutable makes it easy to
lay it out at construction time, with
fixed and unchanging storage
requirements. This is also one of the
reasons for the distinction between
tuples and lists. Immutability also
allows the implementation to safely
reuse string objects. For example, the
CPython implementation uses
pre-allocated objects for
single-character strings, and usually
returns the original string for string
operations that don't change the
content.
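A minimal sketch of what this looks like from Python code. The variable names are only illustrative, and the object-reuse check at the end reflects a CPython implementation detail rather than a language guarantee, so no result is asserted for it.

```python
s = "eight"

# Strings expose no mutating operations: item assignment raises TypeError.
try:
    s[0] = "E"
    error = None
except TypeError as exc:
    error = str(exc)

# "Changing" a string always builds a new object; the original is untouched.
t = s.upper()

# CPython detail (not guaranteed by the language): an operation that changes
# nothing often hands back the very same object instead of a copy.
same_object = s.strip() is s

print(s, t, error, same_object)
```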

The other is that strings in Python
are considered as "elemental" as
numbers. No amount of activity will
change the value 8 to anything else,
and in Python, no amount of activity
will change the string “eight” to
anything else.

https://web.archive.org/web/20201031092707/http://effbot.org/pyfaq/why-are-python-strings-immutable.htm

What makes Python tuples immutable? How are they implemented in memory?

There's nothing special about the implementation or memory layout of tuples that makes them immutable. They're immutable because they just don't have any mutator operations.

Lists are mutable because the list class implements operations like append and __setitem__ that mutate lists. Such operations have to be deliberately included in the implementation; they don't come into existence automatically. If list didn't have mutator operations, lists would be immutable too.

At the level of the C implementation, the C code that implements tuples has to be able to write to a tuple's memory, and at that level, the data structures are mutable. The interface the implementation presents to Python code is immutable, though.

You can't do the same with a class you implement in Python because you can't get a firm separation between implementation and interface. Such a separation arises automatically when implementing a Python class in C due to the design of the Python C API, but without anything like a private access modifier, you can't do the same in Python.
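The points above can be sketched as follows. The first part checks that tuple simply lacks the mutator methods list defines. The second part uses a hypothetical FrozenPoint class (not from any library) to show that a pure-Python "immutable" class can be bypassed, because Python has no private access modifier.

```python
# Mutator operations that list implements and tuple simply does not have.
mutators = ["append", "extend", "insert", "pop", "remove", "sort",
            "__setitem__", "__delitem__"]
missing_on_tuple = [name for name in mutators if not hasattr(tuple, name)]
present_on_list = [name for name in mutators if hasattr(list, name)]


class FrozenPoint:
    """Hypothetical class that tries to forbid attribute assignment."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        # The class itself must bypass its own guard to initialise the fields.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)

    def __setattr__(self, name, value):
        raise AttributeError("FrozenPoint is read-only")


p = FrozenPoint(1, 2)

# Ordinary assignment is blocked by the overridden __setattr__ ...
try:
    p.x = 10
    blocked = False
except AttributeError:
    blocked = True

# ... but any caller can use the same back door the class uses internally.
object.__setattr__(p, "x", 10)
```

After the last line p.x is 10, so the "immutability" of FrozenPoint is only a convention. A tuple offers no comparable back door to Python code.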

Since tuples are immutable, why does slicing them make a copy instead of a view?

By view, are you thinking of something equivalent to what numpy does? I'm familiar with how and why numpy does that.

A numpy array is an object with shape and dtype information, plus a data buffer. You can see this information in the __array_interface__ property. A view is a new numpy object, with its own shape attribute, but with a new data buffer pointer that points to someplace in the source buffer. It also has a flag that says "I don't own the buffer". numpy also maintains its own reference count, so the data buffer is not destroyed if the original (owner) array is deleted (and garbage collected).

This use of views can be a big time saver, especially with very large arrays (questions about memory errors are common on SO). Views also allow a different dtype, so a data buffer can be viewed as 4-byte integers, or as 1-byte characters, etc.

How would this apply to tuples? My guess is that it would require a lot of extra baggage. A tuple consists of a fixed set of object pointers - probably a C array. A view would use the same array, but with its own start and end markers (pointers and/or lengths). What about sharing flags? Garbage collection?
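To make the "extra baggage" concrete, here is a rough pure-Python sketch of what a tuple view might look like. TupleView is a hypothetical class invented for this illustration; nothing like it exists in the standard library, and a real implementation would live in C.

```python
from collections.abc import Sequence


class TupleView(Sequence):
    """Hypothetical read-only window onto a tuple (step 1 only)."""

    def __init__(self, base, start=0, stop=None):
        self._base = base  # the view keeps the whole base tuple alive
        # Clamp the start/stop markers to the bounds of the base tuple.
        self._start, self._stop, _ = slice(start, stop).indices(len(base))

    def __len__(self):
        return max(0, self._stop - self._start)

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch only supports step 1")
            # Slicing a view yields another view on the same base.
            return TupleView(self._base, self._start + start, self._start + stop)
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("TupleView index out of range")
        return self._base[self._start + index]


big = tuple(range(1000))
copied = big[10:20]            # built-in slicing copies ten pointers
view = TupleView(big, 10, 20)  # the view copies nothing
```

Note the trade-off: as long as view exists, all 1000 elements of big stay alive, whereas copied only keeps its own ten elements referenced.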

And what's the typical size and use of tuples? A common use of tuples is to pass arguments to a function. My guess is that a majority of tuples in a typical Python run are small - 0, 1 or 2 elements. Slices are allowed, but are they very common? On small tuples or very large ones?

Would there be any unintended consequences to making tuple slices views (in the numpy sense)? The distinction between views and copies is one of the harder things for numpy users to grasp. Since a tuple is supposed to be immutable - that is the pointers in the tuple cannot be changed - it is possible that implementing views would be invisible to users. But still I wonder.

It may make the most sense to try this idea on a branch of PyPy, unless you really want to dig into the CPython code. Or try it as a custom class with Cython.

Why is a tuple not mutable in Python?

A few reasons:

  • Mutable objects like lists cannot be used as dictionary keys or set members in Python, since they are not hashable. If lists were given __hash__ methods based on their contents, the values returned could change as the contents change, which violates the contract for hash values.
  • If Python only had mutable sequences, constructors which accepted sequences would often need to copy them to ensure that the sequences couldn't be modified by other code. Constructors can avoid defensive copying by only accepting tuples. Better yet, they can pass sequence arguments through the tuple() constructor, which will copy only when necessary.
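Both reasons can be sketched briefly. Route is a hypothetical class used only for illustration. The fact that tuple() returns an existing tuple unchanged is CPython behaviour, not a documented guarantee.

```python
# Reason 1: tuples of hashable items work as dictionary keys; lists do not.
point = (1, 2)
distances = {point: 0.0}

try:
    {[1, 2]: 0.0}
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)


# Reason 2: passing the argument through tuple() copies only when needed.
class Route:
    """Hypothetical class that stores an unmodifiable sequence of stops."""

    def __init__(self, stops):
        self.stops = tuple(stops)


original = ("a", "b")
shared = Route(original)   # already a tuple: CPython reuses the same object

mutable = ["a", "b"]
copied = Route(mutable)    # a list: copied into a fresh tuple
mutable.append("c")        # later changes do not leak into the Route
```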

Why are there immutable objects in Python?

Your example is incorrect. l did not change outside the scope of foo(). i and l inside of foo() are new names that point to new objects.

Python 2.7.10 (default, Aug 22 2015, 20:33:39)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def foo(i, l):
...     i = 5       # this creates a local name i that points to 5
...     l = [1, 2]  # this creates a local name l that points to [1, 2]
...
>>> i = 10
>>> l = [1, 2, 3]
>>> print(i)
10
>>> print(l)
[1, 2, 3]
>>> foo(i, l)
>>> print(i)
10
>>> print(l)
[1, 2, 3]

Now if you changed foo() to mutate l, that is a different story:

>>> def foo(i, l):
...     l.append(10)
...
>>> foo(i, l)
>>> print(l)
[1, 2, 3, 10]

The Python 3 example behaves the same:

Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 23 2015, 02:52:03)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def foo(i, l):
...     i = 5       # this rebinds the local name i to 5
...     l = [1, 2]  # this rebinds the local name l to a new list; the caller's list is untouched
...
>>> i = 10
>>> l = [1, 2, 3]
>>> print(i)
10
>>> print(l)
[1, 2, 3]
>>> foo(i, l)
>>> print(i)
10
>>> print(l)
[1, 2, 3]

Tuple vs String vs frozenset. Immutable objects and the number of copies in memory

Your sentence "I've read that one of the reasons for this is because strings are immutable, so one copy in memory will be enough." is correct, but it does not hold all the time.
For example, if you do the same with the string
"dgjudfigur89tyur9egjr9ivr89egre8frejf9reimfkldsmgoifsgjurt89igjkmrt0ivmkrt8g,rt89gjtrt"
it won't be the same object (at least on my Python version).
The same phenomenon can be reproduced with integers, where 256 will be the same object but 257 won't.
It has to do with the way Python caches objects: it only saves "simple" objects. Each type has its own criteria. For strings it is containing only certain characters; for integers it is falling within a certain range.
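A small sketch of how to observe this. The identity results for integers and for the two joined strings depend on the interpreter and version, so they are computed but not asserted here. Only sys.intern, which is a documented standard-library function, gives a firm guarantee of sharing.

```python
import sys

# Build the values at run time so the compiler cannot merge equal literals.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

small_shared = small_a is small_b   # typically True on CPython (small-int cache)
large_shared = large_a is large_b   # typically False on CPython

# Two equal strings built at run time are usually distinct objects ...
parts = ["immut", "able"]
s1 = "".join(parts)
s2 = "".join(parts)
equal_value = s1 == s2

# ... unless they are explicitly interned, which returns one shared object.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
interned_shared = i1 is i2

print(small_shared, large_shared, s1 is s2, interned_shared)
```

The takeaway is to compare immutable values with == and not rely on is, since sharing is an optimization rather than a promise.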

Why are integers immutable in Python?

Making integers mutable would be very counter-intuitive to the way we are used to working with them.

Consider this code fragment:

a = 1      # assign 1 to a
b = a + 2  # assign 3 to b, leave a at 1

After these assignments are executed we expect a to have the value 1 and b to have the value 3. The addition operation is creating a new integer value from the integer stored in a and an instance of the integer 2.
If the addition operation just took the integer at a and just mutated it then both a and b would have the value 3.

So we expect arithmetic operations to create new values for their results - not to mutate their input parameters.
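The same expectation holds for augmented assignment. In this short sketch (the name alias is only illustrative), a += 2 builds a new integer and rebinds a, so another name bound to the old value is unaffected.

```python
a = 1
alias = a   # alias and a both refer to the integer 1

a += 2      # creates the integer 3 and rebinds a; the integer 1 is not altered

print(a, alias)
```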

However, there are cases where mutating a data structure is more convenient and more efficient. Let's suppose for the moment that list.append(x) did not modify list but returned a new copy of list with x appended.
Then a function like this:

def foo():
    nums = []
    for x in range(0, 10):
        nums.append(x)
    return nums

would just return the empty list. (Remember - here nums.append(x) doesn't alter nums - it returns a new list with x appended. But this new list isn't saved anywhere.)

We would have to write the foo routine like this:

def foo():
    nums = []
    for x in range(0, 10):
        nums = nums.append(x)
    return nums

(This, in fact, is very similar to the situation with Python strings up until about 2.6 or perhaps 2.5.)

Moreover, every time we assign nums = nums.append(x) we would be copying a list that is increasing in size resulting in quadratic behavior.
For those reasons we make lists mutable objects.
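The copying cost can be seen with real Python today by building the same sequence with an immutable tuple and with a mutable list. The function names build_immutable and build_mutable are made up for this sketch.

```python
def build_immutable(n):
    """Grow a tuple by concatenation: every step copies all existing items."""
    nums = ()
    for x in range(n):
        nums = nums + (x,)   # new tuple each time, O(len(nums)) work per step
    return nums


def build_mutable(n):
    """Grow a list in place: each append is amortized constant time."""
    nums = []
    for x in range(n):
        nums.append(x)
    return nums


as_tuple = build_immutable(10)
as_list = build_mutable(10)
```

Both produce the same ten values, but the tuple version does work proportional to n squared overall, while the list version does work proportional to n.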

A consequence of making lists mutable is that after these statements:

a = [1,2,3]
b = a
a.append(4)

the list b has changed to [1,2,3,4]. This is something that we live with even though it still trips us up now and then.
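When the sharing is not wanted, the usual remedy is an explicit copy. A brief sketch, where c is an extra name added for illustration:

```python
a = [1, 2, 3]
b = a          # b is another name for the same list object
c = list(a)    # c is a separate (shallow) copy

a.append(4)

print(b)  # sees the appended element, because b is a
print(c)  # unchanged, because c is its own list
```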


