Two Variables in Python Have Same Id, But Not Lists or Tuples

Immutable objects don't automatically share the same id; as a matter of fact, that is not true for any kind of object that you define separately. Generally speaking, every time you define an object in Python, you create a new object with a new identity. However, mostly for the sake of optimization, CPython makes exceptions for small integers (between -5 and 256) and for interned strings of limited length (usually fewer than 20 characters)*, which are singletons and share one id (actually one object with multiple references). The exact thresholds are CPython implementation details and have changed between versions, so the results below may differ on your interpreter. You can check this as follows:

>>> 30 is (20 + 10)
True
>>> 300 is (200 + 100)
False
>>> 'aa' * 2 is 'a' * 4
True
>>> 'aa' * 20 is 'a' * 40
False

And for a custom object:

>>> class A:
...     pass
...
>>> A() is A() # Every time you create an instance you'll have a new instance with new identity
False

Also note that the is operator will check the object's identity, not the value. If you want to check the value you should use ==:

>>> 300 == 3*100
True
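
A minimal sketch of the difference between the two operators. The helper name compare is made up for this illustration, and the integers are built with int() at run time so the compiler cannot fold them into one shared constant. Only the == results are guaranteed by the language; what `is` returns for two equal integers is up to the interpreter.

```python
def compare(x, y):
    """Return (equal_by_value, same_object) for two objects."""
    return (x == y, x is y)

# Built at run time so the compiler cannot merge them into one constant.
a = int("300")
b = int("300")

print(compare(a, b))      # value comparison is True; identity depends on the interpreter
print(compare(a, a))      # (True, True): a name always refers to the same object as itself
print(compare([1], [1]))  # (True, False): two live lists are always distinct objects
```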

And since there is no such optimization or interning rule for tuples, or for any mutable type for that matter, two equal tuples of any size that you define separately (here, as separate statements at the interactive prompt) get their own identities and are therefore different objects:

>>> a = (1,)
>>> b = (1,)
>>>
>>> a is b
False

It's also worth mentioning that the rules for "singleton integers" and "interned strings" hold even when the values are defined inside a container such as a tuple:

>>> a = (100, 700, 400)
>>>
>>> b = (100, 700, 400)
>>>
>>> a[0] is b[0]
True
>>> a[1] is b[1]
False


* A good and detailed article on this: http://guilload.com/python-string-interning/

Why don't tuples get the same ID when assigned the same values?

You created new tuple objects. That they have the same contents doesn't mean that they'll be the exact same tuple objects in memory.

Immutability doesn't mean that creating the same value will create the same object. You never mutated the old (1, 2) tuples, and your new (1, 2) tuples are not mutable either.

CPython does keep a cache of reusable tuple objects (Python goes through a lot of small tuples during a typical program, and the cache saves it from creating new objects all the time), but that's an implementation detail you can't rely on. This cache is the reason you saw the same ids again for tuples of length two. See How is tuple implemented in CPython? if you want to know how the cache is implemented.

Furthermore, in CPython, id() is the memory location of the object, and Python is free to re-use memory locations once old objects have been freed. This is clearly documented:

This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

It is always possible to see the same id() value for new objects. Sometimes this means you still have the same object (as is the case for small integers, tuples, or certain kinds of string object); sometimes it is just that the interpreter reused the same location in memory. You should never rely on this: these are implementation details for performance purposes and are subject to change.
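
A small sketch of the lifetime rule quoted above. The function name id_of_temporary is hypothetical. Whether the first comparison prints True depends on the interpreter and its memory allocator, so no result is claimed for it; only the second comparison is guaranteed.

```python
def id_of_temporary():
    """Create a list, return its id, and let the list die on return."""
    temp = [1, 2, 3]
    return id(temp)

first = id_of_temporary()
second = id_of_temporary()
# The two lists never existed at the same time, so their ids MAY be equal.
print("reused id:", first == second)

# Two objects that are alive at the same time must have different ids.
x = [1, 2, 3]
y = [1, 2, 3]
print("distinct while alive:", id(x) != id(y))  # True
```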

Two variables with the same list have different IDs.....why is that?

Every distinct object in Python has its own ID. It's not related to the contents -- it's related to the location where the information that describes the object is stored. Any distinct object stored in a distinct place will have a distinct id. (It's sometimes, but not always, the memory address of the object.)

This is especially important to understand for mutable objects -- that is, objects that can be changed, like lists. If an object can be changed, then you can create two different objects with the same contents. They will have different IDs, and if you change one later, the second will not change.

For immutable objects like integers and strings, this is less important, because the contents can never change. Even if two immutable objects have different IDs, they are effectively interchangeable if they have identical contents.

This set of ideas goes pretty deep. You can think of a variable name as a tag assigned to an ID number, which in turn uniquely identifies an object. Multiple variable names can be used to tag the same object. Observe:

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> id(a)
4532949432
>>> id(b)
4533024888

That, you've already discovered. Now let's create a new variable name:

>>> c = b
>>> id(c)
4533024888

No new object has been created. The object tagged with b is now tagged with c as well. What happens when we change a?

>>> a[1] = 1000
>>> a
[1, 1000, 3]
>>> b
[1, 2, 3]

a and b are different, as we know because they have different IDs. So a change to one doesn't affect the other. But b and c are the same object -- remember? So...

>>> b[1] = 2000
>>> b
[1, 2000, 3]
>>> c
[1, 2000, 3]

Now, if I assign a new value to b, it doesn't change anything about the objects themselves -- just the way they're tagged:

>>> b = a
>>> a
[1, 1000, 3]
>>> b
[1, 1000, 3]
>>> c
[1, 2000, 3]
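
One way to see the tagging model at a glance is to group names by the id of the object they refer to. The helper alias_groups below is hypothetical, written only for this illustration; it is a sketch, not a standard tool.

```python
from collections import defaultdict

def alias_groups(namespace):
    """Group variable names by the identity of the object they refer to."""
    groups = defaultdict(list)
    for name, obj in namespace.items():
        groups[id(obj)].append(name)
    # Sort for a stable, readable result.
    return sorted(sorted(names) for names in groups.values())

a = [1, 2, 3]
b = [1, 2, 3]
c = b                      # a second tag on the object b refers to
before = alias_groups({"a": a, "b": b, "c": c})
print(before)              # [['a'], ['b', 'c']]

b = a                      # move the tag b; no object is modified
after = alias_groups({"a": a, "b": b, "c": c})
print(after)               # [['a', 'b'], ['c']]
```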

Why do variables have same id when they are passed by value in Python

I have been studying functions in python and how the variables are passed by values and not reference since its a lot safer

I don't know where you read this, but that's just complete bullshit.

if that is the way it is

It isn't.

then why do the variable passed around have the same ID.

Because they point to the same object, obviously.

I initially used x=3 but then I read that how python caches variables from -5 to 256

That's a CPython implementation detail, and the exact values depend on the CPython version etc. But anyway, you can test this with any number you want, and actually just any type, the result will still be the same.

so I used 500 but it still shows the same id. If they have the same doesn't it mean that it is the same object passed around?

id(obj), by definition, returns the object's unique identifier (actually the memory address in CPython but that's also an implementation detail), so by definition, if two objects have the same id, then they are indeed the very same object.

NB: "unique" means that, within the current process, no other object will have the same id at the same time. Once an object is garbage-collected, its id can be reused.

FWIW, using a mutable object, it's quite easy to find out that it's not passed "by value":

def foo(lst):
    # Mutates the caller's list in place; nothing is copied.
    lst.append(42)

answers = []
for i in range(10):
    print("{} - before: {}".format(i, answers))
    foo(answers)
    print("{} - after: {}".format(i, answers))

0 - before: []
0 - after: [42]
1 - before: [42]
1 - after: [42, 42]
2 - before: [42, 42]
2 - after: [42, 42, 42]
3 - before: [42, 42, 42]
3 - after: [42, 42, 42, 42]
4 - before: [42, 42, 42, 42]
4 - after: [42, 42, 42, 42, 42]
5 - before: [42, 42, 42, 42, 42]
5 - after: [42, 42, 42, 42, 42, 42]
6 - before: [42, 42, 42, 42, 42, 42]
6 - after: [42, 42, 42, 42, 42, 42, 42]
7 - before: [42, 42, 42, 42, 42, 42, 42]
7 - after: [42, 42, 42, 42, 42, 42, 42, 42]
8 - before: [42, 42, 42, 42, 42, 42, 42, 42]
8 - after: [42, 42, 42, 42, 42, 42, 42, 42, 42]
9 - before: [42, 42, 42, 42, 42, 42, 42, 42, 42]
9 - after: [42, 42, 42, 42, 42, 42, 42, 42, 42, 42]
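
The same point can be sketched with ids directly, and it also shows why rebinding a parameter inside a function does not affect the caller while mutating it does. The function names id_inside, mutate and rebind are made up for this example.

```python
def id_inside(obj):
    """Return the id the function sees for its argument."""
    return id(obj)

def mutate(lst):
    lst.append(42)        # changes the caller's list in place

def rebind(lst):
    lst = lst + [42]      # builds a new list; only the local name moves
    return lst

n = 500
print(id_inside(n) == id(n))   # True: same object inside and outside

data = []
mutate(data)
print(data)                    # [42]

result = rebind(data)
print(data, result)            # [42] [42, 42]
print(result is data)          # False
```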

Why Python allocates new id to list, tuples, dict even though having same values?

Because otherwise this would happen:

x3 = [1,2,3]
y3 = [1,2,3]

x3[0] = "foo"
x3[0] == y3[0] # Does NOT happen!

In fact,

x3[0] != y3[0]

which is a Good Thing™. If x3 and y3 were the same object, changing one would change the other, too. That's generally not expected.

See also when does Python allocate new memory for identical strings? for why the behaviour is different for strings.

Also, use == if you want to compare values.

Do multiple immutable objects having the same value point to a single object in memory?

...it always point...

Often yes, but it is not guaranteed. It is a form of internal optimization in the Python implementation, known as interning (or caching).

You should look at it as something that does not matter for immutables, something transparent to the language user. If the object has a value that cannot change, it does not matter which instance of that type (with that value) you are reading. That is why the interpreter can get away with keeping only one.

As for tuples, note that the objects a tuple contains can still change if they are mutable; only the tuple itself cannot (that is, you cannot change the number of its elements or which objects they are).

So for immutables you do not have to worry.
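
The remark about tuples can be checked with a short sketch. The variable names are arbitrary; the behaviour shown (in-place mutation of a contained list is allowed, item assignment on the tuple is not) is standard Python.

```python
t = ([1, 2], "x")
tuple_id = id(t)
inner_id = id(t[0])

t[0].append(3)            # allowed: mutates the list the tuple refers to
print(t)                  # ([1, 2, 3], 'x')
print(id(t) == tuple_id, id(t[0]) == inner_id)  # True True

try:
    t[0] = [9]            # not allowed: the tuple's slots are fixed
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)
```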

For mutables, you should be careful, not with Python's internal optimizations but with the code you write. You can have many names referring to the same instance (which can now be changed through any one of these references), and one change will be reflected in all of them. This gets trickier when passing mutables as arguments, because far-away code can change the object (what was passed was a copy of the reference to the object, not a copy of the object itself).

It is your responsibility to manage things with mutables. You can create new instances with the same values (copies) or share the objects. You can even pass copies as arguments to protect yourself from unintended side effects of calls.
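
A minimal sketch of passing copies, assuming a hypothetical callee add_default that modifies its argument in place. It also shows the difference between a shallow and a deep copy from the standard copy module, which matters once the object contains other mutables.

```python
import copy

def add_default(tags):
    """Hypothetical callee that modifies its argument in place."""
    tags.append("default")
    return tags

original = ["a"]
returned = add_default(list(original))   # pass a shallow copy
print(original, returned)                # ['a'] ['a', 'default']

nested = [[1, 2], [3]]
shallow = copy.copy(nested)       # new outer list, shared inner lists
deep = copy.deepcopy(nested)      # new outer list and new inner lists
nested[0].append(99)
print(shallow[0])                 # [1, 2, 99]
print(deep[0])                    # [1, 2]
```

A shallow copy is enough for a flat list; for nested structures, only the deep copy is fully independent of the original.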

Why don't lists with the same values point to the same memory location in python?

Lists are mutable. You don't want changes to one ostensibly independent list modifying another that is coincidentally identical.

Strings, on the other hand, are immutable. You can't make changes to var1 that would affect var2, so it's possible to share the underlying object. Note that it is not guaranteed that two str literals produce the same object, though. It is implementation-dependent when and whether such caching occurs.
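
If you do want equal strings to share one object, the standard library's sys.intern makes that explicit. This is a sketch; var1 and var2 are built at run time so the compiler cannot merge them, and what `is` reports before interning is implementation-dependent, so no result is claimed for that line.

```python
import sys

# Built at run time so the compiler cannot merge them into one constant.
var1 = " ".join(["same", "value", "here"])
var2 = " ".join(["same", "value", "here"])

print(var1 == var2)        # True: equal contents
print(var1 is var2)        # implementation-dependent

shared1 = sys.intern(var1)
shared2 = sys.intern(var2)
print(shared1 is shared2)  # True: both names refer to the interned string
```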


