Why does id({}) == id({}) and id([]) == id([]) in CPython?

CPython frees an object as soon as its reference count drops to zero, so the first [] is deallocated before the second [] is created. Most of the time, the second list then ends up in the same memory location.

This shows what's happening very clearly (the output is likely to be different in other implementations of Python):

class A:
    def __init__(self): print("a")
    def __del__(self): print("b")

# a a b b False
print(A() is A())
# a b a b True
print(id(A()) == id(A()))
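For contrast, here is a minimal sketch (the variable names are only illustrative) of what changes once both objects are kept alive: their ids must differ, whereas two unreferenced temporaries may or may not share one.

```python
# Two lists that are both still referenced are alive at the same time,
# so their ids are guaranteed to differ.
first = []
second = []
live_ids_differ = id(first) != id(second)
print(live_ids_differ)

# Two temporaries: the first list is freed before the second is created,
# so CPython often (not always) reuses the address. No result is guaranteed.
temp_ids_match = id([]) == id([])
print(temp_ids_match)
```

Only the first comparison has a result you can rely on; the second is an accident of the allocator.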

id() python difference between the == and 'is'

You are seeing the limit of Python's integer interning here. The CPython implementation keeps a pool of small int objects (-5 through 256) in memory and reuses them as much as possible. That is why id(a) and id(1) return the same value; both a and the literal 1 refer to the same object. The id value itself, though, is a much larger integer (1844525312 in this case), well outside the cached range. That means Python is free to (and does) allocate separate int objects for the return values of id(a) and id(1), leading to the result you see. 1844525312 == 1844525312 is true, but id(a) and id(1) each return a separate object representing the same value, so id(a) is id(1) is false.
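The situation can be reproduced with a short sketch. It assumes CPython; another interpreter is free to give different results for both comparisons.

```python
a = 1

# In CPython, a and the literal 1 refer to the same cached small-int
# object, so the two id values compare equal.
same_value = id(a) == id(1)

# id() returns an ordinary (large) int. The two calls produce two separate
# int objects holding equal values, so `is` on them compares two objects.
separate_results = id(a) is id(1)

print(same_value, separate_results)
```

The takeaway is that == compares the numbers returned by id(), while is compares the int objects that happen to carry those numbers.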

Note that, given a = 1, id(a) == id(1) is not guaranteed to be true by Python itself; it's an implementation detail of a particular Python interpreter. An implementation is allowed to always allocate a new object for each new use, and it is equally allowed to always reuse an existing object where possible. The only time Python guarantees that id(a) == id(b) for separate names a and b is if one name is assigned directly to the other (b = a or a = b).
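A small sketch of the one guaranteed case, using a list so that no caching is involved (the names a, b and c are just for illustration):

```python
a = [1, 2, 3]
b = a            # b is bound to the very same object as a
c = list(a)      # c is a new list with equal contents

print(id(a) == id(b))  # one object, two names
print(id(a) == id(c))  # two live objects, so the ids differ
print(a == c)          # the contents are still equal
```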

id() vs `is` operator. Is it safe to compare `id`s? Does the same `id` mean the same object?

According to the id() documentation, an id is only guaranteed to be unique

  1. for the lifetime of the specific object, and
  2. within a specific interpreter instance

As such, comparing ids is not safe unless you also somehow ensure that both objects whose ids are taken are still alive at the time of comparison (and are associated with the same Python interpreter instance, but you need to really try to make that become false).

That is exactly what the is operator does, which makes comparing ids redundant. If you cannot use the is syntax for whatever reason, there's always operator.is_.
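As a quick sketch of operator.is_ in use (the variables x, y, z and position are made up for the example), it behaves like the is operator but can be passed around as a function:

```python
import operator

x = [1, 2]
y = x
z = [1, 2]

print(operator.is_(x, y))   # same object
print(operator.is_(x, z))   # equal contents, but a different object

# Handy where a callable is needed, e.g. finding an item by identity
# rather than by equality (list.index would match z first here).
position = next(i for i, item in enumerate([z, x]) if operator.is_(item, x))
print(position)
```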


Now, whether an object is still alive at the time of comparison is not always obvious (and sometimes is grossly non-obvious):

  • Accessing some attributes (e.g. bound methods of an object) creates a new object each time. So, the result's id may or may not be the same on each attribute access.

    Example:

    >>> class C(object): pass
    >>> c=C()
    >>> c.a=1

    >>> c.a is c.a
    True # same object each time

    >>> c.__init__ is c.__init__
    False # a different object each time

    # The above two are not the only possible cases.
    # An attribute may be implemented to sometimes return the same object
    # and sometimes a different one:
    @property
    def page(self):
        if check_for_new_version():
            self._page = get_new_version()
        return self._page

  • If an object is created as a result of evaluating an expression and not saved anywhere, it's immediately discarded [1], and any object created after that can take up its id.

    • This is even true within the same code line. E.g. the result of id(create_foo()) == id(create_bar()) is undefined.

      Example:

      >>> id([])     #the list object is discarded when id() returns
      39733320L
      >>> id([]) #a new, unrelated object is created (and discarded, too)
      39733320L #its id can happen to be the same
      >>> id([[]])
      39733640L #or not
      >>> id([])
      39733640L #you never really know
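The bound-method case from the first bullet can be sketched like this, assuming CPython's behaviour of building a new bound-method object on each attribute access (Greeter and hello are hypothetical names):

```python
class Greeter:
    def hello(self):
        return "hi"

g = Greeter()

# Each attribute access builds a fresh bound-method object in CPython.
m1 = g.hello
m2 = g.hello
print(m1 is m2)   # both are alive here, so this compares two objects
print(m1 == m2)   # bound methods compare equal when self and function match

# Without references, the temporaries can be freed between the two id()
# calls, so this comparison says nothing reliable.
print(id(g.hello) == id(g.hello))
```

If you need to know whether two bound methods refer to the same method of the same object, == is the comparison to use, not is or id().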

Due to the above safety requirements when comparing ids, saving an id instead of the object is not very useful, because you have to save a reference to the object itself anyway to ensure that it stays alive. Neither is there any performance gain: the implementation of is is as simple as comparing pointers.
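To make that concrete, here is a sketch of a hypothetical identity-keyed container (IdentitySet is not a standard class). It only works because it stores the object next to its id, which is exactly the "save a reference anyway" requirement:

```python
class IdentitySet:
    """Hypothetical helper: a set keyed by identity, usable for unhashable objects."""

    def __init__(self):
        # Maps id(obj) -> obj. Storing obj keeps it alive, which is what
        # prevents its id from being reused by some other object.
        self._items = {}

    def add(self, obj):
        self._items[id(obj)] = obj

    def __contains__(self, obj):
        return id(obj) in self._items


seen = IdentitySet()
first = [1, 2]
seen.add(first)

print(first in seen)    # the stored object itself
print([1, 2] in seen)   # an equal but distinct list
```

Had the container stored only the ids, a list created after first was freed could reuse its id and be reported as already seen.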


Finally, as an internal optimization (and implementation detail, so this may differ between implementations and releases), CPython reuses some often-used simple objects of immutable types. As of this writing, that includes small integers and some strings. So even if you got them from different places, their ids might coincide.

This does not (technically) violate the above id() documentation's uniqueness promises: the reused object stays alive through all the reuses.

This is also not a big deal because whether two variables point to the same object or not is only practical to know if the object is mutable: if two variables point to the same mutable object, mutating one will (unexpectedly) change the other, too. Immutable types don't have that problem, so for them, it doesn't matter if two variables point to two identical objects or to the same one.
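A short sketch of why identity only matters in practice for mutable objects (names chosen for the example):

```python
x = [1, 2]
y = x           # y refers to the same list object as x
y.append(3)     # the change is visible through x as well
print(x)

s = "abc"
t = s           # t refers to the same string object as s
t = t + "d"     # strings are immutable: this builds a new string
print(s, t)
```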


[1] Sometimes this is called an "unnamed expression".

What is the id( ) function used for?

Your post asks several questions:

What is the number returned from the function?

It is "an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime." (Python Standard Library - Built-in Functions) A unique number. Nothing more, and nothing less. Think of it as a social-security number or employee id number for Python objects.

Is it the same with memory addresses in C?

Conceptually, yes, in that they are both guaranteed to be unique in their universe during their lifetime. And in one particular implementation of Python (CPython), it actually is the memory address of the corresponding C object.

If yes, why doesn't the number increase instantly by the size of the data type (I assume that it would be int)?

Because a list is not an array, and a list element is a reference, not an object.
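A sketch of what "a list element is a reference" means for the ids you see (the lists below are made up for illustration):

```python
item = object()
refs = [item, item, item]
# All three slots refer to one object, so the ids are identical,
# not spaced apart by some element size.
element_ids = [id(e) for e in refs]
print(len(set(element_ids)))

mixed = [10**20, "text", 3.5]
# Three distinct live objects: each id identifies the object itself
# (its address, in CPython), not its position inside the list.
mixed_ids = [id(e) for e in mixed]
print(len(set(mixed_ids)))
```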

When do we really use id( ) function?

Hardly ever. You can test if two references are the same by comparing their ids, but the is operator has always been the recommended way of doing that. id( ) is only really useful in debugging situations.

python: why is id(x) != id(y) when x and y are lists with equal values?

The fact that the integers a and b have the same id is just a storage optimization that Python performs on some immutable objects. It cannot be relied upon: for example, if the numbers are big enough, the ids can be different.

Try to change the value of b and you'll see that id(b) changes.

Of course, it's different for lists. They cannot benefit from this storage optimization because they're mutable: you don't want x to be changed when you change y.
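Both points can be checked with a short sketch (x, y and b are example names):

```python
x = [1, 2, 3]
y = [1, 2, 3]
equal_values = x == y     # same contents
same_object = x is y      # but two separate list objects
print(equal_values, same_object)

y.append(4)               # changing y leaves x alone
print(x, y)

b = 5
old_id = id(b)
b = b + 1                 # rebinding: b now refers to a different int object
id_changed = id(b) != old_id
print(id_changed)
```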

python id() function implementation

There are multiple implementations of Python. In CPython, all objects have a standard header, and the id is the memory address of that header. References to objects are C pointers to the object header (the same memory address that is the id). You can't use a dunder method to find an object, because you need the object pointer to find the dunder methods.

Python source is compiled into bytecode, and in CPython that bytecode is executed by an interpreter written in C. When you call a function like id, that function can itself be more bytecode, but it can also be a C function. Search for "builtin_id" in bltinmodule.c and you'll see the C implementation of id(some_object).

static PyObject *
builtin_id(PyModuleDef *self, PyObject *v)
/*[clinic end generated code: output=0aa640785f697f65 input=5a534136419631f4]*/
{
    PyObject *id = PyLong_FromVoidPtr(v);

    if (id && PySys_Audit("builtins.id", "O", id) < 0) {
        Py_DECREF(id);
        return NULL;
    }

    return id;
}

The id function is called with PyObject *v, a pointer to the object whose id should be taken. PyObject is the standard object header used by all Python objects. It includes the information needed to figure out what type the object really is. The id function turns the object pointer into a Python integer with PyLong_FromVoidPtr (the name "long" for a Python int is somewhat historical). That's the id you see at the Python level.
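From the Python side, you can check that the id really is the object pointer with ctypes. This is a CPython-only sketch and relies on exactly the implementation detail described above; it is only safe while the object is still alive.

```python
import ctypes

obj = ["some", "list"]
address = id(obj)

# CPython only: treat the integer as a PyObject* and get the object back.
# Doing this with the id of an object that has been freed is undefined
# behaviour and can crash the interpreter.
recovered = ctypes.cast(address, ctypes.py_object).value
print(recovered is obj)
```

This is a debugging curiosity, not something to build on; keeping a normal reference to the object is always the safer choice.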

You can get the CPython source on GitHub, and you can read up on the C API in the Python docs at Extending and Embedding the Python Interpreter and Python/C API Reference Manual.


