Does Python Have a Stack/Heap and How Is Memory Managed

Does Python have a stack/heap and how is memory managed?

How are variables and memory managed in Python.

Automagically! No, really, you just create an object and the Python Virtual Machine handles the memory needed and where it shall be placed in the memory layout.

Does it have a stack and a heap and what algorithm is used to manage
memory?

When we are talking about CPython it uses a private heap for storing objects. From the CPython C API documentation:

Memory management in Python involves a private heap containing all
Python objects and data structures. The management of this private
heap is ensured internally by the Python memory manager. The Python
memory manager has different components which deal with various
dynamic storage management aspects, like sharing, segmentation,
preallocation or caching.

Memory reclamation is mostly handled by reference counting. That is, the Python VM keeps an internal journal of how many references refer to an object, and automatically garbage collects it when there are no more references referring to it. In addition, there is a mechanism to break circular references (which reference counting can't handle) by detecting unreachable "islands" of objects, somewhat in reverse of traditional GC algorithms that try to find all the reachable objects.

NOTE: Please keep in mind that this information is CPython specific. Other python implementations, such as pypy, iron python, jython and others may differ from one another and from CPython when it comes to their implementation specifics. To understand that better, it may help to understand that there is a difference between Python the semantics (the language) and the underlying implementation

Given this knowledge are there any recommendations on memory management for large number/data crunching?

Now I can not speak about this, but I am sure that NumPy (the most popular python library for number crunching) has mechanisms that handle memory consumption gracefully.

If you would like to know more about Python's Internals take a look at these resources:

  • Stepping through CPython (video)
  • A presentation about the internals of the Python Virtual Machine
  • In true hacker spirit, the CPython Object Allocator source code

CPython - Internally, what is stored on the stack and heap?

All Python objects in the CPython implementation go on the heap. You can read in detail how Python's memory management works here in the documentation:

Memory management in Python involves a private heap containing all Python objects and data structures. The management of this private heap is ensured internally by the Python memory manager. The Python memory manager has different components which deal with various dynamic storage management aspects, like sharing, segmentation, preallocation or caching.

Note that Python itself is just a language, and says nothing about how internals like memory management should work; this is a detail left to implementers.

Where are the variables in Python stored?

Copying from Python documentation:

Memory management in Python involves a private heap containing all Python objects and data structures. The management of this private heap is ensured internally by the Python memory manager. The Python memory manager has different components which deal with various dynamic storage management aspects, like sharing, segmentation, preallocation or caching.

What and where are the stack and heap?

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration:
Sample Image

Image source: vikashazrati.wordpress.com

Does Python have static objects, stack objects and heap objects?

Python objects are mostly heap objects - however, there are some special PyObject singleton values in CPython that are static in C; though this is an implementation detail. For example the usual built-in types have static storage duration. There are no stack (Python) objects that I know of.

The static storage duration, as understood here, has absolutely nothing to do with static methods.

What is the stack in Python?

Oversimplifying slightly:

In CPython, when PyEval_EvalFrameEx is evaluating a Python stack frame's code, and comes to a direct function call, it allocates a new Python stack frame, links it up… and then recursively calls PyEval_EvalFrameEx on that new frame.

So, the C stack is a stack of recursive calls of the interpreter loop.

The Python stack is a stack of Python frame objects, implemented as a simple linked list of heap-allocated objects.

They're not completely unrelated, but they're not the same thing.

When you use generators, this gets slightly more confusing, because those Python stack frames can be unlinked and relinked in different places when they're resumed. Which is why the two stacks are separate. (See Ned's answer, which explains this better than I could.)



Related Topics



Leave a reply



Submit