Releasing Memory in Python

Memory allocated on the heap can be subject to high-water marks: the heap only shrinks from the top, so a single live block near the top keeps everything below it mapped. This is complicated by Python's internal optimizations for allocating small objects (PyObject_Malloc) in 4 KiB pools, classed for allocation sizes at multiples of 8 bytes, up to 256 bytes (512 bytes in 3.3). The pools themselves live in 256 KiB arenas, so if even one block in one pool is in use, the entire 256 KiB arena cannot be released. In Python 3.3 the small object allocator switched to anonymous memory maps instead of the heap, so it should do better at releasing memory.
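To make those numbers concrete, here is a small sketch of the arithmetic only. It is not CPython's allocator code: the helper names are hypothetical, the constants are the ones quoted in the paragraph above, and pool and arena headers are ignored.

```python
POOL_SIZE = 4 * 1024            # 4 KiB pool, as quoted above
ARENA_SIZE = 256 * 1024         # 256 KiB arena, as quoted above
ALIGNMENT = 8                   # size classes are multiples of 8 bytes
SMALL_REQUEST_THRESHOLD = 256   # 512 in Python 3.3, per the text


def size_class(nbytes):
    """Round a request up to its size class (hypothetical helper).

    Returns None for sizes outside the range this sketch models.
    """
    if not 0 < nbytes <= SMALL_REQUEST_THRESHOLD:
        return None
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


def max_blocks_per_pool(nbytes):
    """Upper bound on blocks of this size class in one pool (header ignored)."""
    cls = size_class(nbytes)
    return None if cls is None else POOL_SIZE // cls


# Number of pools that share the fate of one arena.
POOLS_PER_ARENA = ARENA_SIZE // POOL_SIZE
```

Under these assumptions one arena holds 64 pools, so a single surviving 32-byte block is enough to keep all 256 KiB allocated.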

Additionally, the built-in types maintain freelists of previously allocated objects that may or may not use the small object allocator. In Python 2, the int type maintains a freelist with its own allocated memory, and clearing it requires calling PyInt_ClearFreeList(). This can be called indirectly by doing a full gc.collect().

Try it like this, and tell me what you get. The snippet is Python 2 and uses psutil.Process.memory_info to read the process's resident set size (rss).

import os
import gc
import psutil

proc = psutil.Process(os.getpid())
gc.collect()
mem0 = proc.memory_info().rss

# Python 2: range() builds a list of approx. 10**7 int objects;
# foo itself holds 10**7 pointers to the same 'abc' string
foo = ['abc' for x in range(10**7)]
mem1 = proc.memory_info().rss

# unreference, including x == 9999999 (the comprehension variable leaks in Python 2)
del foo, x
mem2 = proc.memory_info().rss

# collect() calls PyInt_ClearFreeList()
# or use ctypes: pythonapi.PyInt_ClearFreeList()
gc.collect()
mem3 = proc.memory_info().rss

pd = lambda x2, x1: 100.0 * (x2 - x1) / mem0
print "Allocation: %0.2f%%" % pd(mem1, mem0)
print "Unreference: %0.2f%%" % pd(mem2, mem1)
print "Collect: %0.2f%%" % pd(mem3, mem2)
print "Overall: %0.2f%%" % pd(mem3, mem0)

Output:

Allocation: 3034.36%
Unreference: -752.39%
Collect: -2279.74%
Overall: 2.23%

Edit:

I switched to measuring relative to the process VM size to eliminate the effects of other processes in the system.

The C runtime (e.g. glibc, msvcrt) shrinks the heap when contiguous free space at the top reaches a constant, dynamic, or configurable threshold. With glibc you can tune this with mallopt (M_TRIM_THRESHOLD). Given this, it isn't surprising if the heap shrinks by more -- even a lot more -- than the block that you free.

In 3.x range doesn't create a list, so the test above won't create 10 million int objects. Even if it did, the int type in 3.x is basically a 2.x long, which doesn't implement a freelist.
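A quick way to see the first point on Python 3 is to compare the size of a range object with the size of a list built from one. This is only a sketch: the variable names are made up for illustration, and sys.getsizeof reports the container's own size, not the objects it points to.

```python
import sys

# In Python 3, range() is a lazy sequence: it stores start, stop and step,
# not ten million int objects.
lazy = range(10**7)
lazy_size = sys.getsizeof(lazy)

# Materializing even a much smaller range into a list allocates one pointer
# per element, so the list is already far bigger than the range object.
materialized = list(range(10**4))
list_size = sys.getsizeof(materialized)

print("range(10**7):       %d bytes" % lazy_size)
print("list(range(10**4)): %d bytes" % list_size)
```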

How can I explicitly free memory in Python?

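According to the official Python documentation, you can explicitly invoke the garbage collector with gc.collect() to release unreferenced memory. In CPython this matters mainly for objects caught in reference cycles; anything else is freed as soon as its reference count drops to zero. Example: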

import gc

gc.collect()

You should do that after marking what you want to discard using del:

del my_array
del my_object
gc.collect()
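Whether the gc.collect() call actually frees anything depends on whether the deleted objects were part of a reference cycle. The sketch below illustrates that on CPython using weak references as probes; Node and observe_collection are hypothetical names invented for this example, and automatic collection is switched off only to make the demonstration deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        self.other = None


def observe_collection():
    was_enabled = gc.isenabled()
    gc.disable()  # only the explicit gc.collect() below runs the collector
    try:
        # Case 1: no cycle. del drops the last reference and CPython's
        # reference counting frees the object right away.
        plain = Node()
        plain_ref = weakref.ref(plain)
        del plain
        freed_by_refcount = plain_ref() is None

        # Case 2: two objects referring to each other. After del they are
        # unreachable, but each still has a nonzero reference count.
        a, b = Node(), Node()
        a.other, b.other = b, a
        cycle_ref = weakref.ref(a)
        del a, b
        survives_del = cycle_ref() is not None

        # The cycle collector finds and frees the unreachable pair.
        gc.collect()
        freed_by_collect = cycle_ref() is None
        return freed_by_refcount, survives_del, freed_by_collect
    finally:
        if was_enabled:
            gc.enable()
```

So del followed by gc.collect() only adds something over plain del when cycles are involved.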

Python not releasing memory in for loop

Delete all the temporary variables before calling gc.collect(), so that the data will become garbage immediately.

del start, stop, val_i, dist_i, index_i, pred_i
gc.collect()

In your code, when you call gc.collect() the first time none of the data is garbage, because it can still be referenced from all the variables. The data from the first iteration won't be collected until the end of the second iteration; during each iteration after the first, you'll have two chunks of data in memory (the current iteration and the previous iteration). So you're using twice as much memory as you need (I assume there are references between some of the objects, so automatic GC is not cleaning up objects as the variables are reassigned during the loop).
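The effect can be reproduced with a toy loop. This is a sketch under assumptions, not the asker's code: Chunk and peak_live_chunks are hypothetical, each chunk deliberately contains a reference cycle (as assumed above), and automatic collection is disabled so that only the explicit gc.collect() calls free anything.

```python
import gc
import weakref


class Chunk:
    """Hypothetical per-iteration data that contains a reference cycle."""

    def __init__(self, n):
        self.data = list(range(n))
        self.owner = self  # cycle: reference counting alone cannot free this


def peak_live_chunks(iterations, delete_before_collect):
    """For each iteration, count the chunks alive right after a new one is built."""
    was_enabled = gc.isenabled()
    gc.disable()  # only the explicit gc.collect() calls run the collector
    try:
        refs, peaks = [], []
        for _ in range(iterations):
            chunk = Chunk(1000)
            refs.append(weakref.ref(chunk))
            peaks.append(sum(r() is not None for r in refs))
            if delete_before_collect:
                del chunk  # the data becomes garbage before collect() runs
            gc.collect()
        return peaks
    finally:
        if was_enabled:
            gc.enable()
```

Without the del, every iteration after the first has the previous chunk still in memory when the new one is built; with the del, only one chunk is alive at a time.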

How to release used memory immediately in a Python list?

def release_list(a):
    del a[:]
    del a

Do not ever do this. Python automatically frees all objects that are not referenced any more, so a simple del a ensures that the list's memory will be released if the list isn't referenced anywhere else. If that's the case, then the individual list items will also be released (and any objects referenced only from them, and so on and so on), unless some of the individual items were also still referenced.

That means the only time when del a[:]; del a will release more than del a on its own is when the list is referenced somewhere else. This is precisely when you shouldn't be emptying out the list: someone else is still using it!
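A short sketch of that failure mode, with made-up variable names: the function from the question empties the list in place, so every other reference to the same list sees it emptied too.

```python
def release_list(a):
    del a[:]  # empties the list object itself, for every holder of a reference
    del a     # only unbinds the local name; the caller's name is unaffected


shared = [1, 2, 3]
alias = shared          # a second reference, e.g. held by some other component

release_list(shared)

# Both names still refer to the same (now empty) list object.
print(shared, alias)
```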

Basically, you shouldn't be thinking about managing pieces of memory. Instead, think about managing references to objects. In 99% of all Python code, Python cleans up everything you don't need pretty soon after the last time you needed it, and there's no problem. Every time a function finishes all the local variables in that function "die", and if they were pointing to objects that are not referenced anywhere else they'll be deleted, and that will cascade to everything contained within those objects.

The only time you need to think about it is when you have a large object (say a huge list), you do something with it, and then you begin a long-running (or memory intensive) sub-computation, where the large object isn't needed for the sub-computation. Because you have a reference to it, the large object won't be released until the sub-computation finishes and then you return. In that sort of case (and only that sort of case), you can explicitly del your reference to the large object before you begin the sub-computation, so that the large object can be freed earlier (if no-one else is using it; if a caller passed the object in to you and the caller does still need it after you return, you'll be very glad that it doesn't get released).
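Here is a minimal sketch of that one case, assuming CPython. BigData, long_subcomputation and pipeline are hypothetical names; BigData is a list subclass only because plain lists cannot be weakly referenced, and the weak reference is just a probe to show whether the large object is still alive during the sub-computation.

```python
import weakref


class BigData(list):
    """Hypothetical large object; subclassing list allows weak references."""


def long_subcomputation(probe):
    # Stand-in for a long-running step that does not need the large object.
    # It reports whether the large object is still in memory while it runs.
    return probe() is not None


def pipeline(release_early):
    big = BigData(range(100_000))
    total = sum(big)            # the only step that needs the large object
    probe = weakref.ref(big)
    if release_early:
        del big                 # drop our reference before the long step
    still_alive = long_subcomputation(probe)
    return total, still_alive
```

With release_early=True the object is already gone while the sub-computation runs; with release_early=False it stays alive until pipeline returns.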


