Python Memory Leaks

Python memory leaks

Have a look at this article: Tracing python memory leaks

Also note that the gc (garbage collection) module lets you set debug flags; see its set_debug function. In addition, look at this code by Gnibbler for determining which types of objects have been created after a call.
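As a rough sketch of both ideas (this is not Gnibbler's original code), the helper below counts the objects tracked by the collector, grouped by type, before and after a call, and the final lines use gc.set_debug with the gc.DEBUG_SAVEALL flag to keep unreachable objects in gc.garbage for inspection. The names count_types, new_objects_by_type, Node and make_nodes are hypothetical and exist only for this illustration.

```python
import gc
from collections import Counter


def count_types():
    """Count objects currently tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def new_objects_by_type(func, *args, **kwargs):
    """Call func and report which types gained tracked objects (hypothetical helper)."""
    gc.collect()
    before = count_types()
    result = func(*args, **kwargs)  # keep the result alive while counting
    gc.collect()
    after = count_types()
    growth = {name: after[name] - before[name]
              for name in after if after[name] > before[name]}
    return result, growth


class Node:
    """Illustrative class used only for this example."""


def make_nodes(n):
    return [Node() for _ in range(n)]


nodes, growth = new_objects_by_type(make_nodes, 100)
print(growth.get("Node"))

# DEBUG_SAVEALL stores unreachable objects in gc.garbage instead of freeing them.
gc.set_debug(gc.DEBUG_SAVEALL)
a = Node()
a.self_ref = a  # reference cycle: unreachable once the name is deleted
del a
gc.collect()
saved = [obj for obj in gc.garbage if isinstance(obj, Node)]
print(len(saved))

# Restore normal behaviour and release the saved objects.
gc.set_debug(0)
gc.garbage.clear()
```

The growth dictionary also picks up a few objects created by the helper itself (the Counter, for instance), so in practice it is the unexpectedly large entries that deserve attention.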

What is the best way to detect memory leaks in a PyQt application?

Memory Profiling Using tracemalloc

tracemalloc is a module included in the Python standard library (since Python 3.4).

It provides detailed, block-level traces of memory allocation, including the full traceback to the line where the memory allocation occurred, and statistics for the overall memory behavior of a program.

tracemalloc can be used to locate high-memory-usage areas of code in two ways:

  • looking at cumulative statistics on memory use to identify which object allocations are using the most memory, and
  • tracing execution frames to identify where those objects are allocated in the code.
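A minimal sketch of those two approaches, assuming an illustrative workload (the data list below is made up for the example): the snapshot is first grouped by source line to see which allocations use the most memory, then by traceback to see where the largest one was made.

```python
import tracemalloc

tracemalloc.start(10)  # keep up to 10 frames per allocation traceback

# Illustrative workload: allocate a batch of strings.
data = [str(i) * 50 for i in range(20000)]

snapshot = tracemalloc.take_snapshot()

# 1) Cumulative statistics grouped by source line.
by_line = snapshot.statistics("lineno")
for stat in by_line[:3]:
    print(stat)

# 2) Full traceback for the single largest allocation site.
by_traceback = snapshot.statistics("traceback")
largest = by_traceback[0]
print(f"{largest.count} blocks, {largest.size / 1024:.1f} KiB")
for line in largest.traceback.format():
    print(line)

tracemalloc.stop()
```

Storing more frames per traceback gives more context but makes tracing slower and more memory-hungry, so the value 10 here is only a starting point.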

Links to the documentation and the PEP are below. Both provide excellent instructions on how to detect anomalies in Python's memory management.

  • tracemalloc documentation

  • PEP 454

Under what circumstances could an issue that looks like a Python memory leak not be a leak?

In our case the cause of what looked like a leak was our Python code consuming RAM faster than the Python garbage collector was willing to clean up the garbage.

The solution in our case was to force a manual garbage collection at the end of each unit of work in our script, as follows:

gc.collect()

This brought memory under control.
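To illustrate the pattern (a sketch only, not the script described above), the loop below runs a hypothetical unit of work that leaves reference cycles behind and then calls gc.collect(), which returns the number of unreachable objects it found. Record and process_unit are made-up names, and gc.disable() is there only to keep the example deterministic.

```python
import gc


class Record:
    """Illustrative object that ends up in a reference cycle."""

    def __init__(self):
        self.me = self  # self-reference: only the cyclic collector can free this


def process_unit(n):
    """Hypothetical unit of work that leaves cyclic garbage behind."""
    for _ in range(n):
        Record()


gc.disable()  # only so the example is deterministic
collected_per_unit = []
for _ in range(3):
    process_unit(1000)
    # Force a full collection at the end of each unit of work.
    collected_per_unit.append(gc.collect())
gc.enable()

print(collected_per_unit)
```

Objects that are not part of a cycle are freed by reference counting as soon as the last reference goes away, so a manual collection only helps when the garbage is cyclic.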

We used the tracemalloc library to confirm that the code that seemed to be leaking was not. On each iteration we collected garbage and took a snapshot, ran the suspect code, collected garbage and took a second snapshot, then compared the two snapshots to check that no additional memory remained allocated.

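import gc
import tracemalloc

tracemalloc.start()  # tracing must be active before snapshots can be taken

for _ in range(10000):
    # Collect first so leftover garbage does not show up as a difference.
    gc.collect()
    snapshot1 = tracemalloc.take_snapshot()

    response = test_parsing("assets.xml")  # the code suspected of leaking
    del response

    gc.collect()
    snapshot2 = tracemalloc.take_snapshot()

    # Compare allocations line by line and print anything that changed.
    top_stats = snapshot2.compare_to(snapshot1, 'lineno')
    print("[ Non Zero differences ]")
    for stat in top_stats:
        if stat.size_diff != 0:
            print(stat)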

In our case the list of non-zero differences above was empty after each iteration, which showed that the code was not leaking any memory that tracemalloc tracks.

Is there a memory leak or do I not understand garbage collection and memory management?

The problem was solved by the author of the Griddly library that I am using. The cloned environments weren't reachable by Python garbage collection and there was a memory leak in the underlying C++ implementation.

How can I prevent this memory leak?

The problem is neither in Python nor its interface to C++. The problem is in Box2D, which is used by some of the OpenAI Gym environments.

I can repeat the above code while creating a different environment that doesn't use Box2D (such as "CartPole-v1") and let it run endlessly without any memory leak. As soon as I put a Box2D environment back in (such as "BipedalWalker-v3" or "LunarLander-v2"), the memory leak comes back.

I can repeat the above process entirely in Python and get the same results. Even with manually running garbage collection after every environment destruction, the memory allocated by the application grows without limit.

An environment's reset function does much of the preparation for a run, and that is where the memory leak occurs. If Box2D environments are created and destroyed endlessly, there is no memory leak. Created, reset, then destroyed? Memory leak.

Thank you all for the help, but this is a bug in the underlying library. I'll need to go submit it there.


