Python memory leaks
Have a look at this article: Tracing Python memory leaks.
Also, note that the garbage collection module (gc) can have debug flags set; see the gc.set_debug function. Additionally, look at this code by Gnibbler for determining the types of objects that have been created after a call.
What is the best way to detect memory leaks in a PyQt application?
Memory Profiling Using tracemalloc
tracemalloc is a module included in the Python standard library. It provides detailed, block-level traces of memory allocation, including the full traceback to the line where the memory allocation occurred, and statistics for the overall memory behavior of a program.
tracemalloc can be used to locate high-memory-usage areas of code in two ways:
- looking at cumulative statistics on memory use to identify which object allocations are using the most memory, and
- tracing execution frames to identify where those objects are allocated in the code.
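The two approaches listed above might look roughly like this. It is a minimal sketch, assuming a made-up workload (the data list) in place of the code under investigation; the frame depth of 10 is an arbitrary choice.

```python
import tracemalloc

tracemalloc.start(10)  # keep up to 10 frames per allocation traceback

# Hypothetical workload standing in for the code under investigation.
data = [bytes(1000) for _ in range(1000)]

snapshot = tracemalloc.take_snapshot()

# 1. Cumulative statistics: which source lines hold the most memory.
top_by_line = snapshot.statistics("lineno")
for stat in top_by_line[:3]:
    print(stat)

# 2. Tracebacks: where the largest group of allocations was made.
top_by_traceback = snapshot.statistics("traceback")
biggest = top_by_traceback[0]
print(f"{biggest.count} blocks, {biggest.size / 1024:.1f} KiB")
for line in biggest.traceback.format():
    print(line)

tracemalloc.stop()
```

Grouping by "lineno" is usually enough to find the hot spot; "traceback" is more useful when the allocating line is in a helper that is called from many places.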
Links to the documentation and the PEP are below. Both provide excellent instructions on how to detect anomalies in Python's memory management.
tracemalloc documentation
PEP 454
Under what circumstances could an issue that looks like a python memory leak not be a leak?
In our case, what looked like a leak was caused by our Python code consuming RAM faster than the Python garbage collector was willing to clean up the garbage.
The solution in our case was to force a manual garbage collection at the end of each unit of work in our script, as follows:
gc.collect()
This brought memory under control.
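A minimal sketch of that pattern follows, assuming the garbage consists of reference cycles (in CPython, objects without cycles are freed by reference counting as soon as they become unreachable, so gc.collect() only matters for cycles). Record and process are hypothetical stand-ins for the real workload.

```python
import gc


class Record:
    """Hypothetical object that ends up in a reference cycle."""

    def __init__(self):
        self.me = self  # self-reference: only the cyclic collector can free it


def process(unit):
    """Hypothetical unit of work that leaves cyclic garbage behind."""
    for _ in range(unit):
        Record()


collected_per_unit = []
for unit in [100, 200, 300]:
    process(unit)
    # Force a full collection now instead of waiting for the GC thresholds.
    collected_per_unit.append(gc.collect())

print(collected_per_unit)
```

gc.collect() returns the number of unreachable objects it found, which makes it easy to log how much garbage each unit of work leaves behind.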
We used the tracemalloc library to confirm that the code that seemed to be leaking was not. On each iteration the garbage was collected and a snapshot was taken, both before and after the call. The two snapshots were then compared to show that no additional memory was being allocated.
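import gc
import tracemalloc

tracemalloc.start()  # tracing must be on before snapshots can be taken

for _ in range(10000):
    gc.collect()
    snapshot1 = tracemalloc.take_snapshot()
    response = test_parsing("assets.xml")  # our function under test, defined elsewhere
    del response
    gc.collect()
    snapshot2 = tracemalloc.take_snapshot()
    # Compare allocations per source line before and after the call.
    top_stats = snapshot2.compare_to(snapshot1, 'lineno')
    print("[ Non Zero differences ]")
    for stat in top_stats:
        if stat.size_diff != 0:
            print(stat)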
In our case the Non Zero differences list above was empty after each iteration, proving there was no leak.
Is there a memory leak or do I not understand garbage collection and memory management?
The problem was solved by the author of the Griddly library that I am using. The cloned environments weren't reachable by Python garbage collection and there was a memory leak in the underlying C++ implementation.
How can I prevent this memory leak?
The problem is neither in Python nor its interface to C++. The problem is in Box2D, which is used by some of the OpenAI Gym environments.
I can repeat the above code while creating a different environment that doesn't use Box2D (such as "CartPole-v1") and let it run endlessly without any memory leak. As soon as I put a Box2D environment back in (such as "BipedalWalker-v3" or "LunarLander-v2"), the memory leak comes back.
I can repeat the above process entirely in Python and get the same results. Even with manually running garbage collection after every environment destruction, the memory allocated by the application grows without limit.
The reset function is where an environment does most of its preparation to run, and it is where the memory leak occurs. If Box2D environments are created and destroyed endlessly, there is no memory leak. Created, reset, then destroyed? Memory leak.
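tracemalloc only sees memory allocated through Python's own allocators, so a leak inside a native library is easier to spot from process-level memory. The sketch below shows one way to do that. It is Unix-only because it uses the resource module, and create_reset_destroy is a made-up placeholder rather than the actual Gym code.

```python
import gc
import resource


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(cycle, iterations=50):
    """Run cycle() repeatedly and return the peak RSS recorded after each run."""
    samples = []
    for _ in range(iterations):
        cycle()
        gc.collect()  # rule out ordinary Python garbage as the cause
        samples.append(peak_rss())
    return samples


def create_reset_destroy():
    # Hypothetical stand-in for creating, resetting and closing an environment.
    scratch = bytearray(1_000_000)
    del scratch


samples = rss_growth(create_reset_destroy, iterations=20)
print(f"peak RSS change over run: {samples[-1] - samples[0]}")
```

If the peak keeps climbing across iterations even with gc.collect() after each one, while a tracemalloc comparison shows no growth, the memory is most likely being held outside Python.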
Thank you all for the help, but this is a bug in the underlying library. I'll need to go submit it there.