How do I profile memory usage in Python?
This has already been answered here: Python memory profiler
In short, you do something like this (quoted from the Guppy-PE documentation):
>>> from guppy import hpy; h=hpy()
>>> h.heap()
Partition of a set of 48477 objects. Total size = 3265516 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25773  53  1612820  49    1612820  49 str
     1  11699  24   483960  15    2096780  64 tuple
     2    174   0   241584   7    2338364  72 dict of module
     3   3478   7   222592   7    2560956  78 types.CodeType
     4   3296   7   184576   6    2745532  84 function
     5    401   1   175112   5    2920644  89 dict of class
     6    108   0    81888   3    3002532  92 dict (no owner)
     7    114   0    79632   2    3082164  94 dict of type
     8    117   0    51336   2    3133500  96 type
     9    667   1    24012   1    3157512  97 __builtin__.wrapper_descriptor
<76 more rows. Type e.g. '_.more' to view.>
>>> h.iso(1,[],{})
Partition of a set of 3 objects. Total size = 176 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  33      136  77        136  77 dict (no owner)
     1      1  33       28  16        164  93 list
     2      1  33       12   7        176 100 int
>>> x=[]
>>> h.iso(x).sp
0: h.Root.i0_modules['__main__'].__dict__['x']
>>>
Which Python memory profiler is recommended?
guppy3 is quite simple to use. At some point in your code, you have to write the following:
from guppy import hpy
h = hpy()
print(h.heap())
This gives you some output like this:
Partition of a set of 132527 objects. Total size = 8301532 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  35144  27  2140412  26    2140412  26 str
     1  38397  29  1309020  16    3449432  42 tuple
     2    530   0   739856   9    4189288  50 dict (no owner)
You can also find out from where objects are referenced and get statistics about that, but somehow the docs on that are a bit sparse.
There is a graphical browser as well, written in Tk.
For Python 2.x, use Heapy.
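The idea of finding out where an object is referenced from can be illustrated without guppy at all. The sketch below is not guppy's API; it swaps in the standard library's gc.get_referrers as a rough approximation, and the Payload class and referrer_types helper are hypothetical names used only for this illustration.

```python
import gc


class Payload:
    """Hypothetical object whose referrers we want to inspect."""


def referrer_types(obj):
    """Return the type names of the containers that refer directly to obj."""
    # gc.get_referrers only reports objects tracked by the garbage collector,
    # so this is an approximation rather than a full reference graph.
    return sorted(type(ref).__name__ for ref in gc.get_referrers(obj))


if __name__ == "__main__":
    payload = Payload()
    in_list = [payload]
    in_dict = {"key": payload}
    # The exact result varies between interpreter versions (frames or the
    # module namespace may show up), but the list and the dict are included.
    print(referrer_types(payload))
```

Unlike h.iso(x).sp, this only reports direct referrers and does not compute a path from the root, so it is best treated as a quick check.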
A module to profile peak memory usage of Python code
An easy way to use memory_usage to get the peak (maximum) memory of a block of code is to first put that code in a function, and then pass that function itself, without the () call, to memory_usage() as the proc argument:
from memory_profiler import memory_usage
def myfunc():
    # code
    return
mem = max(memory_usage(proc=myfunc))
print("Maximum memory used: {} MiB".format(mem))
Other arguments allow you to collect timestamps, return values, pass arguments to myfunc, etc. The docstring seems to be the only complete source of documentation on this:
https://github.com/fabianp/memory_profiler/blob/master/memory_profiler.py
https://github.com/fabianp/memory_profiler/blob/4089e3ed4d5c4197925a2df8393d4cbfca745ae5/memory_profiler.py#L244
Total memory used by Python process?
Here is a useful solution that works for various operating systems, including Linux, Windows, etc.:
import os, psutil
process = psutil.Process(os.getpid())
print(process.memory_info().rss) # in bytes
Notes:
Run pip install psutil if it is not installed yet.
A handy one-liner if you quickly want to know how many MiB your process takes:
import os, psutil; print(psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2)
With Python 2.7 and psutil 5.6.3, it was process.memory_info()[0] instead (there was a change in the API later).
Tracking *maximum* memory usage by a Python function
This question seemed rather interesting, and it gave me a reason to look into Guppy / Heapy; for that I thank you.
I tried for about 2 hours to get Heapy to monitor a function call / process without modifying its source, with zero luck.
I did find a way to accomplish your task using the built-in Python library resource. Note that the documentation does not indicate the unit of the ru_maxrss value. Another SO user noted that it was in kB. Running Mac OSX 7.3 and watching my system resources climb during the test code below, I believe the returned values to be in bytes, not kilobytes. (The unit is platform-dependent: Linux reports kilobytes, while macOS reports bytes.)
A 10,000 ft view of how I used the resource library to monitor the library call: launch the function in a separate (monitorable) thread and track the system resources for that process in the main thread. Below are the two files that you'd need to run to test it out.
Library Resource Monitor - whatever_you_want.py
import resource
import time

from stoppable_thread import StoppableThread


class MyLibrarySniffingClass(StoppableThread):
    def __init__(self, target_lib_call, arg1, arg2):
        super(MyLibrarySniffingClass, self).__init__()
        self.target_function = target_lib_call
        self.arg1 = arg1
        self.arg2 = arg2
        self.results = None

    def startup(self):
        # Overload the startup function
        print("Calling the Target Library Function...")

    def cleanup(self):
        # Overload the cleanup function
        print("Library Call Complete")

    def mainloop(self):
        # Start the library call
        self.results = self.target_function(self.arg1, self.arg2)
        # Kill the thread when complete
        self.stop()


def SomeLongRunningLibraryCall(arg1, arg2):
    max_dict_entries = 2500
    delay_per_entry = .005
    some_large_dictionary = {}
    dict_entry_count = 0
    while True:
        time.sleep(delay_per_entry)
        dict_entry_count += 1
        # list(...) so that the values are materialized on Python 3 as well
        some_large_dictionary[dict_entry_count] = list(range(10000))
        if len(some_large_dictionary) > max_dict_entries:
            break
    print(arg1 + " " + arg2)
    return "Good Bye World"


if __name__ == "__main__":
    # Lib Testing Code
    mythread = MyLibrarySniffingClass(SomeLongRunningLibraryCall, "Hello", "World")
    mythread.start()
    start_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    delta_mem = 0
    max_memory = 0
    memory_usage_refresh = .005  # Seconds
    while True:
        time.sleep(memory_usage_refresh)
        delta_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss - start_mem
        if delta_mem > max_memory:
            max_memory = delta_mem
        # Uncomment this line to see the memory usage during run-time
        # print("Memory Usage During Call: %d MB" % (delta_mem / 1000000.0))
        # Check to see if the library call is complete
        if mythread.isShutdown():
            print(mythread.results)
            break
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS;
    # adjust the divisors here and above to match your platform.
    print("\nMAX Memory Usage in MB: " + str(round(max_memory / 1000.0, 3)))
Stoppable Thread - stoppable_thread.py
import threading
import time


class StoppableThread(threading.Thread):
    def __init__(self):
        super(StoppableThread, self).__init__()
        self.daemon = True
        self.__monitor = threading.Event()
        self.__monitor.set()
        self.__has_shutdown = False

    def run(self):
        '''Overloads threading.Thread.run'''
        # Call the user's startup function
        self.startup()
        # Loop until the thread is stopped
        while self.isRunning():
            self.mainloop()
        # Clean up
        self.cleanup()
        # Flag to the outside world that the thread has exited
        # AND that the cleanup is complete
        self.__has_shutdown = True

    def stop(self):
        self.__monitor.clear()

    def isRunning(self):
        return self.__monitor.is_set()

    def isShutdown(self):
        return self.__has_shutdown

    ###############################
    ### User Defined Functions ####
    ###############################
    def mainloop(self):
        '''
        Expected to be overridden in a subclass!!
        Note that the stoppable while loop is handled in the built-in "run".
        '''
        pass

    def startup(self):
        '''Expected to be overridden in a subclass!!'''
        pass

    def cleanup(self):
        '''Expected to be overridden in a subclass!!'''
        pass
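Because ru_maxrss is already a high-water mark for the whole process, a single reading before and after the call may be enough when the polling thread is not needed. The following is a minimal sketch under that assumption, not the method used in the files above: peak_rss_bytes, measure_peak_growth and build_large_list are hypothetical names, and the unit handling assumes bytes on macOS and kilobytes on other Unix systems. The resource module is not available on Windows.

```python
import resource
import sys


def peak_rss_bytes():
    """Return this process's peak resident set size, normalized to bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Assumption: ru_maxrss is in bytes on macOS and in kilobytes elsewhere.
    return peak if sys.platform == "darwin" else peak * 1024


def measure_peak_growth(func, *args, **kwargs):
    """Run func and return (result, growth of the peak RSS in bytes).

    The peak never decreases, so a growth of 0 means the call stayed
    below a peak that the process had already reached earlier.
    """
    before = peak_rss_bytes()
    result = func(*args, **kwargs)
    return result, peak_rss_bytes() - before


def build_large_list(n):
    # Hypothetical workload used only for this illustration.
    return len([list(range(100)) for _ in range(n)])


if __name__ == "__main__":
    result, growth = measure_peak_growth(build_large_list, 50000)
    print("result:", result)
    print("peak RSS growth: %.1f MiB" % (growth / 1024.0 ** 2))
```

The trade-off is that this reports only how far the call pushed the process-wide peak; it cannot show usage over time the way the polling loop can.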
Profiling memory usage in python generators
The following snippet should work in Python 3. If you replace the generator function with a list, you will see a large difference in memory allocation.
from memory_profiler import memory_usage

print(f'Memory usage before: {memory_usage()} MiB')

def obj_generator(num_objects):
    for i in range(num_objects):
        new_obj = {
            'id': i,
        }
        yield new_obj

objects = obj_generator(1_000_000)
print(f'Memory usage after: {memory_usage()} MiB')
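To make the generator-versus-list difference visible without any third-party package, the comparison can also be sketched with the standard library's tracemalloc module. This swaps tracemalloc in for memory_profiler, so it counts Python-level allocations rather than process memory; traced_peak, consume_lazily and materialize are hypothetical helper names.

```python
import tracemalloc


def obj_generator(num_objects):
    for i in range(num_objects):
        yield {'id': i}


def traced_peak(func, *args):
    """Return the peak number of bytes tracemalloc recorded while func ran."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def consume_lazily(num_objects):
    # Each dict becomes garbage as soon as the next one is produced.
    return sum(1 for _ in obj_generator(num_objects))


def materialize(num_objects):
    # All dicts are alive at the same time inside the list.
    return len(list(obj_generator(num_objects)))


if __name__ == "__main__":
    n = 100000
    print("generator peak: %d bytes" % traced_peak(consume_lazily, n))
    print("list peak:      %d bytes" % traced_peak(materialize, n))
```

Note that the generator is actually consumed here; merely creating it, as in the snippet above, allocates almost nothing.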