How to Profile Memory Usage in Python

How do I profile memory usage in Python?

This one has been answered already here: Python memory profiler

Basically you do something like this (example cited from Guppy-PE):

>>> from guppy import hpy; h=hpy()
>>> h.heap()
Partition of a set of 48477 objects. Total size = 3265516 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25773  53  1612820  49    1612820  49 str
     1  11699  24   483960  15    2096780  64 tuple
     2    174   0   241584   7    2338364  72 dict of module
     3   3478   7   222592   7    2560956  78 types.CodeType
     4   3296   7   184576   6    2745532  84 function
     5    401   1   175112   5    2920644  89 dict of class
     6    108   0    81888   3    3002532  92 dict (no owner)
     7    114   0    79632   2    3082164  94 dict of type
     8    117   0    51336   2    3133500  96 type
     9    667   1    24012   1    3157512  97 __builtin__.wrapper_descriptor
<76 more rows. Type e.g. '_.more' to view.>
>>> h.iso(1,[],{})
Partition of a set of 3 objects. Total size = 176 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  33      136  77        136  77 dict (no owner)
     1      1  33       28  16        164  93 list
     2      1  33       12   7        176 100 int
>>> x=[]
>>> h.iso(x).sp
0: h.Root.i0_modules['__main__'].__dict__['x']
>>>

Which Python memory profiler is recommended?

guppy3 is quite simple to use. At some point in your code, you have to write the following:

from guppy import hpy
h = hpy()
print(h.heap())

This gives you some output like this:

Partition of a set of 132527 objects. Total size = 8301532 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  35144  27  2140412  26    2140412  26 str
     1  38397  29  1309020  16    3449432  42 tuple
     2    530   0   739856   9    4189288  50 dict (no owner)

You can also find out where objects are referenced from and get statistics about that, although the documentation on this part is a bit sparse.
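For example, here is a minimal sketch of that: the .sp attribute is the same one used in the session above, while .byrcs (partition by referrer classification) is taken from the guppy docs, so treat the exact attribute names as version dependent.

from guppy import hpy

h = hpy()
x = []

# Partition the whole heap by the kind of object that refers to each object
print(h.heap().byrcs)

# Show the shortest reference path from a root to x (as in the session above)
print(h.iso(x).sp)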

There is a graphical browser as well, written in Tk.

For Python 2.x, use Heapy.

A module to profile peak memory usage of Python code

An easy way to use memory_usage to get the peak / maximum memory used by a block of code is to first put that code in a function, and then pass that function (without the () call) to memory_usage() as the proc argument:

from memory_profiler import memory_usage

def myfunc():
    # the code you want to measure
    return

mem = max(memory_usage(proc=myfunc))

print("Maximum memory used: {} MiB".format(mem))

Other arguments allow you to collect timestamps, return values, pass arguments to myfunc, etc. The docstring seems to be the only complete source for documentation on this: https://github.com/fabianp/memory_profiler/blob/master/memory_profiler.py

https://github.com/fabianp/memory_profiler/blob/4089e3ed4d5c4197925a2df8393d4cbfca745ae5/memory_profiler.py#L244
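For example, here is a sketch based on that docstring (the exact keyword names such as retval and interval may vary slightly between memory_profiler versions, and myfunc is just a throwaway example):

from memory_profiler import memory_usage

def myfunc(n, repeat=1):
    # allocate something measurable so the numbers are visible
    data = [list(range(n)) for _ in range(repeat)]
    return len(data)

# proc can also be a (callable, args, kwargs) tuple; retval=True captures the return value
usage, result = memory_usage(proc=(myfunc, (100_000,), {'repeat': 3}),
                             interval=0.1, retval=True)
print("Peak memory: {:.1f} MiB, return value: {}".format(max(usage), result))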

Total memory used by Python process?

Here is a useful solution that works for various operating systems, including Linux, Windows, etc.:

import os, psutil
process = psutil.Process(os.getpid())
print(process.memory_info().rss) # in bytes

Notes:

  • do pip install psutil if it is not installed yet

  • handy one-liner if you quickly want to know how many MB your process takes:

    import os, psutil; print(psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2)
  • with Python 2.7 and psutil 5.6.3, it was process.memory_info()[0] instead (there was a change in the API later).
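Putting those pieces together, here is a small sketch (the rss_mib helper and the throwaway allocation are only illustrations) that shows the RSS delta around an allocation:

import os
import psutil

def rss_mib():
    # resident set size of the current process, in MiB
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2

before = rss_mib()
data = [0] * 10_000_000       # hypothetical allocation, just to produce a visible delta
print("before: {:.1f} MiB, after: {:.1f} MiB".format(before, rss_mib()))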

Tracking *maximum* memory usage by a Python function

This question seemed rather interesting, and it gave me a reason to look into Guppy / Heapy; thank you for that.

I tried for about two hours to get Heapy to monitor a function call / process without modifying its source, with no luck.

I did find a way to accomplish your task using the built-in Python library resource. Note that the documentation does not indicate what unit the ru_maxrss value is returned in. Another SO user noted that it was in kB. Running Mac OS X 10.7.3 and watching my system resources climb during the test code below, I believe the returned values to be in bytes, not kilobytes.
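For reference, a quick sketch of how that unit difference plays out (the getrusage man pages document ru_maxrss as kilobytes on Linux and bytes on macOS; the MiB conversion below is just illustrative):

import resource
import sys

peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
# ru_maxrss is kilobytes on Linux, bytes on macOS
peak_mib = peak / (1024 ** 2) if sys.platform == 'darwin' else peak / 1024
print("Peak RSS so far: %.1f MiB" % peak_mib)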

At a high level, I used the resource library to monitor the library call by launching the function in a separate (monitorable) thread and tracking that process's system resources from the main thread. Below are the two files you would need to run to test it out.

Library Resource Monitor - whatever_you_want.py

import resource
import time

from stoppable_thread import StoppableThread


class MyLibrarySniffingClass(StoppableThread):
    def __init__(self, target_lib_call, arg1, arg2):
        super().__init__()
        self.target_function = target_lib_call
        self.arg1 = arg1
        self.arg2 = arg2
        self.results = None

    def startup(self):
        # Overload the startup function
        print("Calling the Target Library Function...")

    def cleanup(self):
        # Overload the cleanup function
        print("Library Call Complete")

    def mainloop(self):
        # Start the library call
        self.results = self.target_function(self.arg1, self.arg2)

        # Kill the thread when complete
        self.stop()


def SomeLongRunningLibraryCall(arg1, arg2):
    max_dict_entries = 2500
    delay_per_entry = .005

    some_large_dictionary = {}
    dict_entry_count = 0

    while True:
        time.sleep(delay_per_entry)
        dict_entry_count += 1
        # list(...) so the entry actually occupies memory (range is lazy in Python 3)
        some_large_dictionary[dict_entry_count] = list(range(10000))

        if len(some_large_dictionary) > max_dict_entries:
            break

    print(arg1 + " " + arg2)
    return "Good Bye World"


if __name__ == "__main__":
    # Lib testing code
    mythread = MyLibrarySniffingClass(SomeLongRunningLibraryCall, "Hello", "World")
    mythread.start()

    start_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    delta_mem = 0
    max_memory = 0
    memory_usage_refresh = .005  # seconds

    while True:
        time.sleep(memory_usage_refresh)
        delta_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss - start_mem
        if delta_mem > max_memory:
            max_memory = delta_mem

        # Uncomment this line to see the memory usage during run-time
        # (note: ru_maxrss is in kB on Linux and bytes on macOS)
        # print("Memory Usage During Call: %d MB" % (delta_mem / 1000000.0))

        # Check to see if the library call is complete
        if mythread.isShutdown():
            print(mythread.results)
            break

    print("\nMAX Memory Usage in MB: " + str(round(max_memory / 1000.0, 3)))

Stoppable Thread - stoppable_thread.py

import threading


class StoppableThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.daemon = True
        self.__monitor = threading.Event()
        self.__monitor.set()
        self.__has_shutdown = False

    def run(self):
        '''Overloads threading.Thread.run'''
        # Call the user's startup function
        self.startup()

        # Loop until the thread is stopped
        while self.isRunning():
            self.mainloop()

        # Clean up
        self.cleanup()

        # Flag to the outside world that the thread has exited
        # AND that the cleanup is complete
        self.__has_shutdown = True

    def stop(self):
        self.__monitor.clear()

    def isRunning(self):
        return self.__monitor.is_set()

    def isShutdown(self):
        return self.__has_shutdown

    ###############################
    ### User Defined Functions ####
    ###############################

    def mainloop(self):
        '''
        Expected to be overridden in a subclass!!
        Note that the stoppable while-loop is handled in the built-in "run".
        '''
        pass

    def startup(self):
        '''Expected to be overridden in a subclass!!'''
        pass

    def cleanup(self):
        '''Expected to be overridden in a subclass!!'''
        pass

Profiling memory usage in Python generators

The following snippet should work in Python 3. If you replace the generator function with a list, you will see a large difference in memory allocation (see the comparison sketch after the snippet).

from memory_profiler import memory_usage

def obj_generator(num_objects):
    for i in range(num_objects):
        new_obj = {
            'id': i,
        }
        yield new_obj

print(f'Memory usage before: {memory_usage()[0]:.1f} MiB')

objects = obj_generator(1_000_000)

print(f'Memory usage after: {memory_usage()[0]:.1f} MiB')
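For comparison, here is a sketch of a list-based version (the obj_list name is made up for illustration); because every dict is materialized at once, the second reading should be much higher than with the generator:

from memory_profiler import memory_usage

def obj_list(num_objects):
    # builds every dict up front instead of yielding them lazily
    return [{'id': i} for i in range(num_objects)]

print(f'Memory usage before: {memory_usage()[0]:.1f} MiB')

objects = obj_list(1_000_000)

print(f'Memory usage after: {memory_usage()[0]:.1f} MiB')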

