How to Diagnose a Python Process Chewing CPU in Linux

Python: How to debug high CPU consumption on Linux?

Answering my own question in the hope others might find it useful in the future.

I was able to use py-spy's top command on my program as it was running inside the container, and its output showed me that the hupper library was the cause of the high CPU usage.

Here are the steps that I've taken:

  1. I had to spin up a new container with the SYS_PTRACE capability otherwise py-spy can't run inside it:
    docker run -it --cap-add SYS_PTRACE python:3.9.13-buster /bin/bash
  2. I cloned my app into the container, built it, and ran it.
  3. I ran top to see what the process id is for my program (identifiable by its high CPU usage).
  4. I ran py-spy top --pid <process id> to see a live table showing me the most active parts of my code.

Similar to this:

(python v3.9.13)
Total Samples 78500
GIL: 15.00%, Active: 103.00%, Threads: 4

 %Own   %Total   OwnTime  TotalTime  Function (filename)
92.00%   92.00%   717.0s    717.0s   get_mtime (hupper/polling.py)
 3.00%    3.00%   41.27s    67.08s   walk (dash/_watch.py)
 7.00%   99.00%   35.86s    752.9s   check_reload (hupper/polling.py)
 1.00%  100.00%   20.21s    773.1s   run (hupper/polling.py)
 0.00%    0.00%   17.08s    18.78s   _walk (os.py)
 0.00%    0.00%    4.24s     4.24s   <listcomp> (dash/_watch.py)
 0.00%    0.00%    2.49s     2.89s   join (posixpath.py)
 0.00%    0.00%    1.60s     1.60s   islink (posixpath.py)
 0.00%    0.00%   0.400s    0.400s   _get_sep (posixpath.py)
 0.00%    3.00%   0.210s    67.29s   watch (dash/_watch.py)
 0.00%    0.00%   0.030s    0.030s   _run_worker (hupper/reloader.py)
 0.00%    0.00%   0.020s    0.020s   _recv_packet (hupper/ipc.py)
 0.00%    3.00%   0.000s    67.31s   run (threading.py)
 0.00%    0.00%   0.000s    0.030s   start_reloader (hupper/reloader.py)
 0.00%    0.00%   0.000s    0.030s   <module> (<string>)
 0.00%    3.00%   0.000s    67.29s   <lambda> (dash/dash.py)
 0.00%    0.00%   0.000s    0.030s   serve (myapp/cli/main.py)
 0.00%  103.00%   0.000s    840.4s   _bootstrap (threading.py)
 0.00%    0.00%   0.000s    0.030s   serve_dashboard (myapp/dashboard/__init__.py)
 0.00%    0.00%   0.000s    0.030s   wrapper (myapp/cli/main.py)
 0.00%    0.00%   0.000s    0.020s   _read_loop (hupper/ipc.py)
 0.00%    0.00%   0.000s    0.030s   main (myapp/cli/main.py)
 0.00%    0.00%   0.000s    0.030s   run (hupper/reloader.py)

Python code for finding the top 5 processes using CPU

psutil is the one I would recommend. From the PyPI site, here is the package description:

psutil (process and system utilities) is a cross-platform library for
retrieving information on running processes and system utilization
(CPU, memory, disks, network, sensors) in Python. It is useful
mainly for system monitoring, profiling and limiting process resources
and management of running processes. It implements many
functionalities offered by UNIX command line tools such as: ps, top,
lsof, netstat, ifconfig, who, df, kill, free, nice, ionice, iostat,
iotop, uptime, pidof, tty, taskset, pmap.

To address your problem statement, here is a sample I took from this site, which lists processes sorted by memory usage: https://thispointer.com/python-get-list-of-all-running-processes-and-sort-by-highest-memory-usage/

import psutil

def getListOfProcessSortedByMemory():
    '''
    Get list of running processes sorted by memory usage
    '''
    listOfProcObjects = []
    # Iterate over all running processes
    for proc in psutil.process_iter():
        try:
            # Fetch process details as a dict
            pinfo = proc.as_dict(attrs=['pid', 'name', 'username'])
            pinfo['vms'] = proc.memory_info().vms / (1024 * 1024)
            # Append dict to list
            listOfProcObjects.append(pinfo)
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass

    # Sort the list of dicts by the key 'vms', i.e. memory usage
    listOfProcObjects = sorted(listOfProcObjects, key=lambda procObj: procObj['vms'], reverse=True)

    return listOfProcObjects

The complete code is available at the site above; you may need to tweak it to sort by CPU usage instead. Hope this helps.
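Adapting that sample to the actual question (top 5 by CPU) could look like the sketch below. The helper name top_processes_by_cpu is my own invention; the two-call pattern is how psutil's per-process cpu_percent works (the first call for a process always returns 0.0, so we prime the counters, wait a sample window, then read them):

```python
import time
import psutil

def top_processes_by_cpu(n=5):
    """Return the n processes currently using the most CPU as (cpu%, pid, name)."""
    procs = list(psutil.process_iter(['pid', 'name']))
    # Prime psutil's per-process CPU counters; the first cpu_percent()
    # call for any process always returns 0.0.
    for p in procs:
        try:
            p.cpu_percent(interval=None)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    time.sleep(1)  # sample window over which CPU usage is measured
    usage = []
    for p in procs:
        try:
            usage.append((p.cpu_percent(interval=None), p.pid, p.name()))
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    return sorted(usage, reverse=True)[:n]

for cpu, pid, name in top_processes_by_cpu():
    print("%6.1f%%  %7d  %s" % (cpu, pid, name))
```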

High Kernel CPU when running multiple python programs

If the problem exists in the kernel, you should narrow it down using a profiler such as OProfile or perf.

I.e. run perf record -a -g and then read the profiling data saved into perf.data using perf report. See also: linux perf: how to interpret and find hotspots.


In your case the high CPU usage is caused by contention for /dev/urandom -- it allows only one thread to read from it at a time, but multiple Python processes are doing so.

Python's random module uses it only for initialization. I.e:

$ strace python -c 'import random;
while True:
random.random()'
open("/dev/urandom", O_RDONLY) = 4
read(4, "\16\36\366\36}"..., 2500) = 2500
close(4) <--- /dev/urandom is closed

You may also explicitly ask for /dev/urandom by using os.urandom or the SystemRandom class, so check the parts of your code that deal with random numbers.
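To make the distinction concrete, here is a small sketch (not from the original answer) contrasting the default random module, which touches /dev/urandom only when it is seeded, with os.urandom and SystemRandom, which read from it on every call:

```python
import os
import random

# Seeded once from /dev/urandom at import time; subsequent calls are
# pure userspace (Mersenne Twister) and never touch the device again.
print(random.random())

# These, by contrast, read /dev/urandom on every call:
print(os.urandom(8).hex())             # raw bytes straight from the kernel
print(random.SystemRandom().random())  # uses os.urandom under the hood
```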

Retrieve CPU usage and memory usage of a single process on Linux?

ps -p <pid> -o %cpu,%mem,cmd

(You can leave off "cmd", but it can be helpful when debugging.)

Note that this gives average CPU usage of the process over the time it has been running.
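If you want current rather than lifetime-average usage from Python, a psutil equivalent might look like this sketch (snapshot is a hypothetical helper name; cpu_percent with an interval measures over that window instead of averaging since process start):

```python
import os
import psutil

def snapshot(pid):
    """Current CPU%% and memory%% for one process, sampled over 1 second."""
    p = psutil.Process(pid)
    cpu = p.cpu_percent(interval=1)  # measured over the interval, not since start
    mem = p.memory_percent()
    return cpu, mem

cpu, mem = snapshot(os.getpid())
print("cpu=%.1f%% mem=%.2f%%" % (cpu, mem))
```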

How to get current CPU and RAM usage in Python?

The psutil library gives you information about CPU, RAM, etc., on a variety of platforms:

psutil is a module providing an interface for retrieving information on running processes and system utilization (CPU, memory) in a portable way by using Python, implementing many functionalities offered by tools like ps, top and Windows task manager.

It currently supports Linux, Windows, OSX, Sun Solaris, FreeBSD, OpenBSD and NetBSD, both 32-bit and 64-bit architectures, with Python versions from 2.6 to 3.5 (users of Python 2.4 and 2.5 may use 2.1.3 version).


Some examples:

#!/usr/bin/env python
import psutil

# gives a single float value
psutil.cpu_percent()
# gives an object with many fields
psutil.virtual_memory()
# you can convert that object to a dictionary
dict(psutil.virtual_memory()._asdict())
# you can get the percentage of used RAM
psutil.virtual_memory().percent
# 79.2
# you can calculate the percentage of available memory
psutil.virtual_memory().available * 100 / psutil.virtual_memory().total
# 20.8

The documentation covers many more concepts:

  • https://psutil.readthedocs.io/en/latest/

How to monitor usage of CPU when function is called in Python psutil?

You can use threading either to run iHateThis or to run a function that calls cpu_percent(). I chose the second version: running cpu_percent() in a thread.

Because the thread uses while True, it would otherwise run forever with no clean way to stop it, so I use a global variable running with while running to give the loop a way to stop.

import threading
import psutil

def display_cpu():
    global running

    running = True

    currentProcess = psutil.Process()

    # start loop
    while running:
        print(currentProcess.cpu_percent(interval=1))

def start():
    global t

    # create thread and start it
    t = threading.Thread(target=display_cpu)
    t.start()

def stop():
    global running
    global t

    # use `running` to stop the loop in the thread so the thread will end
    running = False

    # wait for the thread's end
    t.join()

Now I can use these to start and stop the thread that displays CPU usage. Because I may stop the process with Ctrl+C, which raises an exception, I use try/finally to stop the thread even if there is an error.

def i_hate_this():
    tab = []
    for i in range(1000000):
        tab.append(random.randint(1, 10000))
    tab.sort()
    return tab

# ---

start()
try:
    result = i_hate_this()
finally:  # stop thread even if I press Ctrl+C
    stop()

Full code:

import random
import threading
import psutil

def display_cpu():
    global running

    running = True

    currentProcess = psutil.Process()

    # start loop
    while running:
        print(currentProcess.cpu_percent(interval=1))

def start():
    global t

    # create thread and start it
    t = threading.Thread(target=display_cpu)
    t.start()

def stop():
    global running
    global t

    # use `running` to stop the loop in the thread so the thread will end
    running = False

    # wait for the thread's end
    t.join()

# ---

def i_hate_this():
    tab = []
    for i in range(1000000):
        tab.append(random.randint(1, 10000))
    tab.sort()
    return tab

# ---

start()
try:
    result = i_hate_this()
finally:  # stop thread even if I press Ctrl+C
    stop()

BTW: this can be converted to a class that inherits from threading.Thread, which can then hide the running variable inside the class.

import psutil
import random
import threading

class DisplayCPU(threading.Thread):

    def run(self):
        self.running = True

        currentProcess = psutil.Process()

        while self.running:
            print(currentProcess.cpu_percent(interval=1))

    def stop(self):
        self.running = False

# ----

def i_hate_this():
    tab = []
    for i in range(1000000):
        tab.append(random.randint(1, 10000))
    tab.sort()
    return tab

# ---

display_cpu = DisplayCPU()

display_cpu.start()
try:
    result = i_hate_this()
finally:  # stop thread even when I press Ctrl+C
    display_cpu.stop()

It could also be converted to a context manager, to be used as

with display_cpu():
    i_hate_this()

but I skip this part.
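For completeness, the skipped context-manager version could look roughly like this sketch. It reuses the stop-flag idea from the answer, with a threading.Event instead of a global variable:

```python
from contextlib import contextmanager
import threading
import psutil

@contextmanager
def display_cpu(interval=1):
    stop_event = threading.Event()
    current_process = psutil.Process()

    def loop():
        # print CPU usage until the context block exits
        while not stop_event.is_set():
            print(current_process.cpu_percent(interval=interval))

    t = threading.Thread(target=loop)
    t.start()
    try:
        yield
    finally:
        stop_event.set()  # stop the loop even on Ctrl+C / exceptions
        t.join()
```

Used as `with display_cpu(): i_hate_this()`, the finally clause guarantees the monitoring thread is always joined.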

How to get CPU usage in python 2.7 without using PSUtil

What psutil, htop, mpstat and the like do is read the line starting with "cpu" (actually the first line) from /proc/stat, and then calculate a percentage from the values on that line. You can find the meaning of those values in man 5 proc (search for "/proc/stat").

That's also what the grep cpu /proc/stat | awk ... command you mentioned does. But the values in /proc/stat represent the times spent since last boot! Maybe they wrap around after a while, I'm not sure, but the point is that these are numbers measured over a really long time.

So if you run that command, and run it again a few seconds (or minutes, or even hours) later, they won't have changed much. That's why you always saw it return 5.12.

Programs like top remember the previous values and subtract them from the newly read values. From the result a 'live' percentage can be calculated that actually reflect recent CPU load.

To do something like that in Python as simply as possible, but without running external commands to read /proc/stat and do the calculations for us, we can store the values we've read into a file. On the next run we read them back and subtract them from the new values.

#!/usr/bin/env python2.7

# Read the first line from /proc/stat. It should start with "cpu"
# and contain the times spent in various modes by all CPUs totalled.
#
with open("/proc/stat") as procfile:
    cpustats = procfile.readline().split()

# Sanity check
#
if cpustats[0] != 'cpu':
    raise ValueError("First line of /proc/stat not recognised")

#
# Refer to "man 5 proc" (search for /proc/stat) for information
# about which field means what.
#
# Here we do the calculation as simply as possible:
# CPU% = 100 * time_doing_things / (time_doing_things + time_doing_nothing)
#

user_time = int(cpustats[1])    # time spent in user space
nice_time = int(cpustats[2])    # 'nice' time spent in user space
system_time = int(cpustats[3])  # time spent in kernel space

idle_time = int(cpustats[4])    # time spent idly
iowait_time = int(cpustats[5])  # time spent waiting is also doing nothing

time_doing_things = user_time + nice_time + system_time
time_doing_nothing = idle_time + iowait_time

# The times read from /proc/stat are total times, i.e. *all* time spent
# doing things and doing nothing since last boot.
#
# So to calculate a meaningful CPU % we need to know how much these values
# have *changed*. So we store them in a file which we read back the next
# time the script is run.
#
previous_values_file = "/tmp/prev.cpu"
prev_time_doing_things = 0
prev_time_doing_nothing = 0

try:
    with open(previous_values_file) as prev_file:
        prev1, prev2 = prev_file.readline().split()
        prev_time_doing_things = int(prev1)
        prev_time_doing_nothing = int(prev2)
except IOError:  # the file does not exist on the first run; we don't care
    pass

# Write the new values to the file to use next run
#
with open(previous_values_file, 'w') as prev_file:
    prev_file.write("{} {}\n".format(time_doing_things, time_doing_nothing))

# Calculate the differences, i.e. how much the numbers have changed
#
diff_time_doing_things = time_doing_things - prev_time_doing_things
diff_time_doing_nothing = time_doing_nothing - prev_time_doing_nothing

# Calculate a percentage of change since last run:
#
cpu_percentage = 100.0 * diff_time_doing_things / (diff_time_doing_things + diff_time_doing_nothing)

# Finally, output the result
#
print "CPU", cpu_percentage, "%"

Here's a version that, not unlike top, prints CPU usage every second, remembering CPU times from previous measurement in variables instead of a file:

#!/usr/bin/env python2.7

import time

def get_cpu_times():
    # Read the first line from /proc/stat. It should start with "cpu"
    # and contain the times spent in various modes by all CPUs totalled.
    #
    with open("/proc/stat") as procfile:
        cpustats = procfile.readline().split()

    # Sanity check
    #
    if cpustats[0] != 'cpu':
        raise ValueError("First line of /proc/stat not recognised")

    # Refer to "man 5 proc" (search for /proc/stat) for information
    # about which field means what.
    #
    # Here we do the calculation as simply as possible:
    #
    # CPU% = 100 * time_doing_things / (time_doing_things + time_doing_nothing)
    #

    user_time = int(cpustats[1])    # time spent in user space
    nice_time = int(cpustats[2])    # 'nice' time spent in user space
    system_time = int(cpustats[3])  # time spent in kernel space

    idle_time = int(cpustats[4])    # time spent idly
    iowait_time = int(cpustats[5])  # time spent waiting is also doing nothing

    time_doing_things = user_time + nice_time + system_time
    time_doing_nothing = idle_time + iowait_time

    return time_doing_things, time_doing_nothing

def cpu_percentage_loop():
    prev_time_doing_things = 0
    prev_time_doing_nothing = 0
    while True:  # loop forever printing CPU usage percentage
        time_doing_things, time_doing_nothing = get_cpu_times()
        diff_time_doing_things = time_doing_things - prev_time_doing_things
        diff_time_doing_nothing = time_doing_nothing - prev_time_doing_nothing
        cpu_percentage = 100.0 * diff_time_doing_things / (diff_time_doing_things + diff_time_doing_nothing)

        # remember current values to subtract in the next iteration of the loop
        #
        prev_time_doing_things = time_doing_things
        prev_time_doing_nothing = time_doing_nothing

        # Output latest percentage
        #
        print "CPU", cpu_percentage, "%"

        # Loop delay
        #
        time.sleep(1)

if __name__ == "__main__":
    cpu_percentage_loop()

