How to Limit Memory Usage Within a Python Process

Limit RAM usage of a Python program

I've done some research and found a function to get the memory on Linux systems here: Determine free RAM in Python. I modified it a bit to return just the free memory available, and I set the maximum memory available to half of that.

Code:

import resource
import sys

def memory_limit():
    """Limit the process's address space to half of the available memory."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # /proc/meminfo reports kB, so multiply by 1024 to get bytes, then halve.
    # setrlimit requires integer values, hence // rather than /.
    resource.setrlimit(resource.RLIMIT_AS, (get_memory() * 1024 // 2, hard))

def get_memory():
    """Return the free memory (in kB) reported by /proc/meminfo."""
    free_memory = 0
    with open('/proc/meminfo', 'r') as mem:
        for line in mem:
            sline = line.split()
            if sline[0] in ('MemFree:', 'Buffers:', 'Cached:'):
                free_memory += int(sline[1])
    return free_memory

if __name__ == '__main__':
    memory_limit()  # Limits maximum memory usage to half of available memory
    try:
        main()  # main() is assumed to be defined elsewhere in the script
    except MemoryError:
        sys.stderr.write('\n\nERROR: Memory Exception\n')
        sys.exit(1)

The line that sets the limit to half is resource.setrlimit(resource.RLIMIT_AS, (get_memory() * 1024 // 2, hard)), where get_memory() * 1024 // 2 converts the kB figure from /proc/meminfo to bytes and halves it (setrlimit expects the limit in bytes).
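A quick way to confirm the limit actually bites is to cap the address space, attempt an oversized allocation, and then restore the original limits. This is a minimal sketch assuming a 64-bit Linux system; the helper name allocation_blocked is mine, not part of the original script:

```python
import resource

def allocation_blocked(cap_bytes, try_bytes):
    """Temporarily cap RLIMIT_AS, attempt an allocation, then restore."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # Never raise the soft limit above an existing finite hard limit.
    cap = cap_bytes if hard == resource.RLIM_INFINITY else min(cap_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (cap, hard))
    try:
        bytearray(try_bytes)
        return False           # allocation succeeded despite the cap
    except MemoryError:
        return True            # allocation was refused, as expected
    finally:
        # Restoring the previous soft limit is allowed (it stays <= hard).
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

print(allocation_blocked(1 << 30, 2 << 30))  # cap at 1 GiB, try to allocate 2 GiB
```

Because only the soft limit is lowered (and then restored), the rest of the process continues unaffected after the check.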

How can I limit memory usage for a Python script via command line?

You can use ulimit on Linux systems. (Within Python, there's also resource.setrlimit() to limit the current process.)

Something like this (sorry, my Bash is rusty) should be a decent enough wrapper:

#!/bin/bash
# Note: ulimit -m (resident set size) is a no-op on modern Linux kernels;
# use -v, which limits virtual memory (the address space), in kilobytes.
ulimit -v 262144  # 256 MiB
exec python3 "$@"

Then run e.g. that-wrapper.sh student-script.py.

(That said, are you sure you can trust your students not to submit something that uploads your secret SSH keys and/or trashes your file system? I'd suggest a stronger sandbox such as running everything in a Docker container.)

Running a Python process with limited memory

Using setrlimit to set maximum memory size

As per @Alexis Drakopoulos's answer, the resource module can be used to set the maximum amount of virtual memory used by a Python script, with the caveat that this approach only works on Linux-based systems, and does not work on BSD-based systems like Mac OS X.

To modify the limit, add the following call to setrlimit in your Python script:

resource.setrlimit(resource.RLIMIT_AS, (soft_lim, hard_lim))

(the soft and hard limits are in bytes, and their values are usually set equal).

Quick example

Here is a quick example that limits the address space to 1000 bytes, then fails to import pandas due to a memory error:

$ python
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import resource

>>> print(resource.getrlimit(resource.RLIMIT_AS))
(-1, -1)

>>> resource.setrlimit(resource.RLIMIT_AS, (1000,1000))

>>> import pandas as pd
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/vagrant/.local/lib/python3.6/site-packages/pandas/__init__.py", line 11, in <module>
File "/home/vagrant/.local/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
File "/home/vagrant/.local/lib/python3.6/site-packages/numpy/core/__init__.py", line 24, in <module>
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 674, in exec_module
File "<frozen importlib._bootstrap_external>", line 779, in get_code
File "<frozen importlib._bootstrap_external>", line 487, in _compile_bytecode
MemoryError
MemoryError

Why doesn't this work on BSD systems?

If you check the man page for getrlimit or setrlimit, you'll see a list of RLIMIT_* variables - but the list is different between BSD and Linux. The Linux getrlimit/setrlimit man page lists RLIMIT_AS, but the BSD getrlimit/setrlimit man page does not list any RLIMIT variable for controlling the amount of memory. So, even though resource.RLIMIT_AS is defined in the resource module on Mac OS X, setting it has no effect on the kernel or on the amount of memory available to the process.

Also see What do the two numbers returned by Python's resource.RLIMIT_VMEM (or resource.RLIMIT_AS) mean?

How to limit memory usage within a Python process

resource.RLIMIT_VMEM is the resource corresponding to ulimit -v.

RLIMIT_DATA only affects brk/sbrk system calls while newer memory managers tend to use mmap instead.

Note also that ulimit/setrlimit affects only the current process and its future children.
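Because the limit is inherited across fork/exec, you can confine only a child process and leave the parent unlimited. A sketch assuming Linux and CPython, using subprocess's preexec_fn hook (which runs in the child between fork and exec); the helper names are mine:

```python
import resource
import subprocess
import sys

def set_child_limit():
    # Runs in the child before exec: cap its address space at 512 MiB
    # (clamped to any pre-existing finite hard limit).
    limit = 512 * 1024 * 1024
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

def run_limited(code):
    """Run a Python snippet in a memory-limited child; return its exit code."""
    return subprocess.run([sys.executable, '-c', code],
                          preexec_fn=set_child_limit).returncode

# Trying to allocate 1 GiB in the child raises MemoryError there, so the
# child exits nonzero; the parent process itself remains unlimited.
print(run_limited('bytearray(1 << 30)') != 0)
```

For production use, passing the limit via a small wrapper script (or using preexec_fn only in single-threaded parents, as the subprocess docs caution) is safer than relying on preexec_fn in threaded programs.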

Regarding the AttributeError: 'module' object has no attribute 'RLIMIT_VMEM' message: the resource module docs mention this possibility:

This module does not attempt to mask platform differences — symbols
not defined for a platform will not be available from this module on
that platform.

According to the bash ulimit source linked to above, it uses RLIMIT_AS if RLIMIT_VMEM is not defined.
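That fallback is easy to mirror in Python. A sketch: RLIMIT_VMEM exists on some platforms (e.g. Solaris), while Linux exposes RLIMIT_AS, and getattr picks whichever is available:

```python
import resource

# Mirror bash's fallback: use RLIMIT_VMEM where defined, else RLIMIT_AS.
RLIMIT_MEM = getattr(resource, 'RLIMIT_VMEM', resource.RLIMIT_AS)

soft, hard = resource.getrlimit(RLIMIT_MEM)
print(soft, hard)  # resource.RLIM_INFINITY (-1) means "unlimited"
```

This avoids the AttributeError entirely while still preferring the platform's native name for the limit.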

Limit Python script RAM usage on Windows

A Job object supports limiting the committed memory of a process. In Python, we can implement this via PyWin32 or ctypes.

Note that prior to Windows 8 a process can only be in a single Job. A couple of common cases where this is a concern include the py.exe launcher (the default association for .py files), which runs python.exe in a Job, and the Task Scheduler service, which runs each task in a Job.

PyWin32 Example

import sys
import warnings

import winerror
import win32api
import win32job

g_hjob = None

def create_job(job_name='', breakaway='silent'):
    hjob = win32job.CreateJobObject(None, job_name)
    if breakaway:
        info = win32job.QueryInformationJobObject(hjob,
                    win32job.JobObjectExtendedLimitInformation)
        if breakaway == 'silent':
            info['BasicLimitInformation']['LimitFlags'] |= (
                win32job.JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK)
        else:
            info['BasicLimitInformation']['LimitFlags'] |= (
                win32job.JOB_OBJECT_LIMIT_BREAKAWAY_OK)
        win32job.SetInformationJobObject(hjob,
            win32job.JobObjectExtendedLimitInformation, info)
    return hjob

def assign_job(hjob):
    global g_hjob
    hprocess = win32api.GetCurrentProcess()
    try:
        win32job.AssignProcessToJobObject(hjob, hprocess)
        g_hjob = hjob
    except win32job.error as e:
        if (e.winerror != winerror.ERROR_ACCESS_DENIED or
                sys.getwindowsversion() >= (6, 2) or
                not win32job.IsProcessInJob(hprocess, None)):
            raise
        warnings.warn('The process is already in a job. Nested jobs are not '
                      'supported prior to Windows 8.')

def limit_memory(memory_limit):
    if g_hjob is None:
        return
    info = win32job.QueryInformationJobObject(g_hjob,
                win32job.JobObjectExtendedLimitInformation)
    info['ProcessMemoryLimit'] = memory_limit
    info['BasicLimitInformation']['LimitFlags'] |= (
        win32job.JOB_OBJECT_LIMIT_PROCESS_MEMORY)
    win32job.SetInformationJobObject(g_hjob,
        win32job.JobObjectExtendedLimitInformation, info)

def main():
    assign_job(create_job())
    memory_limit = 100 * 1024 * 1024  # 100 MiB
    limit_memory(memory_limit)
    try:
        bytearray(memory_limit)
    except MemoryError:
        print('Success: available memory is limited.')
    else:
        print('Failure: available memory is not limited.')
    return 0

if __name__ == '__main__':
    sys.exit(main())

