Prevent R from Using Virtual Memory on Unix/Linux

Prevent R from using virtual memory on Unix/Linux?

When you run system("ulimit"), it executes in a child process; the parent does not inherit the ulimit from the child. (This is analogous to running system("cd dir") or system("export ENV_VAR=foo").)

Setting the limit in the shell from which you launch R is the correct way (see the sketch below). The limit is most likely not working in the parallel case because it is a per-process limit, not a global system limit.
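
For example, a minimal sketch (the 4 GiB figure is just an illustration, not a recommendation): set the limit in the launching shell or a wrapper script, and everything R forks inherits it.

# ulimit -v takes KiB; 4194304 KiB = 4 GiB of virtual address space.
# R and any parallel workers it forks inherit this limit.
ulimit -v 4194304
R --vanilla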

On Linux you can configure strict(er) overcommit accounting, which tries to prevent the kernel from granting mmap requests that cannot be backed by physical memory plus swap.

This is done by tuning the sysctl parameters vm.overcommit_memory and vm.overcommit_ratio (see the kernel's documentation on overcommit accounting for details).
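
As a sketch (the 80% ratio here is an arbitrary illustration, not a recommendation):

# Mode 2 enables strict accounting: the commit limit becomes
# swap + overcommit_ratio percent of physical RAM.
sudo sysctl vm.overcommit_memory=2
sudo sysctl vm.overcommit_ratio=80
# Add the same keys to /etc/sysctl.conf to persist across reboots.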

This can be an effective way to prevent thrashing situations. But the tradeoff is that you lose the benefit that overcommit provides when things are well-behaved (cramming more/larger processes into memory).

Limiting memory usage in R under Linux

There's unix::rlimit_as(), which allows setting memory limits for a running R process using the same mechanism that ulimit uses in the shell. Windows and macOS are not supported.

In my .Rprofile I have

unix::rlimit_as(1.2e10, 1.2e10)

to limit memory usage to ~12 GB (the arguments are the soft and hard limits, in bytes).

Before that...

I had created a small R package, ulimit, with similar functionality.

Install it from GitHub using

devtools::install_github("krlmlr/ulimit")

To limit the memory available to R to 2000 MiB, call:

ulimit::memory_limit(2000)

Now:

> rep(0L, 1e9)
Error: cannot allocate vector of size 3.7 Gb

Is it possible to read virtual memory on Unix/Linux? And on Windows?

For Windows, if you need to read memory from another process, you'll need to request PROCESS_VM_READ access when you get your handle to the process (ReadProcessMemory is the appropriate call). You can obtain such a handle to an existing process with OpenProcess, or, if you start the process yourself with CreateProcess, use the handle it returns.

Virtual Memory Usage from Java under Linux, too much memory used

This has been a long-standing complaint with Java, but it's largely meaningless, and usually based on looking at the wrong information. The usual phrasing is something like "Hello World on Java takes 10 megabytes! Why does it need that?" Well, here's a way to make Hello World on a 64-bit JVM claim to take over 4 gigabytes ... at least by one form of measurement.


java -Xms1024m -Xmx4096m com.example.Hello

Different Ways to Measure Memory

On Linux, the top command gives you several different numbers for memory. Here's what it says about the Hello World example:


PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2120 kgregory 20 0 4373m 15m 7152 S 0 0.2 0:00.10 java
  • VIRT is the virtual memory space: the sum of everything in the virtual memory map (see below). It is largely meaningless, except when it isn't (see below).
  • RES is the resident set size: the number of pages that are currently resident in RAM. In almost all cases, this is the only number that you should use when saying "too big." But it's still not a very good number, especially when talking about Java.
  • SHR is the amount of resident memory that is shared with other processes. For a Java process, this is typically limited to shared libraries and memory-mapped JAR files. In this example, I only had one Java process running, so I suspect that the ~7 MB is a result of libraries used by the OS.
  • The SWAP column isn't displayed by default, and isn't shown here. It indicates the amount of virtual memory that is currently resident on disk, whether or not it's actually in the swap space. The OS is very good about keeping active pages in RAM, and the only cures for swapping are (1) buy more memory, or (2) reduce the number of processes, so it's best to ignore this number.
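
If you want the same numbers non-interactively, ps can print them per process. A sketch (the pgrep pattern assumes the Hello World example above, running as a single process):

# VSZ and RSS are reported in KiB, matching top's VIRT and RES.
ps -o pid,vsz,rss,comm -p "$(pgrep -f com.example.Hello)"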

The situation for Windows Task Manager is a bit more complicated. Under Windows XP, there are "Memory Usage" and "Virtual Memory Size" columns, but the official documentation is silent on what they mean. Windows Vista and Windows 7 add more columns, and they're actually documented. Of these, the "Working Set" measurement is the most useful; it roughly corresponds to the sum of RES and SHR on Linux.

Understanding the Virtual Memory Map

The virtual memory consumed by a process is the total of everything that's in the process memory map. This includes data (eg, the Java heap), but also all of the shared libraries and memory-mapped files used by the program. On Linux, you can use the pmap command to see all of the things mapped into the process space (from here on out I'm only going to refer to Linux, because it's what I use; I'm sure there are equivalent tools for Windows). Here's an excerpt from the memory map of the "Hello World" program; the entire memory map is over 100 lines long, and it's not unusual to have a thousand-line list.
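
To generate a map like the excerpt below yourself, point pmap at the process ID. A sketch (the pgrep pattern is an assumption):

# Dump the full virtual memory map of the running JVM.
pmap "$(pgrep -f com.example.Hello)"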


0000000040000000 36K r-x-- /usr/local/java/jdk-1.6-x64/bin/java
0000000040108000 8K rwx-- /usr/local/java/jdk-1.6-x64/bin/java
0000000040eba000 676K rwx-- [ anon ]
00000006fae00000 21248K rwx-- [ anon ]
00000006fc2c0000 62720K rwx-- [ anon ]
0000000700000000 699072K rwx-- [ anon ]
000000072aab0000 2097152K rwx-- [ anon ]
00000007aaab0000 349504K rwx-- [ anon ]
00000007c0000000 1048576K rwx-- [ anon ]
...
00007fa1ed00d000 1652K r-xs- /usr/local/java/jdk-1.6-x64/jre/lib/rt.jar
...
00007fa1ed1d3000 1024K rwx-- [ anon ]
00007fa1ed2d3000 4K ----- [ anon ]
00007fa1ed2d4000 1024K rwx-- [ anon ]
00007fa1ed3d4000 4K ----- [ anon ]
...
00007fa1f20d3000 164K r-x-- /usr/local/java/jdk-1.6-x64/jre/lib/amd64/libjava.so
00007fa1f20fc000 1020K ----- /usr/local/java/jdk-1.6-x64/jre/lib/amd64/libjava.so
00007fa1f21fb000 28K rwx-- /usr/local/java/jdk-1.6-x64/jre/lib/amd64/libjava.so
...
00007fa1f34aa000 1576K r-x-- /lib/x86_64-linux-gnu/libc-2.13.so
00007fa1f3634000 2044K ----- /lib/x86_64-linux-gnu/libc-2.13.so
00007fa1f3833000 16K r-x-- /lib/x86_64-linux-gnu/libc-2.13.so
00007fa1f3837000 4K rwx-- /lib/x86_64-linux-gnu/libc-2.13.so
...

A quick explanation of the format: each row starts with the virtual memory address of the segment. This is followed by the segment size, permissions, and the source of the segment. This last item is either a file or "anon", which indicates a block of memory allocated via mmap.

Starting from the top, we have

  • The JVM loader (ie, the program that gets run when you type java). This is very small; all it does is load in the shared libraries where the real JVM code is stored.
  • A bunch of anon blocks holding the Java heap and internal data. This is a Sun JVM, so the heap is broken into multiple generations, each of which is its own memory block. Note that the JVM allocates virtual memory space based on the -Xmx value; this allows it to have a contiguous heap. The -Xms value is used internally to say how much of the heap is "in use" when the program starts, and to trigger garbage collection as that limit is approached.
  • A memory-mapped JARfile, in this case the file that holds the "JDK classes." When you memory-map a JAR, you can access the files within it very efficiently (versus reading it from the start each time). The Sun JVM will memory-map all JARs on the classpath; if your application code needs to access a JAR, you can also memory-map it.
  • Per-thread data for two threads. The 1M block is the thread stack. I didn't have a good explanation for the 4k block, but @ericsoe identified it as a "guard block": it has no read/write permissions, so any access causes a segmentation fault, which the JVM catches and translates into a StackOverflowError. For a real app, you will see dozens if not hundreds of these entries repeated through the memory map.
  • One of the shared libraries that holds the actual JVM code. There are several of these.
  • The shared library for the C standard library. This is just one of many things that the JVM loads that are not strictly part of Java.

The shared libraries are particularly interesting: each shared library has at least two segments: a read-only segment containing the library code, and a read-write segment that contains global per-process data for the library (I don't know what the segment with no permissions is; I've only seen it on x64 Linux). The read-only portion of the library can be shared between all processes that use the library; for example, libc has 1.5M of virtual memory space that can be shared.

When is Virtual Memory Size Important?

The virtual memory map contains a lot of stuff. Some of it is read-only, some of it is shared, and some of it is allocated but never touched (eg, almost all of the 4Gb of heap in this example). But the operating system is smart enough to only load what it needs, so the virtual memory size is largely irrelevant.

Where virtual memory size is important is if you're running on a 32-bit operating system, where you can only allocate 2Gb (or, in some cases, 3Gb) of process address space. In that case you're dealing with a scarce resource, and might have to make tradeoffs, such as reducing your heap size in order to memory-map a large file or create lots of threads.

But, given that 64-bit machines are ubiquitous, I don't think it will be long before Virtual Memory Size is a completely irrelevant statistic.

When is Resident Set Size Important?

Resident Set size is that portion of the virtual memory space that is actually in RAM. If your RSS grows to be a significant portion of your total physical memory, it might be time to start worrying. If your RSS grows to take up all your physical memory, and your system starts swapping, it's well past time to start worrying.

But RSS is also misleading, especially on a lightly loaded machine. The operating system doesn't expend a lot of effort reclaiming the pages used by a process; there's little benefit to doing so, and the potential for an expensive page fault if the process touches the page again later. As a result, the RSS statistic may include lots of pages that aren't in active use.

Bottom Line

Unless you're swapping, don't get overly concerned about what the various memory statistics are telling you, with the caveat that an ever-growing RSS may indicate some sort of memory leak.

With a Java program, it's far more important to pay attention to what's happening in the heap. The total amount of space consumed is important, and there are some steps that you can take to reduce that. More important is the amount of time that you spend in garbage collection, and which parts of the heap are getting collected.
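
A low-effort way to observe both is the JVM's built-in GC logging. A sketch (reusing the hypothetical class from the example above):

# Print one line per collection: which part of the heap was collected,
# how much it shrank, and how long the pause took.
java -verbose:gc -Xmx4096m com.example.Hello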

Accessing the disk (ie, a database) is expensive, and memory is cheap. If you can trade one for the other, do so.

Stop before RAM swaps to disk

On Linux (this reads /proc/meminfo, which macOS does not provide, so the approach is Linux-only):

# /proc/meminfo reports values in KiB. Note that MemFree ignores reclaimable
# page cache; on kernel 3.14+ the MemAvailable field is a better estimate.
installed.RAM <- as.numeric(system("awk '/MemTotal/ {print $2}' /proc/meminfo", intern = TRUE))
used.RAM <- installed.RAM - as.numeric(system("awk '/MemFree/ {print $2}' /proc/meminfo", intern = TRUE))

Can I prevent the gcc optimizer from delaying memory allocation?

You seem to be misinterpreting the situation. Virtual memory within a user-space process (heap space in this case) does get allocated “immediately” (possibly after a few system calls that negotiate a larger heap).

However, each page-aligned page-sized chunk of virtual memory that you haven’t touched yet will initially lack a physical page backing. Virtual pages are mapped to physical pages lazily, (only) when the need arises.

That said, the “allocation” you are observing (as part of the first access to the big heap space) is happening a few layers of abstraction below what GCC can directly influence and is handled by your operating system’s paging mechanism.

Side note: Another consequence would be, for example, that allocating a 1 TB chunk of virtual memory on a machine with, say, 128 GB of RAM will appear to work perfectly fine, as long as you never access most of that huge (lazily) allocated space. (There are configuration options that can limit such memory overcommitment if need be.)

When you touch your newly allocated virtual memory pages for the first time, each of them causes a page fault and your CPU ends up in a handler in the kernel because of that. The kernel evaluates the situation and establishes that the access was in fact legit. So it “materializes” the virtual memory page, i.e. picks a physical page to back the virtual page and updates both its bookkeeping data structures and (equally importantly) the hardware page mapping mechanism(s) (e.g. page tables or TLB, depending on architecture). Then the kernel switches back to your userspace process, which will have no clue that all of this just happened. Repeat for each page.

Presumably, the description above is hugely oversimplified. (For example, there can be multiple page sizes to strike a balance between mapping maintenance efficiency and granularity / fragmentation etc.)

A simple and ugly way to ensure that the memory buffer gets its hardware backing would be to find the smallest possible page size on your architecture (which would be 4 KiB on x86_64, for example, so 1024 of those integers, in most cases) and then touch each (possible) page of that memory beforehand, as in: for (size_t i = 0; i < 0x80000000; i += 1024) buffer[i] = 1;.

There are (of course) more reasonable solutions than that↑; this is just an example to illustrate what’s happening and why.
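
You can actually watch the kernel service these faults. A sketch using perf (assumes Linux perf is installed and ./a.out is your compiled test program):

# Each first touch of a page shows up as a (minor) page fault.
perf stat -e page-faults,minor-faults ./a.out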

Out of memory on R using linux but not on Windows

With the help of a member from another forum (https://community.rstudio.com/t/out-of-memory-on-r-using-linux-but-not-on-windows/106549), I found the solution. The crash was a result of memory limitation in the swap partition, as speculated earlier. I increased my swap from 2 GB to 16 GB and now R/RStudio is able to complete the whole script. It is quite a demanding task, since all of my physical memory is exhausted and nearly 15 GB of the swap is eaten.
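
For reference, growing swap with a swap file is straightforward on Linux. A sketch (the 16G size and the /swapfile path are assumptions; adjust to your setup, and note some filesystems require dd instead of fallocate):

# Allocate, protect, format, and enable a 16 GiB swap file.
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Add "/swapfile none swap sw 0 0" to /etc/fstab to keep it after reboot.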

Limit memory usage for a single Linux process

There are some problems with ulimit. Here's a useful read on the topic: Limiting time and memory consumption of a program in Linux, which led to the timeout tool, which lets you cage a process (and its forks) by time or memory consumption.

The timeout tool requires Perl 5+ and the /proc filesystem mounted. After that you copy the tool to e.g. /usr/local/bin like so:

curl https://raw.githubusercontent.com/pshved/timeout/master/timeout | \
sudo tee /usr/local/bin/timeout && sudo chmod 755 /usr/local/bin/timeout

After that, you can 'cage' your process by memory consumption as in your question like so:

timeout -m 500 pdftoppm Sample.pdf

Alternatively you could use -t <seconds> and -x <hertz> to respectively limit the process by time or CPU constraints.

The tool works by checking several times per second whether the spawned process has oversubscribed its set boundaries. This means there is a small window in which a process could oversubscribe before timeout notices and kills it.

A more correct approach would hence likely involve cgroups, but that is much more involved to set up, even if you'd use Docker or runC, which, among other things, offer a more user-friendly abstraction around cgroups.
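
That said, on a systemd machine you can get a taste of the cgroup approach without Docker. A sketch (assumes systemd with cgroup memory control available; the 500 MB cap mirrors the example above):

# Run the command in a transient cgroup capped at 500 MB;
# the kernel kills it if the cap is exceeded.
sudo systemd-run --scope -p MemoryMax=500M pdftoppm Sample.pdf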


