Why Does the free() Function Not Return Memory to the Operating System

Why does the free() function not return memory to the operating system?

Memory is allocated from a heap.

When you request memory in your program (with new, malloc(), etc.), your program requests it from its heap, which in turn requests it from the operating system{1}. Since going to the OS is an expensive operation, the memory manager gets a larger chunk of memory from the OS, not just what you asked for. It puts everything it gets into the heap and returns to you only the perhaps small amount you asked for. When you free() or delete this memory, it simply gets returned to the heap, not to the OS.
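The idea can be sketched with a toy pool allocator. This is only an illustration, not how glibc works internally: the names pool_alloc, pool_free and pool_os_requests and the sizes BLOCK_SIZE and CHUNK_BLOCKS are hypothetical, and malloc stands in for the real request to the OS (brk or mmap).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy fixed-size-block pool: NOT glibc's algorithm, just the idea. */
#define BLOCK_SIZE   64     /* bytes handed out per request (hypothetical) */
#define CHUNK_BLOCKS 1024   /* blocks fetched from the "OS" in one go */

typedef union block {
    union block *next;                  /* link, used while the block is free */
    unsigned char payload[BLOCK_SIZE];  /* user data, used while allocated */
} block;

static block *free_list = NULL;  /* blocks the pool owns but nobody uses */
static size_t os_requests = 0;   /* how many times we went to the "OS" */

/* Refill the pool with one large chunk; malloc stands in for brk/mmap. */
static int pool_refill(void)
{
    block *chunk = malloc(sizeof(block) * CHUNK_BLOCKS);
    if (chunk == NULL)
        return -1;
    os_requests++;
    for (size_t i = 0; i < CHUNK_BLOCKS; i++) {
        chunk[i].next = free_list;
        free_list = &chunk[i];
    }
    return 0;
}

/* Hand out one block, going to the "OS" only when the pool is empty. */
void *pool_alloc(void)
{
    if (free_list == NULL && pool_refill() != 0)
        return NULL;
    block *b = free_list;
    free_list = b->next;
    return b;
}

/* Give a block back to the pool; nothing is returned to the "OS". */
void pool_free(void *p)
{
    assert(p != NULL);
    block *b = p;
    b->next = free_list;
    free_list = b;
}

size_t pool_os_requests(void)
{
    return os_requests;
}
```

Calling pool_alloc(), then pool_free(), then pool_alloc() again hands back the same block, and pool_os_requests() stays at 1 the whole time: the freed block went back to the pool, not to the system.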

It's absolutely normal for that memory to not be returned to the operating system until your program exits, as you may request further memory later on.

If your program design relies on this memory being recycled, it may be achievable by using multiple copies of your program (created with fork()) which run and exit.

{1} The heap is probably non-empty at program start, but assuming it is empty illustrates my point.
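A minimal sketch of that approach, assuming a POSIX system: the memory-hungry work runs in a child process, so whatever it allocated goes back to the OS when the child exits. The names run_in_child and big_job, and the 16 MiB size, are made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical memory-hungry job: allocates, touches and frees 16 MiB. */
int big_job(void)
{
    size_t n = (size_t)16 * 1024 * 1024;
    char *buf = malloc(n);
    if (buf == NULL)
        return 1;
    memset(buf, 0xAB, n);   /* touch the pages so they become resident */
    free(buf);
    return 0;
}

/* Run job() in a child process.
 * Returns the job's exit status, or -1 if the child could not be run. */
int run_in_child(int (*job)(void))
{
    assert(job != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(job());       /* child: all of its memory dies with it */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent calls run_in_child(big_job) and only ever sees the exit status; its own heap is not grown by the job.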

Force free() to return malloc memory back to OS

With glibc malloc, try calling the malloc_trim function. It is not well documented, and its internals changed around 2007 (glibc 2.9); see https://stackoverflow.com/a/42281428.

Since 2007 this function will iterate over all malloc memory arenas (used in multithreaded applications), doing trim and fastbin consolidation, and release all aligned (4KB) pages that are fully free.

https://sourceware.org/git/?p=glibc.git;a=commit;f=malloc/malloc.c;h=68631c8eb92ff38d9da1ae34f6aa048539b199cc

Ulrich Drepper
Sun, 16 Dec 2007 22:53:08 +0000 (22:53 +0000)

malloc/malloc.c (public_mTRIm): Iterate over all arenas and call mTRIm for all of them.

(mTRIm): Additionally iterate over all free blocks and use madvise
to free memory for all those blocks which contain at least one
memory page.

https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=malloc/malloc.c;h=c54c203cbf1f024e72493546221305b4fd5729b7;hp=1e716089a2b976d120c304ad75dd95c63737ad75;hb=68631c8eb92ff38d9da1ae34f6aa048539b199cc;hpb=52386be756e113f20502f181d780aecc38cbb66a

+  malloc_consolidate (av);
...
+  for (int i = 1; i < NBINS; ++i)
...
+        for (mchunkptr p = last (bin); p != bin; p = p->bk)
+         {
...
+               /* See whether the chunk contains at least one unused page.  */
+               char *paligned_mem = (char *) (((uintptr_t) p
+                                               + sizeof (struct malloc_chunk)
+                                               + psm1) & ~psm1);
...
+               /* This is the size we could potentially free.  */
+               size -= paligned_mem - (char *) p;
+
+               if (size > psm1)
+                 {
...
+                   madvise (paligned_mem, size & ~psm1, MADV_DONTNEED);

So, calling malloc_trim will release almost all freed memory back to the OS. Only pages that still contain unfreed data are kept. The OS may or may not unmap the physical page when it is madvised with MADV_DONTNEED; Linux usually does. The madvised pages still count toward VSIZE (the total virtual memory size of the process), but releasing them usually helps reduce RSS (the amount of physical memory used by the process).

Alternatively, you can switch to an alternative malloc library: tcmalloc (gperftools / google-perftools) or jemalloc (facebook). Both have aggressive rules for returning freed memory to the OS (with madvise MADV_DONTNEED or even MADV_FREE).
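A minimal sketch of using it, assuming glibc on Linux (malloc_trim is a glibc extension declared in malloc.h). The helper name free_all_and_trim is hypothetical.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_trim: glibc extension */
#include <stdlib.h>

/* Free n blocks, then ask glibc to hand unused pages back to the kernel.
 * Returns malloc_trim's result: 1 if some memory was released, 0 if not. */
int free_all_and_trim(void **blocks, size_t n)
{
    assert(blocks != NULL);
    for (size_t i = 0; i < n; i++) {
        free(blocks[i]);
        blocks[i] = NULL;
    }
    return malloc_trim(0);  /* pad = 0: keep no extra slack at the heap top */
}
```

Watching RSS in htop or /proc/self/status before and after the call is a simple way to check whether it made a difference for a given workload.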

Will malloc implementations return free-ed memory back to the system?

The following analysis applies only to glibc (based on the ptmalloc2 algorithm).
There are certain options that seem helpful to return the freed memory back to the system:

mallopt() (declared in malloc.h) provides an option to set the trim threshold through the parameter M_TRIM_THRESHOLD. It specifies how much free memory (in bytes) may accumulate at the top of the data segment before the heap is trimmed. If the amount of free memory there exceeds this threshold, glibc invokes brk() to give memory back to the kernel.
The default value of M_TRIM_THRESHOLD on Linux is 128K; setting a smaller value might save space.
The same behavior can be achieved by setting the trim threshold in the environment variable MALLOC_TRIM_THRESHOLD_, with no source changes at all.
However, preliminary test programs run with M_TRIM_THRESHOLD have shown that, even though memory allocated by malloc does return to the system, the remaining portion of the actual chunk of memory (the arena) initially requested via brk() tends to be retained.
It is possible to trim the memory arena and give any unused memory back to the system by calling malloc_trim(pad) (declared in malloc.h). This function resizes the data segment, leaving at least pad bytes at the end of it, and fails if less than one page worth of bytes can be freed. The segment size is always a multiple of one page, which is 4,096 bytes on i386.
This modified behavior of free() could be implemented with malloc_trim through the malloc hook functionality, which would not require any source code changes to the core glibc library. (Note that the malloc hooks were removed from the glibc API in version 2.34.)
Another option is to use the madvise() system call inside glibc's free implementation.
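The first option can be sketched as follows, assuming glibc (mallopt and M_TRIM_THRESHOLD are glibc extensions). The wrapper name set_trim_threshold and the 64 KiB value mentioned below are only for illustration.

```c
#include <assert.h>
#include <malloc.h>   /* mallopt, M_TRIM_THRESHOLD: glibc extensions */

/* Set the trim threshold so that free() shrinks the heap once more than
 * `bytes` of free memory sit at its top.
 * Returns 1 on success and 0 on error, as mallopt() does. */
int set_trim_threshold(int bytes)
{
    assert(bytes >= 0);
    return mallopt(M_TRIM_THRESHOLD, bytes);
}
```

A program would call set_trim_threshold(64 * 1024) once, early in main(). Running an unmodified program with MALLOC_TRIM_THRESHOLD_=65536 in its environment should have the same effect.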

free() returns memory to the OS

I did some experiments, read a chapter of The Linux Programming Interface, and got a satisfying answer for myself.

First, the conclusion I reached is:

The library call malloc uses the system calls brk and mmap under the hood when allocating memory.
As @John Zwinck describes, a Linux process uses either brk or mmap to allocate memory, depending on how much you request.
If allocating by brk, the process is probably not returning the memory to the OS before it terminates (sometimes it does). If allocating by mmap, in my simple test the process returned the memory to the OS before it terminated.

Experiment code (examine memory stats in htop at the same time):

code sample 1

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>

#define BUFSIZE 1073741824 //1GiB

// run `ulimit -s unlimited` first

int main(){
    printf("start\n");
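    printf("%zu \n", sizeof(uint32_t)); // %zu is the correct format for size_t
    // 2^28 pointers (2 GiB on 64-bit) live on the stack, hence the ulimit above
    uint32_t* p_arr[BUFSIZE / 4]; 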
    sleep(10); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
        if (p == NULL){
            printf("alloc failed\n");
            exit(1);
        }
        p_arr[i] = p;
    } 
    printf("alloc done\n"); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        free(p_arr[i]);
    }
    
    printf("free done\n");
    sleep(20);
    printf("exit\n");
}

When it reaches "free done\n" and the following sleep(), you can see that the program still holds the memory and has not returned it to the OS. strace ./a.out shows that brk gets called many times.

Note:

I am calling malloc in a loop to allocate memory. I expected it to take up only 1GiB of RAM, but in fact it takes up 8GiB in total, because malloc adds some extra bytes to every allocation for bookkeeping or whatever else. One should never allocate 1GiB this way, in a loop like this.
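The per-allocation overhead can be inspected with glibc's malloc_usable_size. This is a sketch assuming glibc; the helper name usable_bytes is hypothetical.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size: glibc extension */
#include <stdlib.h>

/* Number of bytes actually usable in a block obtained for `request` bytes,
 * or 0 if the allocation failed. */
size_t usable_bytes(size_t request)
{
    assert(request > 0);
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

On 64-bit glibc a 4-byte request typically occupies a 32-byte chunk (a header plus 24 usable bytes), which would account for the figure above: 2^28 allocations times 32 bytes is 8GiB.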

code sample 2:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>

#define BUFSIZE 1073741824 //1GiB

int main(){
    printf("start\n");
    printf("%zu \n", sizeof(uint32_t)); // %zu is the correct format for size_t
    uint32_t* p_arr[BUFSIZE / 4]; 
    sleep(3); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
        if (p == NULL){
            printf("alloc failed\n");
            exit(1);
        }
        p_arr[i] = p;
    } 
    printf("%p\n", (void*)p_arr[0]); // %p expects a void*
    printf("alloc done\n"); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        free(p_arr[i]);
    }
    printf("free done\n");
    printf("allocate again\n");
    sleep(10);
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        uint32_t* p = malloc(sizeof(uint32_t));
        if (p == NULL){
            printf("alloc failed\n"); // PFATAL was never defined in this sample
            exit(1);
        }
        p_arr[i] = p;
    } 
    printf("allocate again done\n");
    sleep(10);
    // print the address before freeing: a freed pointer must not be used
    printf("%p\n", (void*)p_arr[0]);
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        free(p_arr[i]);
    }
    sleep(3);
    printf("exit\n");
}

This one is similar to sample 1, but it allocates again after the free. The second allocation doesn't increase memory usage; it reuses the freed but not yet returned memory.

code sample 3:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

#define MAX_ALLOCS 1000000

int main(int argc, char* argv[]){
    int freeStep, freeMin, freeMax, blockSize, numAllocs, j;
    char* ptr[MAX_ALLOCS];
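    printf("\n");
    if(argc < 3){ // argv[1] and argv[2] are mandatory
        fprintf(stderr, "Usage: %s num-allocs block-size [step [min [max]]]\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    numAllocs = atoi(argv[1]);
    assert(numAllocs > 0 && numAllocs <= MAX_ALLOCS); // ptr[] holds at most MAX_ALLOCS entries
    blockSize = atoi(argv[2]);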
    freeStep = (argc > 3) ? atoi(argv[3]) : 1;
    freeMin = (argc > 4) ? atoi(argv[4]) : 1;
    freeMax = (argc > 5) ? atoi(argv[5]) : numAllocs;
    assert(freeMax <= numAllocs);

    printf("Initial program break:   %10p\n", sbrk(0));
    printf("Allocating %d*%d bytes\n", numAllocs, blockSize);
    for(j = 0; j < numAllocs; j++){
        ptr[j] = malloc(blockSize);
        if(ptr[j] == NULL){
            perror("malloc return NULL");
            exit(EXIT_FAILURE);
        }
    }

    printf("Program break is now:    %10p\n", sbrk(0));
    printf("Freeing blocks from %d to %d in steps of %d\n", freeMin, freeMax, freeStep);
    for(j = freeMin - 1; j < freeMax; j += freeStep){
        free(ptr[j]);
    }
    printf("After free(), program break is : %10p\n", sbrk(0));
    printf("\n");
    exit(EXIT_SUCCESS);
}

This one is taken from The Linux Programming Interface, simplified a bit.

Chapter 7:

The first two command-line arguments specify the number and size of
blocks to allocate. The third command-line argument specifies the loop
step unit to be used when freeing memory blocks. If we specify 1 here
(which is also the default if this argument is omitted), then the
program frees every memory block; if 2, then every second allocated
block; and so on. The fourth and fifth command-line arguments specify
the range of blocks that we wish to free. If these arguments are
omitted, then all allocated blocks (in steps given by the third
command-line argument) are freed.

Try running it with:

./free_and_sbrk 1000 10240 2
./free_and_sbrk 1000 10240 1 1 999
./free_and_sbrk 1000 10240 1 500 1000

You will see that only for the last example does the program break decrease, i.e., the process returns some blocks of memory to the OS (if I understand correctly).

This sample code is evidence of

"If allocating by brk, the process is probably not returning the memory to the OS before it terminates (sometimes it does)."

Finally, here are some useful paragraphs quoted from the book. I suggest reading Chapter 7 (Section 7.1) of TLPI; it is very helpful.

In general, free() doesn’t lower the program break, but instead adds
the block of memory to a list of free blocks that are recycled by
future calls to malloc(). This is done for several reasons:

The block of memory being freed is typically somewhere in the middle of
the heap, rather than at the end, so that lowering the program break
is not possible.

It minimizes the number of sbrk() calls that the program must perform.
(As noted in Section 3.1, system calls have a small but significant
overhead.)

In many cases, lowering the break would not help programs that allocate
large amounts of memory, since they typically tend to hold on to
allocated memory or repeatedly release and reallocate memory, rather
than release it all and then continue to run for an extended period of
time.

What is the program break (also from the book):

[Figure from the book illustrating the program break]

Also: https://www.wikiwand.com/en/Data_segment

What is Rust strategy to uncommit and return memory to the operating system?

By default Rust uses the system allocator.

This is based on malloc on Unix platforms and HeapAlloc on Windows, plus related functions.

Whether calling free() actually makes the memory available to other processes depends on the libc implementation and your operating system, and that question is mostly unrelated to Rust. In any case, memory that was freed should be available for future allocations, so long-running processes don't leak memory.

My general experience is that resource consumption of Rust servers is very low.

What operating systems won't free memory on program exit?

The short answer is "none". Even a program on DOS years ago would release memory on program termination (simply by virtue of the fact that nothing was managing the memory once the program stopped). I'm sure someone might point out that kernel mode code doesn't necessarily free its memory on app exit, or cite some obscure embedded OS... but you can assume that app exit returns all the memory your user mode code acquired. (Windows 3.x might have had this problem depending on which allocator was used...)

The reason behind the maxim that you "should free your memory" is that in large-scale software engineering you should strive to develop components that are flexible in their use, because you never know how someone else is going to change the use of your code long after you've left the team.

Think of it like this. Let's say you design some class that is designed to be a singleton (only instantiated once during the app lifetime). As such, you decide not to bother with memory cleanup when your component destructs or gets finalized. That's a perfectly fine decision for that moment. Years later, after you've left for greener pastures, someone else may come along and decide that they need to use your class in multiple places such that many instances will come and go during the app lifetime. Your memory leak will become their problem.

On my team, we've often talked about making the user-initiated "close" of the application just be exit() without doing any cleanup. If we ever do this, I would still insist that the team develop classes and components that properly clean up after themselves.

Does terminating a program reclaim memory in the same way as free()?

No, terminating a program, as with exit or abort, does not reclaim memory in the same way as free. Using free causes some activity that ultimately has no effect when the operating system discards the data maintained by malloc and free.

exit has some complications, as it does not immediately terminate the program. For now, let’s just consider the effect of immediately terminating the program and consider the complications later.

In a general-purpose multi-user operating system, when a process is terminated, the operating system releases the memory the process was using so that it can be used for other purposes.¹ In large part, this simply means the operating system does some accounting operations.

In contrast, when you call free, software inside the program runs, and it has to look up the size of the memory you are freeing and then insert information about that memory into the pool of memory it is maintaining. There could be thousands or tens of thousands (or more) of such allocations. A program that frees all its data may have to execute many thousands of calls to free. Yet, in the end, when the program exits, all of the changes produced by free will vanish, as the operating system will discard all the data about that pool of memory—all of the data is in memory pages the operating system does not preserve.

So, in this regard, the answer you link to is correct: calling free is a waste. And, as it points out, the necessity of going through all the data structures in the program to fetch the pointers in them so the memory they point to can be freed causes all those data structures to be read into memory if they had been swapped out to disk. For large programs, it can take a considerable amount of time and other resources.

On the other hand, it is not clear that it is easy to avoid many calls to free. This is because releasing memory is not the only thing a terminating program has to clean up. A program may want to write final data to files or send final messages to network connections. Furthermore, a program may not have established all of this context directly. Most large programs rely on layers of software, and each software package may have set up its own context, and often no way is provided to tell other software “I want to exit now. Finish the valuable context, but skip all the freeing of memory.” So all the desired clean-up tasks may be intertwined with the free-memory tasks, and there may be no good way to untangle them.

Software should generally be written so that nothing terrible happens if a program is suddenly aborted (since this can happen from a loss of power, not just deliberate user action). But even though a program might be able to tolerate an abort, there can still be value in a graceful exit.

Getting back to exit, calling the C exit routine does not exit the program immediately. Exit handlers (registered with atexit) are called, stream buffers are flushed, and streams are closed. Any software libraries you called may have set up their own exit handlers so that they can finish up when the program is exiting. So, if you want to be sure libraries you have used in your program are not calling free when you end the program, you have to call abort, not exit. But it is generally preferred to end a program gracefully, not by aborting. Calling abort will not call exit handlers, flush streams, close streams, or perform other wind-down code that exit does—data can be lost when a program calls abort.
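The difference between the two can be observed with a small experiment, sketched here for a POSIX system. The names exit_handler_ran and on_exit_handler are hypothetical; a pipe is used only so the parent can tell whether the child's exit handler ran.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by the handler */

/* Exit handler: leaves one byte in the pipe to prove that it ran. */
static void on_exit_handler(void)
{
    ssize_t n = write(report_fd, "x", 1);
    (void)n;
}

/* Fork a child that registers the handler and then ends via exit()
 * (graceful == 1) or abort() (graceful == 0).
 * Returns 1 if the handler ran, 0 if it did not, -1 on error. */
int exit_handler_ran(int graceful)
{
    assert(graceful == 0 || graceful == 1);
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    fflush(NULL);            /* avoid duplicated stdio output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(on_exit_handler) != 0)
            _exit(2);
        if (graceful)
            exit(0);         /* runs exit handlers, flushes streams */
        abort();             /* terminates at once, no exit handlers */
    }
    close(fds[1]);
    char byte;
    ssize_t got = read(fds[0], &byte, 1);  /* 0 bytes read means EOF */
    close(fds[0]);
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return got == 1 ? 1 : 0;
}
```

With exit() the handler runs in the child and the parent reads one byte; with abort() the child dies from SIGABRT and the pipe reaches end-of-file with nothing in it.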

Footnote

¹ Releasing memory does not mean it is immediately available for other purposes. The specific result of this depends on each page of memory. For example:

If the memory is shared with other processes, it is still needed for them, so releasing it from use by this process only decrements the number of processes using the memory. It is not immediately available for any other use.
If the memory is not in use by any other processes but contains data mapped from a file on disk, the operating system might mark it as available when needed but leave it alone for the moment. This is because you might run the same program again, and it would be nice if the data were still in memory, so why not just leave it in place just in case? The data might even be used by a different program that uses the same file. (For example, many programs might use the same shared library.)
If the memory is not in use by any other processes and was just used by the program as a work area, not mapped from a file, then the system may mark it as immediately available and not containing anything useful.
