What Is the Purpose of the MAP_ANONYMOUS Flag in the mmap System Call

What is the purpose of MAP_ANONYMOUS flag in mmap system call?

Anonymous mappings can be pictured as a zeroized virtual file.
Anonymous mappings are simply large, zero-filled blocks of memory ready for use.
These mappings reside outside of the heap, thus do not contribute to data segment fragmentation.

MAP_ANONYMOUS + MAP_PRIVATE:

  • every call creates a distinct mapping
  • children inherit parent's mappings
  • a child's writes to an inherited mapping are handled copy-on-write, so the parent does not see them
  • the main purpose of this kind of mapping is to allocate new zero-filled memory
  • malloc employs anonymous private mappings to serve memory allocation requests larger than MMAP_THRESHOLD bytes.

    typically, MMAP_THRESHOLD is 128kB.
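The copy-on-write behaviour of a private anonymous mapping can be checked with a small experiment. The following is only a sketch, assuming Linux with glibc; the helper name `parent_value_after_child_write` is made up for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the value the parent reads after a child
 * has written 42 into an inherited MAP_PRIVATE | MAP_ANONYMOUS page,
 * or -1 on error.
 */
int parent_value_after_child_write(void)
{
    int *p = mmap(NULL, sizeof *p, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    assert(*p == 0);    /* a fresh anonymous mapping is zero-filled */

    pid_t pid = fork();
    if (pid == -1) {
        munmap(p, sizeof *p);
        return -1;
    }
    if (pid == 0) {
        *p = 42;        /* copy-on-write: only the child's copy changes */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) == -1) {
        munmap(p, sizeof *p);
        return -1;
    }

    int seen = *p;      /* the parent's page was never modified */
    munmap(p, sizeof *p);
    return seen;
}
```

Calling `parent_value_after_child_write()` should return 0: the child's write lands in its own copy of the page.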

MAP_ANONYMOUS + MAP_SHARED:

  • each call creates a distinct mapping that doesn't share pages with any other mapping
  • children inherit parent's mappings
  • writes are not copy-on-write: every process sharing the mapping sees a write made by any of them
  • shared anonymous mappings allow IPC in a manner similar to System V memory segments, but only between related processes

On Linux, there are two ways to create anonymous mappings:

  • specify MAP_ANONYMOUS flag and pass -1 for fd

        addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (addr == MAP_FAILED)
            exit(EXIT_FAILURE);
  • open /dev/zero and pass this opened fd

        fd = open("/dev/zero", O_RDWR);
        addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    (this method is typically used on systems that do not have the MAP_ANONYMOUS flag; the /dev/zero technique derives from System V, while MAP_ANONYMOUS derives from BSD)

Advantages of anonymous mappings:

- no virtual address space fragmentation; after unmapping, the memory is immediately returned to the system

- their size and permissions can be changed, and they can receive advice (madvise), just like normal mappings

- each allocation is a distinct mapping, separate from global heap
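As an illustration of the second advantage, the sketch below shrinks an anonymous mapping, changes its permissions and gives the kernel advice about it. It assumes Linux with glibc; the helper name `adjust_anonymous_mapping` is hypothetical, and shrinking is done with a partial munmap() (on Linux, mremap() could be used to grow a mapping instead).

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and madvise() on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if every step succeeds, -1 otherwise. */
int adjust_anonymous_mapping(void)
{
    long pg = sysconf(_SC_PAGESIZE);
    assert(pg > 0);
    size_t page = (size_t)pg;
    size_t len = 2 * page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    strcpy(p, "hello");

    int ok = 1;

    /* Size: hand the second page back to the system. */
    if (munmap(p + page, page) == 0)
        len = page;
    else
        ok = 0;

    /* Permissions: make the remaining page read-only. */
    if (ok && mprotect(p, len, PROT_READ) == -1)
        ok = 0;

    /* Advice: hint that the page will be read sequentially. */
    if (ok && madvise(p, len, MADV_SEQUENTIAL) == -1)
        ok = 0;

    /* The data written before the changes is still readable. */
    if (ok && strcmp(p, "hello") != 0)
        ok = 0;

    munmap(p, len);
    return ok ? 0 : -1;
}
```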

Disadvantages of anonymous mappings:

- the size of each mapping is an integer multiple of the system's page size, so small allocations can waste address space

- creating and releasing a mapping incurs more overhead than allocating from the pre-allocated heap
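The rounding behind the first disadvantage is simple arithmetic. A minimal sketch, with hypothetical helper names `mapped_size` and `wasted_bytes`:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Rounds len up to the next multiple of the page size. */
size_t mapped_size(size_t len, size_t page)
{
    assert(page != 0);
    return (len + page - 1) / page * page;
}

/* Bytes of the mapping that a request for len bytes does not use. */
size_t wasted_bytes(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return mapped_size(len, page) - len;
}
```

With 4096-byte pages, `mapped_size(1, 4096)` is 4096 and `mapped_size(4097, 4096)` is 8192, so a 1-byte request leaves 4095 bytes of the mapping unused.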

If a program containing such a mapping forks a process, the child inherits the mapping.
The following program demonstrates this kind of inheritance:

#ifdef USE_MAP_ANON
#define _DEFAULT_SOURCE /*Exposes MAP_ANONYMOUS on current glibc*/
#define _BSD_SOURCE     /*Same, for glibc versions before 2.20*/
#endif
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /*Pointer to shared memory region*/
    int *addr;

#ifdef USE_MAP_ANON /*Use MAP_ANONYMOUS*/
    addr = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        fprintf(stderr, "mmap() failed\n");
        exit(EXIT_FAILURE);
    }

#else /*Map /dev/zero*/
    int fd;
    fd = open("/dev/zero", O_RDWR);
    if (fd == -1) {
        fprintf(stderr, "open() failed\n");
        exit(EXIT_FAILURE);
    }

    addr = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        fprintf(stderr, "mmap() failed\n");
        exit(EXIT_FAILURE);
    }

    if (close(fd) == -1) { /*No longer needed*/
        fprintf(stderr, "close() failed\n");
        exit(EXIT_FAILURE);
    }
#endif
    *addr = 1; /*Initialize integer in mapped region*/

    switch(fork()) { /*Parent and child share mapping*/
    case -1:
        fprintf(stderr, "fork() failed\n");
        exit(EXIT_FAILURE);

    case 0: /*Child: increment shared integer and exit*/
        printf("Child started, value = %d\n", *addr);
        (*addr)++;

        if (munmap(addr, sizeof(int)) == -1) {
            fprintf(stderr, "munmap() failed\n");
            exit(EXIT_FAILURE);
        }
        exit(EXIT_SUCCESS);

    default: /*Parent: wait for child to terminate*/
        if (wait(NULL) == -1) {
            fprintf(stderr, "wait() failed\n");
            exit(EXIT_FAILURE);
        }

        printf("In parent, value = %d\n", *addr);
        if (munmap(addr, sizeof(int)) == -1) {
            fprintf(stderr, "munmap() failed\n");
            exit(EXIT_FAILURE);
        }
        exit(EXIT_SUCCESS);
    }
}

Sources:

The Linux Programming Interface, Chapter 49: Memory Mappings, by Michael Kerrisk

Linux System Programming (3rd edition), Chapter 8: Memory Management, by Robert Love

mmap File-backed mapping vs Anonymous mapping in Linux

The mmap() system call lets you create either a file-backed mapping or an anonymous mapping.

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

File-backed mapping: Linux provides /dev/zero, a special file that is an infinite source of zero bytes. Open this file and pass its descriptor to mmap() with the appropriate flag: MAP_SHARED if the memory should be shared with other processes, or MAP_PRIVATE if it should not.

Example:

     .
.
if ((fd = open("/dev/zero", O_RDWR)) < 0) {
    printf("open error");
    exit(1);
}
if ((area = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)) == MAP_FAILED) {
    printf("Error in memory mapping");
    exit(1);
}
close(fd); /* the descriptor is no longer needed once the memory is mapped */
/* create child process */
.
.

Quoting the man page of mmap():

The contents of a file mapping (as opposed to an anonymous mapping;
see MAP_ANONYMOUS below), are initialized using length bytes starting
at offset offset in the file (or other object) referred to by the file
descriptor fd. offset must be a multiple of the page size as returned
by sysconf(_SC_PAGE_SIZE).

In our case, the mapping is initialized with zeros.

Quoting Advanced Programming in the UNIX Environment, Second Edition, by W. Richard Stevens and Stephen A. Rago:

The advantage of using /dev/zero in the manner that we've shown is
that an actual file need not exist before we call mmap to create the
mapped region. Mapping /dev/zero automatically creates a mapped region
of the specified size. The disadvantage of this technique is that it
works only between related processes. With related processes, however,
it is probably simpler and more efficient to use threads (Chapters 11
and 12). Note that regardless of which technique is used, we still
need to synchronize access to the shared data.

After the call to mmap() succeeds, we create a child process, which can see writes to the mapped region (because we specified the MAP_SHARED flag).

Anonymous mapping: The same thing can be done with an anonymous mapping. Specify the MAP_ANON flag to mmap() and pass -1 as the file descriptor.
The resulting region is anonymous (it is not associated with a pathname through a file descriptor) and can be shared with descendant processes.
The advantage is that no file is needed for the mapping, and the overhead of opening and closing a file is avoided.

if ((area = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0)) == MAP_FAILED) {
    printf("Error in anonymous memory mapping");
    exit(1);
}

So both the /dev/zero mapping and the anonymous mapping work only between related processes.

If you need this between unrelated processes, then you probably need to create named shared memory by using shm_open() and then you can pass the returned file descriptor to mmap().

mmap: map_anonymous why does it give SIGSEGV?

Check the return values of your system calls!

The flags argument to mmap must have exactly one of these two options:

MAP_SHARED
Share this mapping. Updates to the mapping are visible to other processes
that map this file, and are carried through to the underlying file. The file
may not actually be updated until msync(2) or munmap() is called.

MAP_PRIVATE
Create a private copy-on-write mapping. Updates to the mapping are not
visible to other processes mapping the same file, and are not carried through
to the underlying file. It is unspecified whether changes made to the file
after the mmap() call are visible in the mapped region.

You're not providing that, so mmap is most likely failing (returning (void*)-1) with errno set to EINVAL.
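The failure is easy to reproduce. The sketch below assumes Linux; `mmap_errno_for_flags` is a hypothetical helper that reports the errno produced by a one-page anonymous mmap() call with the given flags.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: returns the errno set by mmap() for the given
 * flags, or 0 if the call succeeded (the mapping is released again).
 */
int mmap_errno_for_flags(int flags)
{
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    int rc = munmap(p, 4096);
    assert(rc == 0);
    (void)rc;
    return 0;
}
```

On Linux, `mmap_errno_for_flags(MAP_ANONYMOUS)` returns EINVAL, while `mmap_errno_for_flags(MAP_ANONYMOUS | MAP_PRIVATE)` returns 0. Dereferencing the MAP_FAILED pointer from the first call is what produces the SIGSEGV.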

Memory does not get allocated with the MAP_ANONYMOUS and MAP_SHARED_VALIDATE flag in mmap()

Looking at do_mmap in linux/mm/mmap.c (kernel version 5.9), MAP_SHARED_VALIDATE only seems to be supported for file-backed mappings (see the if (file) and else sections). I do not know if that is a bug or if it is intentional.

EDIT: I have submitted a bug report.

Why do we need MAP_PRIVATE flag when mapping memory?

You don't need MAP_PRIVATE, you need one of MAP_PRIVATE or MAP_SHARED.

The flags argument determines whether updates to the mapping are
visible to other processes mapping the same region, and whether
updates are carried through to the underlying file. This behavior is
determined by including exactly one of the following values in flags:

     MAP_SHARED

                Share this mapping. [...]

     MAP_PRIVATE

                Create a private copy-on-write mapping. [...]


mmap lets you choose how to propagate any change made to the mapped region:

  • MAP_PRIVATE backed up by a file

    No updates are visible to other processes mapping the same file.

    No updates are written to the backing file.

    Updates are made to a COW page.

    Useful to process the content of a file in-place.

  • MAP_PRIVATE | MAP_ANONYMOUS (i.e. not backed up by a file)

    There is no file to update.

    Updates are made to a COW page.

    Useful to allocate memory, not shared with forked processes.

  • MAP_SHARED backed up by file

    Updates are visible to other processes.

    Updates are propagated to the backing file.

    Useful to transform a file.

    Useful to share a memory area with other processes using a name (see shm_open).

  • MAP_SHARED | MAP_ANONYMOUS (i.e. not backed up by a file)

    Updates are visible to all the processes with the same region mapped.

    There is no file to update.

    Useful to share an internal memory area with forked processes.
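The difference between the two file-backed cases can be observed directly. The following sketch assumes Linux and a writable /tmp; the helper name `file_byte_after_mapped_write` and the temporary file template are made up for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes mkstemp() and pread() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: creates a one-byte temporary file holding 'a',
 * writes 'X' through a mapping created with sharing_flag, and returns
 * the byte the file itself holds afterwards, or -1 on error.
 */
int file_byte_after_mapped_write(int sharing_flag)
{
    assert(sharing_flag == MAP_PRIVATE || sharing_flag == MAP_SHARED);

    char path[] = "/tmp/mmap-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);               /* the file disappears when fd is closed */

    int result = -1;
    if (write(fd, "a", 1) == 1) {
        char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, sharing_flag, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'X';             /* update made through the mapping */
            msync(p, 1, MS_SYNC);   /* flushes a shared mapping to the file */

            char c;
            if (pread(fd, &c, 1, 0) == 1)
                result = c;         /* what the file contains now */
            munmap(p, 1);
        }
    }
    close(fd);
    return result;
}
```

With MAP_PRIVATE the function returns 'a' because the write went to a copy-on-write page; with MAP_SHARED it returns 'X' because the update was propagated to the backing file.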

What are memory mapped page and anonymous page?

The correct terms are memory mapped files and anonymous mappings. When referring to memory mapping, one is usually referring to mmap(2). There are 2 categories for using mmap. One category is SHARED vs PRIVATE mappings. The other category is FILE vs ANONYMOUS mappings. Mixed together you get the 4 following combinations:

  1. PRIVATE FILE MAPPING
  2. SHARED FILE MAPPING
  3. PRIVATE ANONYMOUS MAPPING
  4. SHARED ANONYMOUS MAPPING

A file mapping specifies a file on disk, N bytes of which are mapped into memory. mmap(2) takes the file descriptor of the file to be mapped as its 5th argument. The 6th argument is the offset in the file at which the mapping starts; the number of bytes to map is the 2nd argument (length). The typical process of using mmap to create a memory mapped file goes:

  1. open(2) file to get a file descriptor.
  2. fstat(2) the file to get the size from the file descriptor data structure.
  3. mmap(2) the file using the file descriptor returned from open(2).
  4. close(2) the file descriptor.
  5. do whatever to the memory mapped file.
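The five steps above can be sketched as follows. This is an illustrative example, not taken from the original answer; the function name `count_byte_in_file` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Counts occurrences of byte in the file at path; returns -1 on error. */
long count_byte_in_file(const char *path, char byte)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);              /* 1. open(2) */
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {                 /* 2. fstat(2) for the size */
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {                      /* a zero-length mmap fails */
        close(fd);
        return 0;
    }
    size_t len = (size_t)st.st_size;

    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0); /* 3. mmap(2) */
    close(fd);                                  /* 4. close(2); the mapping stays valid */
    if (data == MAP_FAILED)
        return -1;

    long count = 0;                             /* 5. use the mapped file */
    for (size_t i = 0; i < len; i++)
        if (data[i] == byte)
            count++;

    munmap((void *)data, len);
    return count;
}
```

For example, `count_byte_in_file(path, '\n')` returns the number of lines in a text file without a single read(2) call.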

When a file is mapped in as PRIVATE, changes made are not committed to the underlying file. It is a PRIVATE, in-memory copy of the file. When a file is mapped SHARED, changes made are committed to the underlying file by the kernel automatically. Files mapped in as shared can be used for what is called Memory Mapped I/O, and for IPC. You would use a memory mapped file for IPC instead of a shared memory segment if you need the persistence of the file.

If you use strace(1) to watch a process initialize, you will notice that the different sections of the file are mapped in using mmap(2) as private file mappings. The same is true for system libs.

Examples of output from strace(1) where mmap(2) is being used to map in libraries to the process.

open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=42238, ...}) = 0
mmap(NULL, 42238, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff7ca71e000
close(3) = 0
open("/lib64/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\341n8\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1926760, ...}) = 0
mmap(0x386ee00000, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x386ee00000

Anonymous mappings are not backed by a file. To be specific, the 5th (file descriptor) and 6th (offset) arguments of mmap(2) are ignored on Linux when MAP_ANONYMOUS is set in the 4th (flags) argument; portable code should still pass -1 and 0, because some implementations require them. An alternative to using the MAP_ANONYMOUS flag is to use /dev/zero as the file.

The word 'Anonymous' is, to me, a poor choice in that it sounds as if the file is mapped anonymously. Instead, it is the file that is anonymous, i.e. there isn't a file specified.

Uses for private anonymous mappings are few in user land programming. You could use a shared anonymous mapping so that applications could share a region of memory, but I do not know the reason why you wouldn't use SYSV or POSIX shared memory instead.

Since memory mapped in using anonymous mappings is guaranteed to be zero filled, it could be useful for some applications that expect/require zero filled regions of memory to use mmap(2) in this way instead of the malloc(3) + memset(3) combo.
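The zero-fill guarantee can be verified with a short check. A minimal sketch, assuming Linux with glibc; the helper name `anonymous_memory_is_zeroed` is hypothetical.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: maps len bytes anonymously and returns 1 if every
 * byte reads as zero, 0 if not, and -1 if the mapping could not be made.
 */
int anonymous_memory_is_zeroed(size_t len)
{
    assert(len > 0);

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            zeroed = 0;
            break;
        }
    }

    munmap(p, len);
    return zeroed;
}
```

No memset() call is needed: `anonymous_memory_is_zeroed(65536)` returns 1 because the kernel hands out zero-filled pages.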

Transparent Huge Page support in Linux

Anonymous memory mapping is a memory mapping that isn't associated with a file. See What is the purpose of MAP_ANONYMOUS flag in mmap system call? for more details about it.

Anonymous mappings are often used to implement the heap and stack used by application languages. So by enabling THP for anonymous mappings, it allows for very large heaps, which allows applications to process huge amounts of data.

Most applications don't use memory mapping to access files, they use system calls like open, read, and write. So there's less need to use huge pages with mapped files, and they haven't implemented this.

Why are mmap syscall flags not set

The kernel syscall interface on AMD64 passes the fourth argument in the r10 register, not rcx.

mov $0x32, %r10

See linked question for more details.
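To show the register assignment in context, here is a sketch of an mmap call issued with a raw `syscall` instruction from C. It assumes x86-64 Linux with GCC or Clang inline assembly; the wrapper name `raw_mmap` is hypothetical, and on other platforms the sketch simply falls back to the libc mmap().

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>

/*
 * Hypothetical wrapper: returns the mapped address or MAP_FAILED.
 * Unlike libc's mmap(), the raw path does not set errno.
 */
void *raw_mmap(void *addr, size_t len, int prot, int flags, int fd, long off)
{
#if defined(__linux__) && defined(__x86_64__) && !defined(__ILP32__)
    long ret;
    register long r10 __asm__("r10") = flags;  /* 4th argument: r10, not rcx */
    register long r8  __asm__("r8")  = fd;     /* 5th argument */
    register long r9  __asm__("r9")  = off;    /* 6th argument */

    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"((long)SYS_mmap),   /* syscall number in rax */
                        "D"(addr), "S"(len), "d"((long)prot),
                        "r"(r10), "r"(r8), "r"(r9)
                      : "rcx", "r11", "memory"); /* syscall clobbers rcx and r11 */

    /* The kernel reports failure as -errno in the range [-4095, -1]. */
    if ((unsigned long)ret >= (unsigned long)-4095L)
        return MAP_FAILED;
    return (void *)ret;
#else
    return mmap(addr, len, prot, flags, fd, off);
#endif
}
```

The `syscall` instruction itself overwrites rcx with the return address, which is why the kernel ABI moved the fourth argument to r10.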

Understanding memory allocations

Using mmap (MAP_ANONYMOUS) or malloc changes nothing in your case: if you don't have enough free memory, mmap returns MAP_FAILED and malloc returns NULL.

If I use this program:

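#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char ** argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <size in MiB> [use-malloc]\n", argv[0]);
        return 1;
    }

    /* size_t avoids int overflow for sizes of 2048 MiB and more */
    size_t len = (size_t) atoi(argv[1]) * 1024 * 1024;
    void * m;

    if (argc == 2) {
        /* only the size was given: use an anonymous mapping */
        m = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        if (m == MAP_FAILED) {
            puts("ko");
            return 0;
        }
    }
    else {
        /* a second argument was given: use malloc */
        m = malloc(len);
        if (m == 0) {
            puts("ko");
            return 0;
        }
    }

    puts("ok");
    getchar();

    /* touch the memory every 512 bytes so that every page is really used */
    char * p = (char *) m;
    char * sup = p + len;

    while (p < sup) {
        *p = 0;
        p += 512;
    }

    puts("done");
    getchar();

    return 0;
}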

I am on a Raspberry Pi with 1 GB of memory and 100 MB of swap; part of the memory is already used by Chromium because I am on SO.

/proc/meminfo gives:

MemTotal:         949448 kB
MemFree: 295008 kB
MemAvailable: 633560 kB
Buffers: 39296 kB
Cached: 360372 kB
SwapCached: 0 kB
Active: 350416 kB
Inactive: 260960 kB
Active(anon): 191976 kB
Inactive(anon): 41908 kB
Active(file): 158440 kB
Inactive(file): 219052 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 102396 kB
SwapFree: 102396 kB
Dirty: 352 kB
Writeback: 0 kB
AnonPages: 211704 kB
Mapped: 215924 kB
Shmem: 42304 kB
Slab: 24528 kB
SReclaimable: 12108 kB
SUnreclaim: 12420 kB
KernelStack: 2128 kB
PageTables: 5676 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 577120 kB
Committed_AS: 1675164 kB
VmallocTotal: 1114112 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
CmaTotal: 8192 kB
CmaFree: 6796 kB

If I do this:

pi@raspberrypi:/tmp $ ./a.out 750
ko

750 is too large, but

pi@raspberrypi:/tmp $ ./a.out 600 &
[1] 1525
pi@raspberrypi:/tmp $ ok

The used memory (as shown by top etc.) doesn't reflect the 600 MB because I have not read or written them yet.

/proc/meminfo gives:
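The same effect can be measured from inside a process. The sketch below is Linux-specific and assumes /proc is mounted; the helper names `resident_kib` and `rss_growth_kib` are made up for illustration. It reads the resident set size from /proc/self/statm before mapping, after mapping, and after touching every page.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Resident set size of this process in KiB, or -1 on error. */
static long resident_kib(void)
{
    long size_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    int fields = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
    fclose(f);
    if (fields != 2)
        return -1;
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/*
 * Hypothetical helper: maps len anonymous bytes and stores how much the
 * resident set grew (in KiB) after mapping and after touching each page.
 * Returns 0 on success, -1 on error.
 */
int rss_growth_kib(size_t len, long *after_map, long *after_touch)
{
    assert(after_map != NULL && after_touch != NULL);

    long page = sysconf(_SC_PAGESIZE);
    long base = resident_kib();
    if (page <= 0 || base < 0)
        return -1;

    volatile char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    *after_map = resident_kib() - base;      /* mapping alone costs almost nothing */

    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = 1;                          /* first write allocates the page */
    *after_touch = resident_kib() - base;

    munmap((void *)p, len);
    return 0;
}
```

For a 16 MiB mapping, the growth after mapping is expected to stay close to zero, while the growth after touching is expected to be close to 16384 KiB; the kernel's counters are approximate, so exact figures vary.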

MemTotal:         949448 kB
MemFree: 282860 kB
MemAvailable: 626016 kB
Buffers: 39432 kB
Cached: 362860 kB
SwapCached: 0 kB
Active: 362696 kB
Inactive: 260580 kB
Active(anon): 199880 kB
Inactive(anon): 41392 kB
Active(file): 162816 kB
Inactive(file): 219188 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 102396 kB
SwapFree: 102396 kB
Dirty: 624 kB
Writeback: 0 kB
AnonPages: 220988 kB
Mapped: 215672 kB
Shmem: 41788 kB
Slab: 24788 kB
SReclaimable: 12296 kB
SUnreclaim: 12492 kB
KernelStack: 2136 kB
PageTables: 5692 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 577120 kB
Committed_AS: 2288564 kB
VmallocTotal: 1114112 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
CmaTotal: 8192 kB
CmaFree: 6796 kB

And I can again do

pi@raspberrypi:/tmp $ ./a.out 600 &
[2] 7088
pi@raspberrypi:/tmp $ ok

pi@raspberrypi:/tmp $ jobs
[1]- stopped ./a.out 600
[2]+ stopped ./a.out 600
pi@raspberrypi:/tmp $

This works even though the total is too large for the memory + swap. /proc/meminfo gives:

MemTotal:         949448 kB
MemFree: 282532 kB
MemAvailable: 626112 kB
Buffers: 39432 kB
Cached: 359980 kB
SwapCached: 0 kB
Active: 365200 kB
Inactive: 257736 kB
Active(anon): 202280 kB
Inactive(anon): 38320 kB
Active(file): 162920 kB
Inactive(file): 219416 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 102396 kB
SwapFree: 102396 kB
Dirty: 52 kB
Writeback: 0 kB
AnonPages: 223520 kB
Mapped: 212600 kB
Shmem: 38716 kB
Slab: 24956 kB
SReclaimable: 12476 kB
SUnreclaim: 12480 kB
KernelStack: 2120 kB
PageTables: 5736 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 577120 kB
Committed_AS: 2876612 kB
VmallocTotal: 1114112 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
CmaTotal: 8192 kB
CmaFree: 6796 kB

If I write to the memory of %1 and then stop it, a lot of swapping to the flash takes place:

pi@raspberrypi:/tmp $ %1
./a.out 600

done
^Z
[1]+ stopped ./a.out 600

Now there is almost no free swap and almost no free memory. /proc/meminfo gives:

MemTotal:         949448 kB
MemFree: 33884 kB
MemAvailable: 32544 kB
Buffers: 796 kB
Cached: 66032 kB
SwapCached: 66608 kB
Active: 483668 kB
Inactive: 390360 kB
Active(anon): 462456 kB
Inactive(anon): 374188 kB
Active(file): 21212 kB
Inactive(file): 16172 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 102396 kB
SwapFree: 3080 kB
Dirty: 96 kB
Writeback: 0 kB
AnonPages: 740984 kB
Mapped: 61176 kB
Shmem: 29288 kB
Slab: 21932 kB
SReclaimable: 9084 kB
SUnreclaim: 12848 kB
KernelStack: 2064 kB
PageTables: 7012 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 577120 kB
Committed_AS: 2873112 kB
VmallocTotal: 1114112 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
CmaTotal: 8192 kB
CmaFree: 6796 kB

%1 is still waiting on the getchar. If I do the same for %2 it works, but only because process %1 disappears (without any message in the shell).

The behavior is the same if I use malloc (by giving a second argument to the program).


See also What is the purpose of MAP_ANONYMOUS flag in mmap system call?

mmap system call usage

Based on the clarification you have provided in your comments, it sounds like you are trying to mix two things that have nothing to do with each other.

void *part1 = malloc(100);
void *part2 = malloc(250);

You want to manipulate virtual memory so that these two blocks of memory are addressable as 350 contiguous bytes of memory.

This is not possible. First of all, the blocks of memory you have will in general be neither page-aligned nor page-sized. You can only manipulate virtual memory in page-aligned, page-sized chunks. Secondly, even if you are very lucky and they are page-aligned and page-sized, they probably come from the heap area (the area below brk()). I don't think you can remap or unmap that area of memory using mremap() or munmap(). (There are alternate implementations of malloc() that get memory from mmap() and wouldn't be subject to this problem, but they are still subject to the first problem.)

But let's say you do have two blocks of memory that are page-aligned, page-sized, and remappable, and you want to remap them so that they are adjacent. Most likely, you obtained those blocks from mmap() in the first place. Then you could remap them to adjacent addresses using mremap(). Be aware that mremap() is Linux-specific though. I'm not aware of a portable way to do this. In pseudocode:

/* Map some useless memory just to get the kernel to reserve a range
of addresses for us which will be big enough for both blocks */
address = mmap(NULL, blocksize1+blocksize2, ..., MAP_ANONYMOUS, ...);

/* remap the first block to the first address in this new range */
mremap(block1, blocksize1, blocksize1, MREMAP_MAYMOVE|MREMAP_FIXED, address);
/* remap the second block to go right after the first block */
mremap(block2, blocksize2, blocksize2, MREMAP_MAYMOVE|MREMAP_FIXED,
address+blocksize1);
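A runnable version of this pseudocode might look like the sketch below. It is Linux-specific (mremap() needs _GNU_SOURCE on glibc), uses two one-page blocks for simplicity, and the function name `join_two_pages` is hypothetical. Error paths leak the mappings for brevity.

```c
#define _GNU_SOURCE       /* exposes mremap() and MREMAP_* on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if two separately mapped pages end up adjacent, -1 on error. */
int join_two_pages(void)
{
    long pg = sysconf(_SC_PAGESIZE);
    assert(pg > 0);
    size_t page = (size_t)pg;

    /* Two independent page-sized blocks with recognisable contents. */
    char *block1 = mmap(NULL, page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    char *block2 = mmap(NULL, page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    /* Reserve an address range big enough for both blocks. */
    char *range = mmap(NULL, 2 * page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (block1 == MAP_FAILED || block2 == MAP_FAILED || range == MAP_FAILED)
        return -1;

    memset(block1, 'A', page);
    memset(block2, 'B', page);

    /* Move the first block to the start of the reserved range. */
    if (mremap(block1, page, page, MREMAP_MAYMOVE | MREMAP_FIXED,
               range) == MAP_FAILED)
        return -1;
    /* Move the second block right after the first one. */
    if (mremap(block2, page, page, MREMAP_MAYMOVE | MREMAP_FIXED,
               range + page) == MAP_FAILED)
        return -1;

    /* The contents moved with the pages and are now contiguous. */
    int ok = range[0] == 'A' && range[page - 1] == 'A' &&
             range[page] == 'B' && range[2 * page - 1] == 'B';

    munmap(range, 2 * page);
    return ok ? 0 : -1;
}
```

MREMAP_FIXED replaces whatever was mapped at the target address, which is why the PROT_NONE reservation can simply be overwritten.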

