Understanding Linux /proc/pid/maps or /proc/self/maps
Each row in /proc/$PID/maps describes a region of contiguous virtual memory in a process or thread. Each row has the following fields:
address perms offset dev inode pathname
08048000-08056000 r-xp 00000000 03:0c 64593 /usr/sbin/gpm
- address - This is the starting and ending address of the region in the process's address space
- permissions - This describes how pages in the region can be accessed. There are four different permissions: read, write, execute, and shared. If read/write/execute are disabled, a - will appear instead of the r/w/x. If a region is not shared, it is private, so a p will appear instead of an s. If the process attempts to access memory in a way that is not permitted, a segmentation fault is generated. Permissions can be changed using the mprotect system call.
- offset - If the region was mapped from a file (using mmap), this is the offset in the file where the mapping begins. If the memory was not mapped from a file, it's just 0.
- device - If the region was mapped from a file, this is the major and minor device number (in hex) where the file lives.
- inode - If the region was mapped from a file, this is the file number.
- pathname - If the region was mapped from a file, this is the name of the file. This field is blank for anonymous mapped regions. There are also special regions with names like [heap], [stack], or [vdso]. [vdso] stands for virtual dynamic shared object. It's used by system calls to switch to kernel mode. Here's a good article about it: "What is linux-gate.so.1?"
You might notice a lot of anonymous regions. These are usually created by mmap but are not attached to any file. They are used for a lot of miscellaneous things like shared memory or buffers not allocated on the heap. For instance, I think the pthread library uses anonymous mapped regions as stacks for new threads.
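The field layout described above can be read mechanically. Below is a minimal Python sketch, not a canonical parser; the names `MapsEntry` and `parse_maps_line` are made up for illustration. It splits one row into its six fields and decodes the hex values, using the sample row shown earlier.

```python
from typing import NamedTuple


class MapsEntry(NamedTuple):
    start: int      # first address of the region
    end: int        # first address past the region
    perms: str      # e.g. "r-xp"
    offset: int     # offset into the backing file
    dev: str        # "major:minor" in hex, kept as text
    inode: int
    pathname: str   # empty for anonymous regions


def parse_maps_line(line: str) -> MapsEntry:
    # The first five fields are whitespace-separated; the pathname is
    # whatever remains (it may be missing, or may contain spaces).
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].strip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return MapsEntry(start, end, perms, int(offset, 16), dev, int(inode), pathname)


entry = parse_maps_line("08048000-08056000 r-xp 00000000 03:0c 64593 /usr/sbin/gpm")
print(hex(entry.start), hex(entry.end), entry.perms, entry.pathname)
print("size in bytes:", entry.end - entry.start)
```

Addresses and the offset are hexadecimal, while the inode is decimal, which is why the sketch converts them differently.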
Reading /proc/PID/maps of short-lived process
Your solution cat /proc/$(<pipeline>)/maps won't call cat until the whole <pipeline> has ended (even if you use & within it), so you will never get your maps.
On the other hand, <pipeline> & cat /proc/${!}/maps will return immediately. Of course, as your binary exits immediately, cat may still run too late to capture /proc/${!}/maps.
But you could try:
while true; do ./binary [input] & cat /proc/${!}/maps; done
This restarts a race between your binary and cat all day long, and sometimes cat may win (it does, in my case, with ls <nonexisting file> instead of ./binary, about 1 out of 30 times).
This prints a lot of garbage on your terminal, but you can collect the successful maps by redirecting cat's output:
while true; do ./binary [input] & cat /proc/${!}/maps >> mymaps; done
Elegant? No. Effective: I hope so!
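The same race can be sketched in Python, where the child's PID is available directly from `subprocess.Popen`. This is only an illustration under assumptions: `read_maps` and `capture_maps` are hypothetical helper names, and the command you pass in stands in for ./binary above. It is still a race, so it can lose on every attempt.

```python
import subprocess


def read_maps(pid):
    """Return the text of /proc/<pid>/maps, or None if it cannot be read."""
    try:
        with open(f"/proc/{pid}/maps") as f:
            text = f.read()
    except OSError:
        # The process is already gone, or we are not allowed to look.
        return None
    # A process that has exited but not yet been waited for may yield
    # an empty file; treat that the same as "too late".
    return text or None


def capture_maps(argv, attempts=100):
    """Start argv repeatedly and return the first non-empty maps text."""
    for _ in range(attempts):
        child = subprocess.Popen(argv, stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL)
        text = read_maps(child.pid)
        child.wait()  # reap the child before the next attempt
        if text:
            return text
    return None
```

A call such as `capture_maps(["./binary", "input"])` returns either one captured listing or `None` if every attempt lost the race.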
What is a proc map?
The file /proc/[pid]/maps is a way to see the memory regions mapped by a process. Read about /proc for more info on other useful stuff you can find there.
How does the Linux kernel create the /proc/$pid/maps file?
You did something weird with the link.
Clicking through a few definitions reveals that the file is generated on demand here:
https://github.com/torvalds/linux/blob/bcf876870b95592b52519ed4aafcf9d95999bc9c/fs/proc/task_mmu.c#L271
(at least for the common MMU case).
The usual question: why are you asking?
Why can I see several identical segments in the /proc/pid/maps output?
Please mind the values in columns 3 (starting offset) and 2 (permissions). You really do have the same part mapped twice, in lines 1 and 2 for your binary file, but line 3 is different. It's permitted to map the same file separately multiple times; some systems may skip merging these into one VM map entry, so the listing can reflect mapping history and not just the current state.
If you look at the library mappings, you can easily see the pattern that each library is mapped in separate pieces:
- With permission to read and execute: the main code which shouldn't be changed.
- With permission to read: constant data area without code allowed.
- With permission to read and write: it combines non-constant data area and relocation tables of shared objects.
Having the same starting 4K area of the binary file mapped twice could be explained by RTLD logic, which differs from the logic for an arbitrary library due to bootstrapping needs. I don't consider it important, especially since it could easily differ between platforms.
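To compare columns 2 and 3 across rows without reading the listing by eye, the rows can be grouped by pathname. A small Python sketch follows; the function name `mappings_by_file` and the sample rows (a made-up `libexample.so`) are assumptions for illustration, not output from the question.

```python
from collections import defaultdict


def mappings_by_file(maps_text):
    """Group file-backed rows by pathname as (perms, offset) pairs."""
    groups = defaultdict(list)
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous region: no pathname column
        perms, offset, pathname = parts[1], int(parts[2], 16), parts[5].strip()
        if pathname.startswith("["):
            continue  # special regions such as [heap] or [stack]
        groups[pathname].append((perms, offset))
    return dict(groups)


# Hypothetical listing: one library mapped three times, plus an anonymous region.
SAMPLE = """\
7f0000000000-7f0000010000 r-xp 00000000 08:01 1234 /lib/libexample.so
7f0000010000-7f0000011000 r--p 00010000 08:01 1234 /lib/libexample.so
7f0000011000-7f0000012000 rw-p 00011000 08:01 1234 /lib/libexample.so
7f0000012000-7f0000014000 rw-p 00000000 00:00 0
"""

for path, rows in mappings_by_file(SAMPLE).items():
    for perms, offset in rows:
        print(f"{path}  {perms}  offset={offset:#x}")
```

Rows that share both permissions and offset for the same file are the "same part mapped twice" case; rows that differ in either column are distinct pieces of the file.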
Scattered maps found in /proc/PID/maps
Do you know what can cause this effect or in what conditions this can happen?
An executable can trivially mmap (parts of) itself. This could be done to e.g. examine its own symbol table (necessary to print a crash stack trace), or to extract some embedded resource.
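As a rough illustration of a process mapping its own executable, the Python sketch below maps the first page of the interpreter binary and counts how many rows of /proc/self/maps name that file before and after. The helper names `count_rows_for` and `demo_self_mmap` are hypothetical, and the sketch assumes Linux with /proc mounted and that the binary's resolved path appears unchanged in the pathname column.

```python
import mmap
import os
import sys


def count_rows_for(path, maps_text):
    """Count rows of a maps listing whose pathname column equals path."""
    count = 0
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) == 6 and parts[5].strip() == path:
            count += 1
    return count


def demo_self_mmap():
    """Map part of the running interpreter binary; return (before, after) row counts."""
    exe = os.path.realpath(sys.executable)
    with open("/proc/self/maps") as f:
        before = count_rows_for(exe, f.read())
    with open(exe, "rb") as binary:
        # Map the first page read-only. While the mapping is alive it
        # would normally show up as an extra row for the same file.
        with mmap.mmap(binary.fileno(), mmap.PAGESIZE, access=mmap.ACCESS_READ):
            with open("/proc/self/maps") as f:
                after = count_rows_for(exe, f.read())
    return before, after
```

Calling `demo_self_mmap()` on a typical Linux system should report a higher count for the second value, though that depends on the assumptions above.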
The maps for the executable (python3.9) appear first, and the maps for a shared library that is opened appear after the ones for the executable.
This is only true by accident, and only for non-PIE executables.
Non-PIE executables on x86_64 are traditionally linked to load at address 0x400000, and the shared libraries are normally loaded starting from below the main stack.
If you link a non-PIE executable to load at e.g. 0x7ff000000000, then it will likely appear in the /proc/$pid/maps after shared libraries.
Update:
the python binary here is certainly not mmapping itself, so that explanation doesn't apply
- You can't know that -- you almost certainly haven't read all the code in Python 3.9 and every module which you load.
- There is no need to guess where these mmaped regions are coming from, you can just look.
To look, run your program under GDB and use catch syscall mmap followed by where. This will allow you to see where each and every mapping came from.