What do the counters in /proc/[pid]/io mean?

While the proc manpage is woefully behind (as are most manpages and documentation for anything beyond cookie-cutter user-space development), these counters are fortunately documented in full in the Linux kernel source, under Documentation/filesystems/proc.rst. Here are the relevant bits:

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache)

wchar
-----

I/O counter: chars written
The number of bytes which this task has caused, or shall cause to be written
to disk. Similar caveats apply here as with rchar.

read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer. Done at the submit_bio() level, so it is
accurate for block-backed filesystems. <please add status regarding NFS and
CIFS at a later time>

write_bytes
-----------

I/O counter: bytes written
Attempt to count the number of bytes which this process caused to be sent to
the storage layer. This is done at page-dirtying time.
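To compare these counters for a running process, the file can be parsed line by line. Below is a minimal sketch in C, assuming the usual "name: value" layout of /proc/[pid]/io. The struct and function names (proc_io, parse_proc_io, read_proc_io) are made up for this illustration, and the other fields in the file (syscr, syscw, cancelled_write_bytes) are ignored.

```c
#include <stdio.h>
#include <string.h>

/* The four counters quoted above. Names are illustrative only. */
struct proc_io {
    unsigned long long rchar, wchar, read_bytes, write_bytes;
};

/* Parse "name: value" lines from the text of an io file.
 * Returns how many of the four counters were found. */
static int parse_proc_io(const char *text, struct proc_io *out)
{
    int found = 0;

    memset(out, 0, sizeof(*out));
    while (*text) {
        char name[32];
        unsigned long long value;

        if (sscanf(text, "%31[^:]: %llu", name, &value) == 2) {
            if (strcmp(name, "rchar") == 0) {
                out->rchar = value; found++;
            } else if (strcmp(name, "wchar") == 0) {
                out->wchar = value; found++;
            } else if (strcmp(name, "read_bytes") == 0) {
                out->read_bytes = value; found++;
            } else if (strcmp(name, "write_bytes") == 0) {
                out->write_bytes = value; found++;
            }
        }
        /* Move on to the start of the next line, if there is one. */
        const char *nl = strchr(text, '\n');
        if (!nl)
            break;
        text = nl + 1;
    }
    return found;
}

/* Read and parse /proc/<pid>/io; pid may also be "self".
 * Returns -1 if the file cannot be opened (no permission, no such pid,
 * or a kernel built without I/O accounting). */
static int read_proc_io(const char *pid, struct proc_io *out)
{
    char path[64], buf[512];
    FILE *f;
    size_t n;

    snprintf(path, sizeof(path), "/proc/%s/io", pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_proc_io(buf, out);
}
```

Calling read_proc_io("self", &io) before and after a piece of work and subtracting the fields shows how much of the rchar traffic actually reached the storage layer (read_bytes).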

Does rchar include read_bytes (/proc/pid/io)?

I can only think of two things:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/proc.txt;hb=HEAD#l1305

1:

read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer.

I read "caused to be fetched from the storage layer" as including readahead and the like, so read_bytes can exceed what the process actually asked for.

2:

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache)

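Note that this says nothing about disk access via memory-mapped files. I think this is the more likely reason: your MonetDB probably mmaps its database files and does all its work on the mappings. Pages faulted in through a mapping are fetched from the storage layer, so they count toward read_bytes, but they never pass through read() or pread(), so they are not counted in rchar.

I'm not sure how you could measure the bandwidth used through mmap, given how it works.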
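The difference between read() and mmap is easy to observe on the rchar side. The sketch below is an illustration only: it assumes a Linux system with /proc/self/io available and a writable /tmp, and the function names and the scratch file name are invented for the example. It consumes the same file once with read() and once through a mapping and reports how much rchar grew each time.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Current rchar of the calling process, or -1 if /proc/self/io is unusable. */
static long long self_rchar(void)
{
    char buf[512];
    unsigned long long value;
    int fd = open("/proc/self/io", O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    if (n <= 0)
        return -1;
    buf[n] = '\0';
    /* rchar is the first line of the file. */
    return sscanf(buf, "rchar: %llu", &value) == 1 ? (long long)value : -1;
}

/* Create an already-unlinked scratch file holding `pages` 4096-byte blocks. */
static int make_scratch_file(size_t pages)
{
    char name[] = "/tmp/rchar-demo.XXXXXX";   /* hypothetical name */
    char block[4096];
    int fd = mkstemp(name);
    if (fd < 0)
        return -1;
    unlink(name);
    memset(block, 'x', sizeof(block));
    for (size_t i = 0; i < pages; i++) {
        if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block)) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

/* How much rchar grows while the file is consumed with read() (use_mmap == 0)
 * or through a shared mapping (use_mmap != 0). Returns -1 on any failure. */
static long long rchar_growth(size_t pages, int use_mmap)
{
    size_t size = pages * 4096;
    volatile unsigned long sum = 0;   /* keeps the loads from being optimized out */
    int ok = 1;
    int fd = make_scratch_file(pages);
    if (fd < 0)
        return -1;
    long long before = self_rchar();
    if (use_mmap) {
        unsigned char *p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            ok = 0;
        } else {
            for (size_t i = 0; i < size; i++)
                sum += p[i];
            munmap(p, size);
        }
    } else {
        char block[4096];
        ssize_t n;
        if (lseek(fd, 0, SEEK_SET) < 0)
            ok = 0;
        while (ok && (n = read(fd, block, sizeof(block))) > 0)
            sum += (unsigned char)block[0];
    }
    close(fd);
    long long after = self_rchar();
    if (!ok || before < 0 || after < 0)
        return -1;
    return after - before;
}
```

With rchar_growth(256, 0) the growth should be at least the 1 MiB that was read. With rchar_growth(256, 1) it should stay far below that, and it is not exactly zero only because reading /proc/self/io is itself a read() that counts toward rchar.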

What is the reason for the discrepancy between /proc/[pid]/status:RssAnon and /proc/[pid]/smaps_rollup:Anonymous?

The theory you mention is correct, and the man pages have since been updated: https://www.spinics.net/lists/linux-mm/msg230450.html

Since 34e55232e59f7b19050267a05ff1226e5cd122a5 (introduced back in
v2.6.34), Linux uses per-thread RSS counters to reduce cache contention on
the per-mm counters. With a 4K page size, that means that you can end up
with the counters off by up to 252KiB* per thread.

(*this precise number is not strictly accurate, see thread for details)
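For scale, 252 KiB is 63 pages of 4 KiB. To see the gap on a live system, the two values can be pulled out of the respective files and subtracted. A rough sketch, assuming both files use the "Key: <n> kB" layout; field_kb and file_field_kb are names invented for this example:

```c
#include <stdio.h>
#include <string.h>

/* Find a line of the form "<key>: <n> kB" in text and return n,
 * or -1 if no line starts with exactly that key. */
static long field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* Require the colon right after the key so that "Rss" does not
         * match "RssAnon", and "Anon" does not match "Anonymous". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
        }
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Read a small /proc file and look up one field in it.
 * Returns -1 if the file cannot be opened or the field is missing. */
static long file_field_kb(const char *path, const char *key)
{
    static char buf[16384];
    FILE *f = fopen(path, "r");
    size_t n;

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return field_kb(buf, key);
}
```

For example, file_field_kb("/proc/self/status", "RssAnon") minus file_field_kb("/proc/self/smaps_rollup", "Anonymous") gives the discrepancy in kB for the calling process. The two files are read at slightly different moments, so a small difference can also come from the process allocating memory in between.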

Some UIDs in /proc/pid/loginuid are strange

4294967295 is just (unsigned int) -1, i.e. (uid_t) -1 with a 32-bit uid_t. A value of -1 means that loginuid was not set. This is normal for processes that were not spawned by any login process (e.g. daemons).
loginuid is -1 by default; the pam_loginuid module sets it to your user ID whenever you log in (on a tty, in a display manager, or via ssh), and child processes inherit that value.
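A program that reads this file can test for the "not set" value by comparing against (uid_t) -1 rather than hard-coding 4294967295. A minimal sketch, assuming a 32-bit uid_t as on Linux; both function names are invented for this example:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Interpret the text of a loginuid file.
 * Returns 1 if loginuid is unset, 0 if it holds a real uid,
 * and -1 if the text is not a number that fits in 32 bits. */
static int loginuid_is_unset(const char *text)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (end == text || errno != 0 || value > 0xFFFFFFFFUL)
        return -1;
    return (uid_t)value == (uid_t)-1;
}

/* Same check for the calling process; -1 if the file cannot be read. */
static int self_loginuid_is_unset(void)
{
    char buf[32];
    FILE *f = fopen("/proc/self/loginuid", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return loginuid_is_unset(buf);
}
```

For a daemon started at boot, self_loginuid_is_unset() would be expected to return 1. In a shell opened after an ssh or tty login it should return 0.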

What does "(deleted)" mean in /proc/$pid/maps?

Deleting (unlinking) temporary files is normal for libhugetlbfs when it uses the hugetlbfs pseudo-filesystem (check with grep hugetlbfs /proc/filesystems) to get mappings backed by huge pages.

For example, there is the hugetlbfs_unlinked_fd_for_size function in libhugetlbfs/hugeutils.c:
https://github.com/libhugetlbfs/libhugetlbfs/blob/e44180072b796c0e28e53c4d01ef6279caaa2a99/hugeutils.c#L1033

int hugetlbfs_unlinked_fd_for_size(long page_size)
{
    const char *path;
    char name[PATH_MAX+1];
    int fd;

    path = hugetlbfs_find_path_for_size(page_size);
    ..
    name[sizeof(name)-1] = '\0';

    strcpy(name, path);
    strncat(name, "/libhugetlbfs.tmp.XXXXXX", sizeof(name)-1);
    /* FIXME: deal with overflows */

    fd = mkstemp64(name);
    ....

    unlink(name);

    return fd;
}

The temporary file name is generated randomly by mkstemp, which also creates and opens the file. The file is then unlinked (man 2 unlink): its name is removed from the directory, so other programs can no longer open it by name, but the inode and the file data remain. That is why /proc/$pid/maps shows the mapping's path with a "(deleted)" suffix.

As long as the unlinked fd stays open, it can be used to create hugetlb mappings and to store data. The filesystem frees the file data only once the last fd and mapping referring to it are gone.
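The same pattern can be reproduced without hugetlbfs. The following sketch is not libhugetlbfs code: it assumes a writable /tmp and a Linux /proc, and the function names and the "unlink-demo" file name are made up for the illustration. It creates a temporary file, unlinks it right away, and then checks how a mapping of that file appears in /proc/self/maps.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file and unlink it at once.
 * Returns the still-usable fd, or -1 on error. */
static int open_unlinked_tmp(void)
{
    char name[] = "/tmp/unlink-demo.XXXXXX";   /* hypothetical name */
    int fd = mkstemp(name);

    if (fd >= 0)
        unlink(name);   /* the name is gone, the open file is not */
    return fd;
}

/* Map one page of an unlinked file and count the lines in
 * /proc/self/maps that mention it together with "(deleted)".
 * Returns -1 if any step fails. */
static int count_deleted_mappings(void)
{
    char line[4352];
    int count = -1;
    long page = sysconf(_SC_PAGESIZE);
    int fd = open_unlinked_tmp();
    void *p;
    FILE *maps;

    if (fd < 0 || page <= 0)
        return -1;
    if (ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);          /* the mapping alone keeps the file alive */
    if (p == MAP_FAILED)
        return -1;

    maps = fopen("/proc/self/maps", "r");
    if (maps) {
        count = 0;
        while (fgets(line, sizeof(line), maps)) {
            if (strstr(line, "unlink-demo") && strstr(line, "(deleted)"))
                count++;
        }
        fclose(maps);
    }
    munmap(p, (size_t)page);
    return count;
}
```

On the fd returned by open_unlinked_tmp(), fstat() reports a link count of 0 while write() and pread() keep working, which is the state the libhugetlbfs temporary file is in for its whole lifetime.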

Unlinking mkstemp files early is a common idiom; see "When a file created with mkstemp() is deleted?"

Some useful information can also be found in the HOWTO of the libhugetlbfs project:
https://github.com/libhugetlbfs/libhugetlbfs/blob/master/HOWTO


