How to Record What Process or Kernel Activity Is Using the Disk in GNU/Linux

How to make Linux GUI usable when lots of disk activity is happening

Try running the copy process under ionice(1) or nice(1). The problem is that the copy's I/O gets the same priority as the GUI's, which on a desktop hurts perceived responsiveness.

There's an Ubuntu brainstorm about this currently.
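
As a rough sketch of the ionice/nice suggestion above, a small wrapper can start any command in the idle I/O class and at the lowest CPU priority. The function name low_priority and the paths in the example are made up for illustration, and whether the I/O class has any effect depends on the I/O scheduler in use.

```shell
#!/bin/sh
# Run a command at the lowest CPU priority and, where ionice(1) is
# installed, in the "idle" I/O scheduling class (-c 3).
# low_priority is a hypothetical helper name, not a standard tool.
low_priority() {
    if command -v ionice >/dev/null 2>&1; then
        ionice -c 3 nice -n 19 "$@"
    else
        # No ionice available: fall back to CPU niceness only.
        nice -n 19 "$@"
    fi
}

# Example (hypothetical paths):
#   low_priority cp -a /data/big-directory /mnt/backup/
```

For a copy that is already running, ionice -c 3 -p followed by its PID changes the class of the existing process instead.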

How to measure IOPS for a command in Linux?

There are multiple time(1) commands on a typical Linux system; the default in bash(1) is the shell's own time keyword, which is somewhat basic. There is also /usr/bin/time, which you can run either by calling it exactly like that, or by prefixing the name with a backslash so that bash(1) skips aliases and shell keywords: \time. Debian has it in the "time" package which is installed by default, Ubuntu is likely identical, and other distributions will be quite similar.

Even when invoked the same way as the shell's own time, it is already more verbose and informative, albeit perhaps more opaque unless you're already familiar with what the numbers really mean:

$ \time df
[output elided]
0.00user 0.00system 0:00.01elapsed 66%CPU (0avgtext+0avgdata 864maxresident)k
0inputs+0outputs (0major+261minor)pagefaults 0swaps

However, I'd like to draw your attention to the man page, which lists the -f option to customise the output format, and in particular the %w format. It counts voluntary context switches, that is, the number of times the process gave up the CPU of its own accord, typically while waiting for I/O to complete:

$ \time -f 'ios=%w' du Maildir >/dev/null
ios=184
$ \time -f 'ios=%w' du Maildir >/dev/null
ios=1

Note that the first run stopped for I/O 184 times, but the second run stopped just once. The first figure is credible, as there are 124 directories in my ~/Maildir: the reading of the directory and the inode gives roughly two IOPS per directory, less a bit because some inodes were likely next to each other and read in one operation, plus some extra again for mapping in the du(1) binary, shared libraries, and so on.

The second figure is of course lower due to Linux's disk cache. So the final piece is to flush the cache. sync(1) is a familiar command which flushes dirty writes to disk, but doesn't flush the read cache. You can flush that one by writing 3 to /proc/sys/vm/drop_caches. (Other values are also occasionally useful, but you want 3 here.) As a non-root user, the simplest way to do this is:

echo 3 | sudo tee /proc/sys/vm/drop_caches

Combining that with /usr/bin/time should allow you to build the scripts you need to benchmark the commands you're interested in.

As a minor aside, tee(1) is used because this won't work:

sudo echo 3 >/proc/sys/vm/drop_caches

The reason? Although the echo(1) runs as root, the redirection is set up by your own shell, as your normal user account, before sudo even starts, and that account doesn't have write permissions to drop_caches. tee(1) effectively does the redirection as root.
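
To make the "combine the two" step concrete, here is a minimal sketch of such a benchmark script. The function names count_waits and drop_caches are my own, and it assumes GNU time is installed at /usr/bin/time and that you can use sudo. The -o option sends the report to a file so that it does not get mixed up with the command's own output.

```shell
#!/bin/sh
# Print the %w figure (voluntary context switches) for one command.
# The command's own stdout and stderr are discarded.
count_waits() {
    cw_out=$(mktemp) || return 1
    /usr/bin/time -f '%w' -o "$cw_out" "$@" >/dev/null 2>&1
    # If the command fails, time writes an extra status line first,
    # so only the last line of the report is the count.
    tail -n 1 "$cw_out"
    rm -f "$cw_out"
}

# Write out dirty pages, then drop the page cache, dentries and inodes.
drop_caches() {
    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
}

# Example: compare a cold-cache run with a warm-cache run.
#   drop_caches
#   echo "cold: $(count_waits du Maildir)"
#   echo "warm: $(count_waits du Maildir)"
```

Expect the cold figure to vary a little from run to run, since other activity on the machine refills the cache between the flush and the measurement.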

How to find out which processes are using swap space in Linux?

Run top, then press O, p, Enter. Now processes should be sorted by their swap usage.

Update: as pointed out in the comments, my original answer does not give an exact answer to the problem. From the htop FAQ:

It is not possible to get the exact size of used swap space of a
process. Top fakes this information by making SWAP = VIRT - RES, but
that is not a good metric, because other stuff such as video memory
counts on VIRT as well (for example: top says my X process is using
81M of swap, but it also reports my system as a whole is using only 2M
of swap. Therefore, I will not add a similar Swap column to htop
because I don't know a reliable way to get this information (actually,
I don't think it's possible to get an exact number, because of shared
pages).
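
As a different approach from the top keystrokes above, reasonably recent kernels report a VmSwap line in /proc/<pid>/status, and a short loop can rank processes by it. This is only a sketch: swap_by_process is a name I made up, and given the shared-pages caveat in the FAQ the figures are best read as an approximation rather than an exact accounting.

```shell
#!/bin/sh
# List processes that have a non-zero VmSwap value, largest first.
# Output columns (tab separated): swap in kB, PID, process name.
swap_by_process() {
    for status in /proc/[0-9]*/status; do
        # Processes may exit while we loop, so read errors are ignored.
        awk '
            /^Name:/   { name = $2 }
            /^Pid:/    { pid = $2 }
            /^VmSwap:/ { swap = $2 }
            END        { if (swap > 0) printf "%d kB\t%s\t%s\n", swap, pid, name }
        ' "$status" 2>/dev/null
    done | sort -rn
}

swap_by_process | head -n 20
```

Kernel threads have no VmSwap line at all, so they simply do not show up. Run it as root if you want to see other users' processes on systems that restrict access to /proc.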

How can I measure the actual memory usage of an application or process?

With ps or similar tools you will only get the amount of memory pages allocated by that process. This number is correct, but:

- it does not reflect the actual amount of memory used by the application, only the amount of memory reserved for it
- it can be misleading if pages are shared, for example by several threads or by using dynamically linked libraries
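
For reference, this is the kind of figure ps gives you. A minimal sketch, with mem_of being a made-up helper name: it prints the PID, the resident set size (RSS, in KiB), the virtual size (VSZ, in KiB) and the command name for one process. Both numbers are subject to the caveats above.

```shell
#!/bin/sh
# Print "PID RSS VSZ COMMAND" for a single process.
# The trailing "=" after each column name suppresses the header line.
mem_of() {
    ps -o pid=,rss=,vsz=,comm= -p "$1"
}

# Example: memory figures for the current shell.
#   mem_of "$$"
```

VSZ is usually much larger than RSS because it counts address space that has been mapped but not necessarily touched, while RSS counts shared library pages in full for every process that maps them.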

If you really want to know what amount of memory your application actually uses, you need to run it within a profiler. For example, Valgrind can give you insights about the amount of memory used, and, more importantly, about possible memory leaks in your program. The heap profiler tool of Valgrind is called 'massif':

Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program's heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.

As explained in the Valgrind documentation, you need to run the program through Valgrind:

valgrind --tool=massif <executable> <arguments>

Massif writes a dump of memory usage snapshots (e.g. massif.out.12345). These provide (1) a timeline of memory usage and (2) for each snapshot, a record of where in your program memory was allocated. A great graphical tool for analyzing these files is massif-visualizer, but I found ms_print, a simple text-based tool shipped with Valgrind, to be of great help already.

To find memory leaks, use the (default) memcheck tool of valgrind.
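
Putting the Massif steps together, something along these lines should work. It is a sketch only: massif_report is a hypothetical helper name, and it assumes valgrind and ms_print are on your PATH. The --massif-out-file option pins the output file name so the script does not have to guess the PID suffix.

```shell
#!/bin/sh
# Profile one command with Massif and print the ms_print text report.
# Returns the exit status of the valgrind run.
massif_report() {
    mr_out=$(mktemp) || return 1
    valgrind --tool=massif --massif-out-file="$mr_out" "$@"
    mr_status=$?
    # Only call ms_print if Massif actually wrote some snapshots.
    if [ -s "$mr_out" ]; then
        ms_print "$mr_out"
    fi
    rm -f "$mr_out"
    return "$mr_status"
}

# Example (hypothetical program and argument):
#   massif_report ./myprog --some-arg > massif-report.txt
```

The program's own standard output ends up in the same stream as the report, so redirect it inside the program's arguments or filter it out if that matters.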
