How to Measure Net Used Disk Space Change Due to Activity by a Given Process in Linux

How do I measure net used disk space change due to activity by a given process in Linux?

You will probably have to ptrace it (or have strace do that for you and parse its output), and then work out from the traced calls how much disc space is being used.

This is nontrivial, as your tracing process will need to understand which file operations use disc space - and be free of race conditions. However, you might be able to do an approximation.

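Working out how much disc space an operation really uses is complicated by the fact that most Linux filesystems support "holes" (sparse files), so a file's apparent size can be larger than the space it occupies. I suppose you could count holes as well for accounting purposes.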
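To make the point about holes concrete, here is a minimal Python sketch that compares a file's apparent size with the space actually allocated to it. The helper names `disk_usage` and `make_sparse_file` are made up for illustration, and the sketch assumes a Linux system where `st_blocks` is reported in 512-byte units.

```python
import os
import tempfile


def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for a file."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as it does on Linux.
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file whose apparent size is `size` but which is one big hole."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "sparse.bin")
        make_sparse_file(p, 10 * 1024 * 1024)
        apparent, allocated = disk_usage(p)
        # On a filesystem that supports holes, allocated is usually far
        # smaller than apparent.
        print(f"apparent={apparent} allocated={allocated}")
```

A tracer that only adds up the byte counts passed to write() would see the apparent size, not the allocated one, so the two numbers are worth tracking separately.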

Another problem is knowing what filesystem operations free up disc space - for example, opening a file for writing may, in some cases, truncate it. This clearly frees up space. Likewise, renaming a file can free up space if it's renamed over an existing file.

Another issue is processes which invoke helper processes to do stuff - for example if myprog does a system("rm -rf somedir").

Also it's somewhat difficult to know when a file has been completely deleted, as it might be deleted from the filesystem but still open by another process.
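One way to spot the "deleted but still open" case is to look at the symlinks under /proc/<pid>/fd, where the kernel appends " (deleted)" to the target of an unlinked file. The following is only a sketch: `deleted_open_files` is a hypothetical helper, it needs permission to read the target process's fd directory, and a file whose real name ends in " (deleted)" would be a false positive.

```python
import os


def deleted_open_files(pid="self"):
    """List /proc/<pid>/fd link targets that the kernel marks as deleted."""
    fd_dir = f"/proc/{pid}/fd"
    found = []
    try:
        names = os.listdir(fd_dir)
    except OSError:
        return found  # no /proc, no such process, or no permission
    for name in names:
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.endswith(" (deleted)"):
            found.append(target)
    return found


if __name__ == "__main__":
    for target in deleted_open_files():
        print(target)
```

The space used by such a file is only released when the last descriptor is closed, so an accounting tool would have to treat the unlink and the final close as two separate events.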

Happy hacking :)

Measuring peak disk use of a process

You may like to have a look at filetop from BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more:

tools/filetop: File reads and writes by filename and process. Top for files.

This script works by tracing the vfs_read() and vfs_write() functions using kernel dynamic tracing, which instruments explicit read and write calls. If files are read or written using another means (eg, via mmap()), then they will not be visible using this tool.

Brendan Gregg gives good talks and demos about Linux performance tools; they are quite instructive.

Net Usage (%) of a Process in Linux

When using trace mode (nethogs -t), the first field of the output is the program, and it can consist of a variable number of arguments.
In the case of brave:

/usr/lib/brave-bin/brave --type=utility --utility-sub-type=network.mojom.NetworkService --field-trial-handle=18208005703828410459,4915436466583499460,131072 --enable-features=AutoupgradeMixedContent,DnsOverHttps,LegacyTLSEnforced,PasswordImport,PrefetchPrivacyChanges,ReducedReferrerGranularity,SafetyTip,WebUIDarkMode --disable-features=AutofillEnableAccountWalletStorage,AutofillServerCommunication,DirectSockets,EnableProfilePickerOnStartup,IdleDetection,LangClientHintHeader,NetworkTimeServiceQuerying,NotificationTriggers,SafeBrowsingEnhancedProtection,SafeBrowsingEnhancedProtectionMessageInInterstitials,SharingQRCodeGenerator,SignedExchangePrefetchCacheForNavigations,SignedExchangeSubresourcePrefetch,SubresourceWebBundles,TabHoverCards,TextFragmentAnchor,WebOTP --lang=en-US --service-sandbox-type=none --shared-files=v8_context_snapshot_data:100/930/1000   0.0554687   0.0554687

so $3 will no longer be what you expect. You need to get the last column of the output using $(NF), as follows:

... | awk /$ProcessName/'{print $(NF)}'

For the second-to-last column:

... | awk /$ProcessName/'{print $(NF - 1)}'
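The same "count from the right" idea can be sketched in Python for anyone post-processing the output in a script. `parse_nethogs_line` is a hypothetical helper, and the assumption that the second-to-last column is the sent rate and the last one the received rate should be checked against your nethogs version.

```python
def parse_nethogs_line(line):
    """Split one `nethogs -t` line into (program, sent, received).

    Returns None for lines that do not end in two numeric columns.
    """
    # Split from the right so that spaces inside the program's arguments
    # cannot shift the two numeric columns.
    parts = line.strip().rsplit(None, 2)
    if len(parts) != 3:
        return None
    program, sent, received = parts
    try:
        # Assumption: column order is sent, then received.
        return program, float(sent), float(received)
    except ValueError:
        return None


if __name__ == "__main__":
    sample = "/usr/lib/brave-bin/brave --type=utility --lang=en-US/930/1000   0.0554687   0.0554687"
    print(parse_nethogs_line(sample))
```

Lines that are not per-process records, such as status lines printed between refreshes, come back as None and can simply be skipped.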

Programmatic resource monitoring per process in Linux

/usr/src/linux/Documentation/accounting/taskstats.txt

Taskstats is a netlink-based interface for sending per-task and
per-process statistics from the kernel to userspace.

Taskstats was designed for the following benefits:

  • efficiently provide statistics during lifetime of a task and on its exit
  • unified interface for multiple accounting subsystems
  • extensibility for use by future accounting patches

This interface lets you monitor CPU, memory, and I/O usage by processes of your choosing. You only need to set up and receive messages on a single socket.

This does not differentiate (for example) disk I/O from network I/O. If that's important to you, you might go with an LD_PRELOAD interception library that tracks socket operations, assuming of course that you can control the startup of the programs you wish to observe and that they won't do trickery behind your back.

I can't think of any light-weight solutions if those still fail, but linux-audit can globally trace syscalls, which seems a fair bit more direct than re-capturing and analyzing your own network traffic.

How to find out which processes are using swap space in Linux?

Run top, then press O, p, Enter. Now processes should be sorted by their swap usage.

Here is an update as my original answer does not provide an exact answer to the problem as pointed out in the comments. From the htop FAQ:

It is not possible to get the exact size of used swap space of a
process. Top fakes this information by making SWAP = VIRT - RES, but
that is not a good metric, because other stuff such as video memory
counts on VIRT as well (for example: top says my X process is using
81M of swap, but it also reports my system as a whole is using only 2M
of swap. Therefore, I will not add a similar Swap column to htop
because I don't know a reliable way to get this information (actually,
I don't think it's possible to get an exact number, because of shared
pages).
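As a complement to the VIRT - RES estimate criticised above, kernels that expose a VmSwap line in /proc/<pid>/status give a per-process figure directly. The sketch below only parses that line; `vmswap_kb` and `read_vmswap_kb` are hypothetical helper names, and the caveat about shared pages from the quote still applies, so treat the number as an approximation.

```python
def vmswap_kb(status_text):
    """Return the VmSwap value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:      128 kB".
            return int(line.split()[1])
    return None  # kernel threads and older kernels have no VmSwap line


def read_vmswap_kb(pid="self"):
    """Read VmSwap for a process, or None if it cannot be determined."""
    try:
        with open(f"/proc/{pid}/status") as f:
            return vmswap_kb(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_vmswap_kb())
```

Looping over the numeric directories in /proc and sorting by this value gives a rough "who is using swap" list without relying on top's derived column.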

How to get process or port Network bandwidth usage in linux

As far as I know Linux doesn't offer an alternative interface to pcap for calculating network usage. /proc/<PID>/stat(us) contains various process information but nothing about network access, only the total I/O usage including disk access.

Similarly, to know the port you have to read at least the IP header. Hence I assume it is not possible to speed this up significantly, because analyzing all packets in user space will always be slow. A kernel module for this task seems to be the only option.

How can I measure the actual memory usage of an application or process?

With ps or similar tools you will only get the number of memory pages allocated by that process. This number is correct, but:

  • does not reflect the actual amount of memory used by the application, only the amount of memory reserved for it

  • can be misleading if pages are shared, for example by several threads or by using dynamically linked libraries
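The effect of shared pages can be seen by comparing Rss with Pss, where Pss charges each shared page to a process only in proportion to the number of processes sharing it. This is a minimal sketch that parses those two fields from text in the format of /proc/<pid>/smaps_rollup (available on reasonably recent kernels); `parse_kb_fields` is a hypothetical helper, not part of any library.

```python
def parse_kb_fields(text, wanted=("Rss", "Pss")):
    """Extract selected "Name:   1234 kB" fields from smaps-style text."""
    result = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only exact matches, so "Pss_Anon" is not mistaken for "Pss".
        if sep and key in wanted and fields:
            result[key] = int(fields[0])
    return result


if __name__ == "__main__":
    try:
        with open("/proc/self/smaps_rollup") as f:
            print(parse_kb_fields(f.read()))
    except OSError:
        print("smaps_rollup is not available on this system")
```

A large gap between the two values suggests that much of what ps reports is shared with other processes, for example through dynamically linked libraries.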

If you really want to know what amount of memory your application actually uses, you need to run it within a profiler. For example, Valgrind can give you insights about the amount of memory used, and, more importantly, about possible memory leaks in your program. The heap profiler tool of Valgrind is called 'massif':

Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program's heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.

As explained in the Valgrind documentation, you need to run the program through Valgrind:

valgrind --tool=massif <executable> <arguments>

Massif writes a dump of memory usage snapshots (e.g. massif.out.12345). These provide (1) a timeline of memory usage and (2) for each snapshot, a record of where in your program memory was allocated. A great graphical tool for analyzing these files is massif-visualizer, but I found ms_print, a simple text-based tool shipped with Valgrind, to be of great help already.

To find memory leaks, use the (default) memcheck tool of valgrind.

How can I get the memory of one process in Linux

The short answer is that on a modern operating system this is very difficult.

Memory that is free()ed is not actually returned to the OS
until the process terminates, so many cycles of allocating
and freeing progressively bigger chunks of memory will cause
the process to grow.

This question has already been answered in more detail on another SO thread. You might find your answer there.
