How to Trace Per-File IO Operations in Linux

How to trace per-file IO operations in Linux?

First, you probably don't need to keep track yourself, because the mapping between fd and path is available in /proc/PID/fd/.
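As a rough illustration of that mapping, here is a minimal sketch that walks /proc/PID/fd and prints what each descriptor points to. The function name list_fds is made up for this example, and it assumes a Linux system with /proc mounted and permission to inspect the target process.

```shell
#!/bin/bash
# list_fds is a hypothetical helper name, not an existing tool.
# It prints "FD -> target" for every descriptor the given PID has open.
list_fds() {
    local pid=$1 entry
    for entry in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the file, pipe or socket behind the fd;
        # skip the unexpanded glob if the directory is empty or unreadable.
        [ -L "$entry" ] || continue
        printf '%s -> %s\n' "${entry##*/}" "$(readlink "$entry")"
    done
}
```

Calling list_fds $$ lists the descriptors of the current shell; for a process owned by another user you would normally need root.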

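Second, maybe you should use the LD_PRELOAD trick and override open, lseek and read in C. Strictly speaking, this intercepts the libc wrapper functions rather than the system calls themselves. There are some articles here and there about how to override malloc/free in the same way.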

I guess it won't be too different to apply the same kind of trick for those system calls. It needs to be implemented in C, but it should take far less code and be more precise than parsing strace output.

Linux - How to track all files accessed by a process?


lsof:

Try this as a starter:

lsof -p <PID>

This command lists all currently open files, fds and sockets for the process with the given process ID.

For your particular needs, here is one way to monitor a PHP script:

php foo.php & _pid=$!
lsof -r1 -p $_pid
kill %1 # if you want to kill php script

strace:

I recommend strace. Unlike lsof, which only samples the set of open files, it stays attached for as long as the process is running and prints each syscall as it is made. -e trace=file restricts the output to syscalls that take a file name as an argument (open, stat, unlink and so on), so reads and writes on an already open descriptor are not shown:

sudo strace -f -t -e trace=file php foo.php

or, for an already running process:

sudo strace -f -t -e trace=file -p <PID>
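To turn such a trace into a plain list of files, one option is to pull the first quoted argument out of every line. The sketch below assumes the trace was saved to a file (for example with strace's -o option) and that paths contain no escaped double quotes; unique_paths is a made-up helper name.

```shell
#!/bin/bash
# unique_paths is a hypothetical helper: it reads strace output on stdin and
# prints each distinct first quoted argument, which for trace=file syscalls
# is the path being accessed.
unique_paths() {
    # Keep only lines that contain a quoted string and reduce each one to the
    # text between the first pair of double quotes, then de-duplicate.
    sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' | sort -u
}
```

A possible use would be: sudo strace -f -t -e trace=file -o trace.log php foo.php, followed by unique_paths < trace.log. Failed lookups (ENOENT and so on) are included, since strace logs them too.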

Tracking a program's progress reading through a file?

You can do what you want with the progress command. It shows how far coreutils tools such as cat, and other programs, have read through their files.

File and offset information is available in Linux in /proc/<PID>/fd and /proc/<PID>/fdinfo.
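If the progress command is not available, the same information can be read by hand. The following is a minimal sketch, assuming GNU stat and a descriptor that refers to a regular file; fd_progress is a made-up name. It compares the pos field from fdinfo with the size of the file behind the descriptor.

```shell
#!/bin/bash
# fd_progress is a hypothetical helper: fd_progress PID FD
# prints the current offset of descriptor FD in process PID and, when the
# file size is known, the percentage read so far.
fd_progress() {
    local pid=$1 fd=$2 pos size
    # "pos:" in fdinfo is the current (decimal) file offset of the descriptor.
    pos=$(awk '/^pos:/ {print $2}' "/proc/$pid/fdinfo/$fd") || return 1
    # stat -L follows the /proc symlink to the real file; %s is its size in bytes.
    size=$(stat -L -c %s "/proc/$pid/fd/$fd") || return 1
    if [ "$size" -gt 0 ]; then
        echo "$pos / $size bytes ($((100 * pos / size))%)"
    else
        # Pipes, sockets and empty files have no meaningful size.
        echo "$pos / $size bytes"
    fi
}
```

The descriptor number to pass can be found by looking at where the symlinks in /proc/<PID>/fd point.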

How to find the process that is doing IO frequently?

You can use iotop to find processes that are IO-heavy.

What Process is using all of my disk IO

You're looking for iotop (assuming you've got kernel 2.6.20 or later and Python 2.5 or later). Failing that, you're looking into hooking into the filesystem. I recommend the former.
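Where iotop cannot be installed, a much cruder approximation is to rank processes by the counters in /proc/PID/io. This is not what iotop does: the sketch below sums read_bytes and write_bytes, which are cumulative since each process started, so it shows the biggest totals rather than current rates. top_io is a made-up name, and without root only your own processes are readable.

```shell
#!/bin/bash
# top_io is a hypothetical helper: top_io [N]
# prints "TOTAL_BYTES PID COMMAND" for the N processes (default 5) with the
# largest cumulative read_bytes + write_bytes.
top_io() {
    local n=${1:-5} f pid total
    for f in /proc/[0-9]*/io; do
        # Unreadable or vanished entries make awk fail; skip them.
        total=$(awk '/^(read_bytes|write_bytes):/ {s += $2}
                     END {printf "%.0f\n", s}' "$f" 2>/dev/null) || continue
        pid=${f#/proc/}
        pid=${pid%/io}
        printf '%s %s %s\n' "$total" "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
    done | sort -rn | head -n "$n"
}
```

Running top_io 10 as root gives a system-wide view; taking two snapshots a few seconds apart and comparing them would be closer to what iotop reports.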

Calculate Total disk i/o by a single process

Feel free to play with this scribble (myio.sh):

#!/bin/bash

TEMPFILE=$(mktemp) # create temp file for results

trap 'rm -f "$TEMPFILE"; exit 1' SIGINT # cleanup after Ctrl+C

SECONDS=0 # reset timer

"$@" & # execute command in background

IO=/proc/$!/io # io data of command
while [ -e "$IO" ]; do
    DATA=$(cat "$IO" 2>/dev/null) || break # command may exit between test and read
    echo "$DATA" > "$TEMPFILE" # keep the last good snapshot
    # note: syscr and syscw are syscall counts, the other fields are bytes
    sed 's/.*/& Bytes/' "$TEMPFILE" | column -t
    echo
    sleep 1
done

S=$SECONDS # save timer
[ "$S" -eq 0 ] && S=1 # avoid division by zero for very short runs

echo -e "\nPerformance after $S seconds:"
while IFS=" " read -r string value; do
    echo "$string" $((value/1024/1024/S)) MByte/s
done < "$TEMPFILE" | column -t

rm -f "$TEMPFILE" # remove temp file

Syntax: ./myio.sh <your command>

Examples:

  • ./myio.sh dd if=/dev/zero of=/dev/null bs=1G count=4096
  • as root: ./myio.sh dd if=/dev/sda1 of=/dev/null bs=1M count=4096

Please change dd's of= in last example only if you know what you are doing.


With this simple script you can watch an already running process and its IO.

Syntax: pio.sh PID

#!/bin/bash

[ "$1" == "" ] && echo "Error: Missing PID" && exit 1
IO=/proc/$1/io # io data of PID
[ ! -e "$IO" ] && echo "Error: PID does not exist" && exit 2
I=3 # interval in seconds
SECONDS=0 # reset timer

echo "Watching command $(cat /proc/$1/comm) with PID $1"

# baseline counters, in the order they appear in /proc/PID/io
IFS=" " read rchar wchar syscr syscw rbytes wbytes cwbytes < <(cut -d " " -f2 "$IO" | tr "\n" " ")

while [ -e "$IO" ]; do
    sleep $I # wait first, so that $SECONDS is never 0 below
    IFS=" " read rchart wchart syscrt syscwt rbytest wbytest cwbytest < <(cut -d " " -f2 "$IO" 2>/dev/null | tr "\n" " ")
    [ -z "$cwbytest" ] && break # process exited during the sleep

    S=$SECONDS

    # averages since the script was started; syscr/syscw count calls, not bytes
    cat << EOF
rchar: $(((rchart-rchar)/1024/1024/S)) MByte/s
wchar: $(((wchart-wchar)/1024/1024/S)) MByte/s
syscr: $(((syscrt-syscr)/S)) calls/s
syscw: $(((syscwt-syscw)/S)) calls/s
read_bytes: $(((rbytest-rbytes)/1024/1024/S)) MByte/s
write_bytes: $(((wbytest-wbytes)/1024/1024/S)) MByte/s
cancelled_write_bytes: $(((cwbytest-cwbytes)/1024/1024/S)) MByte/s
EOF
    echo
done

