Timestamp Accuracy on Ext4 (Sub-Millisecond)

Timestamp accuracy on EXT4 (sub-millisecond)

The ext4 file system supports nanosecond resolution on stored timestamps, provided the inodes are big enough to hold the extended time information (256 bytes or larger). In your case, since you are already seeing finer-than-second resolution, this is not the problem.
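A quick way to confirm this on your own machine is to read the inode size straight from the superblock; /dev/sda1 here is an assumed device name, so substitute your actual ext4 device:

# Inodes of 256 bytes or more have room for the extended
# (nanosecond) timestamp fields; 128-byte inodes do not.
sudo tune2fs -l /dev/sda1 | grep 'Inode size'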

Internally, the ext4 filesystem code calls current_fs_time(), which returns the current cached kernel time truncated to the time granularity specified in the file system's superblock; for ext4 that granularity is 1 ns.

The current time within the Linux kernel is cached, and is generally only updated on a timer interrupt. So if your timer interrupt fires every 10 milliseconds, the cached time will only be updated once every 10 milliseconds. When an update does occur, the accuracy of the resulting time depends on the clock source available on your hardware.
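Both factors can be inspected from userspace. A rough sketch, noting that the kernel config file location varies by distribution (some ship it as /proc/config.gz instead):

# Which hardware clock source the kernel is currently using
cat /sys/devices/system/clocksource/clocksource0/current_clocksource

# The timer interrupt frequency the kernel was built with
grep 'CONFIG_HZ=' "/boot/config-$(uname -r)"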

Try this and see if you also get similar results to your stat calls:

while true; do date --rfc-3339=ns; done

On my machine (amd64, Intel VirtualBox) there is no quantization, e.g.:

2013-01-18 17:04:21.097211836+11:00
2013-01-18 17:04:21.098354731+11:00
2013-01-18 17:04:21.099282128+11:00
2013-01-18 17:04:21.100276327+11:00
2013-01-18 17:04:21.101348507+11:00
2013-01-18 17:04:21.102516837+11:00

Update:

The above check using date doesn't really show anything for this situation. date calls the gettimeofday system call, which always returns the most accurate time available, based on the cached kernel time adjusted by the CPU cycle counter (if available) to give nanosecond resolution. The timestamps stored in the file system, however, are based only on the cached kernel time, i.e. the time calculated at the last timer interrupt.
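A more direct experiment is therefore to look at the timestamps the filesystem actually records. A minimal sketch (file names and the /tmp location are arbitrary): create files back-to-back and print their stored mtimes; with a coarse timer tick, consecutive values cluster on tick boundaries instead of advancing smoothly like the date output above.

# Compare stored mtimes of files created in quick succession;
# quantized values reveal the cached kernel time.
for i in 1 2 3 4 5; do
    touch "/tmp/tstest-$i"
    stat -c '%y' "/tmp/tstest-$i"
done
rm -f /tmp/tstest-[1-5]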

How to increase the EXT4 timestamp precision in Linux?

ext4 is supposed to support finer timestamps; whether it does can depend on how the file system was formatted (see the sketch after these links):

  • timestamp accuracy on EXT4 (sub-millisecond)
  • Determine file system timestamp accuracy
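Concretely, nanosecond timestamps need 256-byte (or larger) inodes, and the inode size is fixed when the file system is created. A hedged sketch, assuming /dev/sdb1 is a disposable partition:

# WARNING: mkfs destroys all data on the target partition.
# -I 256 requests inodes large enough for nanosecond timestamps.
sudo mkfs.ext4 -I 256 /dev/sdb1

# Verify afterwards:
sudo tune2fs -l /dev/sdb1 | grep 'Inode size'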

touch timestamp accuracy on EXT4

I used this bunch of (admittedly quick & dirty) one-liners to test your issue on my system, a Mandriva Linux 2010.1 (x86-64):

seq 1 1000 | while read f; do sleep 0.01; touch test-$f-0; done

seq 1 1000 | while read f; do touch -a -d "$(stat -c %x test-$f-0 | sed 's|^2010|2012|')" test-$f-1; done

seq 1 1000 | while read f; do A="$(stat -c %x test-$f-0)"; B="$(stat -c %x test-$f-1)"; if [[ ! "${A#2010}" = "${B#2012}" ]]; then echo test-$f; fi; done

I was unable to reproduce your issue even once. It sounds as if touch is not being fed the expected timestamp via the -d parameter, but something computed otherwise.
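One way to verify that suspicion in isolation is a minimal round-trip (the file names here are arbitrary, and this assumes a coreutils version whose touch -d accepts fractional seconds, as in the tests above): read one file's timestamp with stat, feed it straight back to touch -d on a second file, and compare the two to the nanosecond.

# Round-trip a full-resolution timestamp through touch -d;
# both output lines should show identical times.
touch ts-orig
touch -d "$(stat -c %y ts-orig)" ts-copy
stat -c '%n %y' ts-orig ts-copy
rm -f ts-orig ts-copy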

Of course the issue could be system-specific, in which case we'd need more information on your system (CPU, whether the OS is 32- or 64-bit, kernel/glibc/coreutils versions, etc.).

UPDATE:

I tried the same with 32-bit versions of stat and touch. No issues came up. The kernel was still a 64-bit one.

UPDATE2:

I also tried this set of one-liners, which focuses more on atime:

$ seq 1 1000 | while read f; do sleep 0.01; touch test-$f-0; done
$ seq 1 1000 | while read f; do sleep 0.01; touch test-$f-1; done
$ seq 1 1000 | while read f; do sleep 0.01; cat test-$f-0; done
$ seq 1 1000 | while read f; do touch -a -d "$(stat -c %x test-$f-0 | sed 's|^2010|2012|')" test-$f-1; done
$ seq 1 1000 | while read f; do A="$(stat -c %x test-$f-0)"; B="$(stat -c %x test-$f-1)"; if [[ ! "${A#2010}" = "${B#2012}" ]]; then echo test-$f; fi; done

Again, no issue was detected. I tried this with both the relatime and strictatime mount options.
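For reference, these options can be toggled on a live system with a remount; /mnt/test is a placeholder mount point:

# strictatime updates atime on every read; relatime (the common
# default) only updates it when atime is older than mtime/ctime.
sudo mount -o remount,strictatime /mnt/test
sudo mount -o remount,relatime /mnt/test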

UPDATE3:

I just got to perform the tests above on my Mandriva i686 laptop. I got no issues with nanosecond accuracy there either. I also verified on another 32-bit system that if nanosecond accuracy is not supported (e.g. on ext3), the nanosecond field in the stat output becomes zero.

Why are these timestamps out of order with Perl Time::HiRes?

This might be because of a difference in the precision of the two timestamps, as mentioned in the Time::HiRes documentation:

As stat or lstat but with the access/modify/change file timestamps in subsecond resolution, if the operating system and the filesystem both support such timestamps. To override the standard stat():

use Time::HiRes qw(stat);

Test for the value of &Time::HiRes::d_hires_stat to find out whether the operating system supports subsecond file timestamps: a value larger than zero means yes. There are unfortunately no easy ways to find out whether the filesystem supports such timestamps. UNIX filesystems often do; NTFS does; FAT doesn't (FAT timestamp granularity is two seconds).

A zero return value of &Time::HiRes::d_hires_stat means that Time::HiRes::stat is a no-op passthrough for CORE::stat() (and likewise for lstat), and therefore the timestamps will stay integers. The same thing will happen if the filesystem does not do subsecond timestamps, even if &Time::HiRes::d_hires_stat is non-zero.

In any case do not expect nanosecond resolution, or even a microsecond resolution. Also note that the modify/access timestamps might have different resolutions, and that they need not be synchronized, e.g. if the operations are

write
stat # t1
read
stat # t2

the access time stamp from t2 need not be greater-than the modify time stamp from t1: it may be equal or less.
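To see which case applies on a given system, a quick probe along these lines helps (the file path is arbitrary; any existing file will do):

# Print the capability flag, then a file's mtime via
# Time::HiRes::stat; a fractional part appears only when both
# the OS and the filesystem support subsecond timestamps.
perl -MTime::HiRes -e '
    printf "d_hires_stat=%d\n", Time::HiRes::d_hires_stat();
    my @s = Time::HiRes::stat("/etc/hostname");
    printf "mtime=%.9f\n", $s[9];
'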

ext4 Specifications?

I've been using the ext4 Disk Layout page at kernel.org. It requires careful reading and some extra research on certain subjects, but for the most part it is complete.
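To relate the document to a live filesystem, debugfs can dump an inode's on-disk fields, including the extended timestamp bits; /dev/sda1 is a placeholder device, and inode 2 is the root directory:

# Dump the root inode of an ext4 filesystem; the output shows
# ctime/atime/mtime with nanosecond components when the inode
# carries the extended timestamp fields.
sudo debugfs -R 'stat <2>' /dev/sda1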

Is it unhealthy for SSD if I write 'vital signal' to check a python code is running?

Doing that will degrade the SSD and destroy it over time.

In my last job, the SSD health tool (smartctl) indicated that the 15 SSDs in our cluster product were wearing rapidly and had only months of life left. The team found that a third-party software package (etcd) was syncing a small amount of data to a filesystem on the SSD once per second, and each sync wrote at least an entire 16K block. Luckily, the problem was found early enough that we could patch it in a software update before suffering too many customer returns.

Write the 'vitality' file somewhere else. It could go on a tmpfs such as /var/run/user/. Or use a different vitality mechanism altogether; something like supervisord can manage your task, run health checks, and restart it on failure.
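A minimal sketch of the tmpfs approach, assuming a systemd-based distribution where /run/user/<uid> is a per-user tmpfs (the file name is arbitrary):

# Heartbeat written to RAM-backed tmpfs, so the SSD sees no
# write traffic at all; a watchdog can declare the process dead
# when the file's mtime is more than a few seconds old.
HEARTBEAT="/run/user/$(id -u)/myscript.alive"
while true; do
    touch "$HEARTBEAT"
    sleep 1
done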


