Will Data Written via Write() Be Flushed to Disk If a Process Is Killed

Will data written via write() be flushed to disk if a process is killed?

Normally, the data already written by the application into a kernel buffer with write() will not be affected by the application exiting or getting killed in any way. Exiting or getting killed implicitly closes all file descriptors, so there should be no difference: the kernel will handle the flushing afterwards. So no fdatasync() or similar calls are necessary.

There are two exceptions to this:

  • if the application uses user-land buffering (not calling the write() system call directly, but instead caching the data in a user-space buffer, e.g. with fwrite()), those buffers might not get flushed unless a proper user-space file close is executed. Getting killed by SIGKILL will definitely cause you to lose the contents of those buffers (see the sketch after this list);

  • if the kernel dies as well (power loss, kernel crash, etc.), your data might not yet have been written to the disks from the kernel buffers, and will then be lost.
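To make the first exception concrete, here is a hedged sketch (the file names are invented for illustration) contrasting the two cases: data handed to the kernel with write() survives an abrupt exit, while data still sitting in a stdio buffer does not:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Case 1: write() hands the data straight to a kernel buffer;
       it will reach the file even if the process dies right after. */
    int fd = open("survives.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd != -1)
        write(fd, "kernel has this\n", 16);

    /* Case 2: fwrite() only fills a user-space stdio buffer;
       dying before fflush()/fclose() loses this data. */
    FILE *fp = fopen("lost.txt", "w");
    if (fp)
        fwrite("stdio still has this\n", 1, 21, fp);

    /* Simulate an abrupt death (as with SIGKILL): no stdio flush.
       "survives.txt" gets its line; "lost.txt" is left empty. */
    _exit(0);
}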

How to force a running program to flush the contents of its I/O buffers to disk with external means?

Can I just add some clarity? Obviously months have passed, and I imagine your program isn't running any more ... but there's some confusion here about buffering which still isn't clear.

As soon as you use the stdio library and FILE *, you will by default have a fairly small buffer (implementation-dependent, but typically a few KB) inside your program which accumulates what you write and flushes it to the OS when it's full (or on file close). When you kill your process, it is this buffer that gets lost.

If the data has been flushed to the OS, then it is kept in a Unix file buffer until the OS decides to persist it to disk (usually fairly soon), or someone runs the sync command. If you kill the power on your computer, then this buffer gets lost as well. You probably don't care about this scenario, because you probably aren't planning to yank the power! But this is what @EJP was talking about (re "stuff that is in the OS buffer/disk cache won't be lost"): your problem is the stdio cache, not the OS.

In an ideal world, you'd write your app so it called fflush() (or std::flush() in C++) at key points. In your example, you'd say:

if (i == 0) { // This case is interesting!
    fprintf(filept, "Hello world\n");
    fflush(filept);
}

which would cause the stdio buffer to flush to the OS. I imagine your real writer is more complex, and in that situation I would try to make the fflush() happen "often, but not too often". Too rarely, and you lose data when you kill the process; too often, and you lose the performance benefits of buffering if you are writing a lot.
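For instance (a hedged sketch; the record format, file name, and flush interval are invented for illustration), one simple compromise is to flush every N records:

#include <stdio.h>

#define FLUSH_EVERY 100   /* illustrative interval, tune to taste */

static void write_records(FILE *fp, int count)
{
    for (int i = 0; i < count; i++) {
        fprintf(fp, "record %d\n", i);
        if ((i + 1) % FLUSH_EVERY == 0)
            fflush(fp);   /* hand the buffered data to the kernel */
    }
    fflush(fp);           /* don't forget the tail */
}

int main(void)
{
    FILE *fp = fopen("out.log", "w");
    if (!fp)
        return 1;
    write_records(fp, 1000);
    fclose(fp);
    return 0;
}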

In your described situation, where the program is already running and can't be stopped and rewritten, then your only hope, as you say, is to stop it in a debugger. The details of what you need to do depend on the implementation of the standard library, but you can usually look inside the FILE *filept object and start following pointers, messy though it is. @ivan_pozdeev's comment about executing std::flush() or fflush() within the debugger is helpful.
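For instance (a hedged sketch; the PID is hypothetical), you can attach gdb and force the flush from outside, relying on the standard guarantee that fflush(NULL) flushes all open output streams:

$ gdb -p 12345                  # attach to the running process
(gdb) call (int) fflush(0)      # fflush(NULL) flushes all output streams
(gdb) detach
(gdb) quit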

Does the OS (POSIX) flush a memory-mapped file if the process is SIGKILLed?

It will depend on whether the memory-mapped file is opened with modifications private (MAP_PRIVATE) or not (MAP_SHARED). If private, then no; the modifications will not be written back to disk. If shared, the kernel buffer pool contains the modified buffers, and these will be written to disk in due course - regardless of the cause of death.
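As a rough sketch (the file name and mapping size are invented, and error handling is abbreviated), the flag chosen at mmap() time is what decides the behaviour described above:

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDWR);
    if (fd == -1)
        return 1;

    /* MAP_SHARED: dirty pages live in the kernel page cache and will
       reach the file in due course even if this process is SIGKILLed.
       With MAP_PRIVATE instead, the write below would go to a private
       copy-on-write page and be discarded at death. */
    char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return 1;

    memcpy(map, "hello", 5);   /* survives SIGKILL with MAP_SHARED */

    munmap(map, 4096);
    close(fd);
    return 0;
}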

How are file objects cleaned up in Python when the process is killed?

It's not how files are "cleaned up" so much as how they are written to. It's possible that a program might perform multiple writes for a single "chunk" of data (a row, or whatever), and you could interrupt it in the middle of this process and end up with partial records written.

Looking at the C source for the csv module, it assembles each row to a string buffer, then writes that using a single write() call. That should generally be safe; either the row is passed to the OS or it's not, and if it gets to the OS it's all going to get written or it's not (barring of course things like hardware issues where part of it could go into a bad sector).

The writer object is a Python object, and a custom writer could do something weird in its write() that could break this, but assuming it's a regular file object, it should be fine.
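The same principle is easy to apply by hand. Here is a hedged C sketch (the record layout and helper name are invented): assemble the complete record in memory first and emit it with a single write(), so an interruption can only fall between records, never inside one:

#include <stdio.h>
#include <unistd.h>

/* Write one complete record with a single write() call: the record is
   either passed to the kernel in full or not at all. */
static int write_record(int fd, const char *name, int value)
{
    char buf[256];
    int len = snprintf(buf, sizeof buf, "%s,%d\n", name, value);
    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;                      /* record too long for the buffer */
    return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}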

Does the OS (POSIX) finish a modification to a memory-mapped file if the process is SIGKILLed?

Let's say you have something like

volatile unsigned char *map;   /* memory-mapped file */
size_t i;

for (i = 0; i < 1000; i++)
    map[i] = slow_calculation(i);

and for some reason, the process gets killed when i = 502.

In such a case, the contents of the file will indeed reflect the partially updated content of the mapping at that point.

No, there is no way to avoid this (with regards to the KILL signal), because KILL is unblockable and uncatchable.

You can minimize the window by using a temporary buffer as a "transactional" buffer: calculate the new values into that buffer, and then just copy the values over. It is no guarantee, but it does mean there is a much higher probability that the file contents are intact even if the process is killed. (Furthermore, it means that if you use e.g. mutexes to synchronize access to the mapping, you only need to hold the mutex for the minimum amount of time.)

Killing a process via the KILL signal is a very abnormal termination, and having memory-mapped files garbled because of it is, in my opinion, expected. It is not something that should be done during normal operation at all; the TERM signal is meant for that.

What you should worry about is that your process responds to a TERM signal in a timely fashion. TERM is catchable and blockable, and is basically a way for an external supervisor process (or the user the process belongs to, or the superuser) to request that the process exit cleanly as soon as possible. However, the process should not dally, because it is quite common to send it a KILL signal if it doesn't exit within a few seconds of receiving a TERM signal.

In my own daemons, I strive for them to respond to a TERM within a second or so, unless the system is under a heavy load. It is, of course, a very subjective measurement since the speed of different systems varies, but there are no hard and fast rules here.

One way to handle this is to install a TERM signal handler that, in normal operation, terminates the process immediately. During critical sections, the exit is postponed:

#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

static volatile int in_critical = 0;
static volatile int need_to_exit = 0;

static void handle_exit_signal(int signum)
{
    __atomic_store_n(&need_to_exit, 1, __ATOMIC_SEQ_CST);
    if (!__atomic_load_n(&in_critical, __ATOMIC_SEQ_CST))
        exit(126);
}

static int install_exit(int signum)
{
    struct sigaction act;
    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = handle_exit_signal;
    act.sa_flags = SA_RESTART;
    if (sigaction(signum, &act, NULL) == -1)
        return errno;
    return 0;
}

To enter and exit critical sections (say, when you hold a mutex within the shared memory region):

static inline void critical_begin(void)
{
    __atomic_add_fetch(&in_critical, 1, __ATOMIC_SEQ_CST);
}

static inline void critical_end(void)
{
    if (!__atomic_sub_fetch(&in_critical, 1, __ATOMIC_SEQ_CST))
        if (__atomic_load_n(&need_to_exit, __ATOMIC_SEQ_CST))
            exit(126);
}

So, if a TERM signal is received while you are in a critical section (and critical_begin() and critical_end() do nest), the final call to critical_end() exits the process.

Note that I used the GCC atomic built-ins to manage the flags atomically, without data races, even if the signal handler is executed in a different thread. I've found this the cleanest solution on Linux, although it should work on other OSes too. (Other C compilers you can use on Linux, like clang and Intel CC, support them as well.)

So, in pseudocode, doing the slow 1000-element calculation shown at the beginning would then be

volatile unsigned char *map;
unsigned char cache[1000];
size_t i;

/* Nothing critical yet, we're just calculating new values... */
for (i = 0; i < 1000; i++)
    cache[i] = slow_calculation(i);

/* Update the shared memory map. */
critical_begin();
/* pthread_mutex_lock() */
memcpy((unsigned char *)map, cache, 1000); /* cast drops the volatile qualifier for memcpy() */
/* pthread_mutex_unlock() */
critical_end();

If a TERM signal is delivered before the critical_begin(), the process is terminated then and there. If a TERM signal is delivered after that, but before the critical_end(), the call to critical_end() will terminate the process.

This is just one pattern that can solve the underlying problem; there are others. The one with a single volatile sig_atomic_t done = 0; flag that the signal handler sets to nonzero, and that the main processing loops check regularly, is even more common.
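A minimal sketch of that more common pattern (the loop body is a placeholder):

#include <signal.h>
#include <string.h>

static volatile sig_atomic_t done = 0;

static void handle_term(int signum)
{
    (void)signum;
    done = 1;   /* only async-signal-safe work in the handler */
}

int main(void)
{
    struct sigaction act;
    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = handle_term;
    sigaction(SIGTERM, &act, NULL);

    /* Runs until a TERM signal arrives and sets the flag. */
    while (!done) {
        /* ... one unit of work per iteration ... */
    }

    /* Clean up here, outside the signal handler. */
    return 0;
}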

As pointed out by R.. in a comment, the pointer used to refer to the memory map should be a pointer to volatile (i.e., volatile some_type *map) to stop the compiler from reordering the stores to the memory map.

Does Linux guarantee the contents of a file is flushed to disc after close()?

From "man 2 close":

A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes.

The man page says that if you want to be sure that your data are on disk, you have to use fsync() yourself.
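A hedged sketch of that advice (the file name is invented; error handling is kept minimal):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("important.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return 1;

    const char msg[] = "critical data\n";
    if (write(fd, msg, sizeof msg - 1) != (ssize_t)(sizeof msg - 1))
        return 1;

    /* Ask the kernel to push data and metadata to the device; only
       after fsync() succeeds is the data durable (modulo the disk's
       own write cache). */
    if (fsync(fd) == -1)
        return 1;

    close(fd);
    return 0;
}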

File flush needed after process exit?

I've never had your problem and have always found a call to close() to be sufficient. However, from the man page for close(2):

A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a file system to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored use fsync(2). (It will depend on the disk hardware at this point.)

As, at the time of writing, you haven't included code for the writer process, I can only suggest adding a call to fsync() in that process and seeing if this makes a difference.

File is not written on disk until program ends

If there's some time between the fputs() and the fclose(), add

fflush(fp);

This will cause the buffered contents to be written out to the file.

Close file after a kill command

Your question is tagged with "linux", so unless you buffer your writes and don't flush the buffer (e.g. when using fwrite() you need to call fflush() at appropriate points), the contents will be written to the file, because all file descriptors are properly closed on exit (even if the exit is forced by a signal). You don't need fsync() unless you're doing something that has to survive a machine crash (and then you need to know what you're doing to get the crash semantics right).

Since you mentioned close() in what you want to do in the signal handler, it doesn't sound like you're buffering your writes, so you don't need to do anything. Data written with successful calls to write() will end up in the file unless your machine crashes (or your disk/filesystem breaks before flushing the buffer cache; don't worry about that). In fact, the moment write() returns in your program, the data can be considered written to the file and will be visible to other processes that read that file (the exceptions being a machine crash and a few filesystem edge cases, but those are a much more complex topic).

If all you do in your signal handlers is close your file descriptors and call _exit(), then you don't need the signal handlers at all.

Will StreamWriter.Flush() also call FileStream.Flush()?

Yes, calling Flush on a StreamWriter will cause the underlying stream to be flushed. The .NET 4.5 version calls a private Flush(bool, bool) function, which ends with:

if (flushStream)
{
    this.stream.Flush();
}

Where flushStream is the first parameter, this.stream is the stream the StreamWriter was constructed on, and the call made by Flush() is Flush(true, true).


(Older parts of the answer follow - I was being very roundabout in answering, and have moved the most relevant part to the top.)

It's not explicitly spelled out anywhere I can find in the documentation, but any stream class that is constructed by passing it another stream should be assumed to "take ownership" of that stream (unless it's specifically called out otherwise).

That is, once you've constructed the StreamWriter using fs, you shouldn't perform any direct actions on fs yourself.


The part you quoted from MSDN relates to the later sentences:

This allows the encoder to keep its state (partial characters) so that it can encode the next block of characters correctly. This scenario affects UTF8 and UTF7 where certain characters can only be encoded after the encoder receives the adjacent character or characters.

That is, you may have passed data to Write such that it received some Unicode surrogates, but not a complete character. Flush will not write those surrogates to the stream. As long as you're always passing well-formed (complete) strings to Write, you do not need to concern yourself with this.



