To What Extent Can One Rely on Writing to Disk When SIGTERM Is Given?

Read signal(7) carefully: you can't call fprintf from a signal handler, because it is not async-signal-safe. Often the most sensible thing to do is to set some volatile sig_atomic_t variable in the signal handler, and test that variable outside it.
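
A minimal sketch of that pattern (the work inside the loop is a placeholder):

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* The handler only sets a flag; everything else happens outside it. */
    static volatile sig_atomic_t got_sigterm = 0;

    static void on_sigterm(int signo)
    {
        (void)signo;
        got_sigterm = 1;   /* a plain flag assignment is async-signal-safe */
    }

    int main(void)
    {
        struct sigaction sa = { .sa_handler = on_sigterm };
        sigaction(SIGTERM, &sa, NULL);

        while (!got_sigterm) {
            sleep(1);          /* placeholder for real work */
        }

        /* Outside the handler it is safe to use stdio, syslog(3), etc. */
        fprintf(stderr, "caught SIGTERM, terminating\n");
        return 0;
    }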

The point is not only to write(2) some data to a file system; it is to get the data actually written to the disk (otherwise it may sit in kernel filesystem buffers). Read sync(2) and fsync(2).
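
For instance, a sketch of a write that is pushed past the kernel buffers ("state.log" and the error handling are purely illustrative):

    #include <fcntl.h>
    #include <unistd.h>

    /* Write a buffer and force it to the device with fsync(2). */
    int persist(const char *buf, size_t len)
    {
        int fd = open("state.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0)
            return -1;
        if (write(fd, buf, len) != (ssize_t)len) {   /* data is now only in kernel buffers */
            close(fd);
            return -1;
        }
        if (fsync(fd) < 0) {                         /* ask the kernel to flush it to disk */
            close(fd);
            return -1;
        }
        return close(fd);
    }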

You cannot, in all cases, be sure that the data has reached the disk (especially in the event of a power outage).

I would not bother with writing a file at all and would use syslog(3) instead (i.e. have my SIGTERM handler set a volatile sig_atomic_t flag, test that flag elsewhere, and on termination call syslog...). Then document that your program writes to the system log on termination, and leave the responsibility of ensuring that the system log is persisted (to some disk somewhere, perhaps on the network) to the sysadmin.
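
The termination path could then be as simple as this sketch ("mydaemon" is a placeholder identifier):

    #include <syslog.h>

    int main(void)
    {
        openlog("mydaemon", LOG_PID, LOG_DAEMON);
        syslog(LOG_INFO, "terminating on SIGTERM");   /* the sysadmin decides where this lands */
        closelog();
        return 0;
    }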

The concrete behavior at shutdown is mostly a sysadmin issue. It really depends upon the actual system (including the Linux distribution and the hardware) and the sysadmin's skills. Some file systems are remote, and some (on cheap USB keys) are very slow; writes to them can be lost.

Imagine also a system with a buggy daemon (not yours, but something else) which takes ages to terminate. Then your daemon might not have time to do anything sensible. So you have to trust the sysadmin.

I don't understand why you are asking this question. It depends upon how the entire system is administered and set up. It really is different on a million-dollar server and on a laptop.

How are file objects cleaned up in Python when the process is killed?

It's not how files are "cleaned up" so much as how they are written to. It's possible that a program performs multiple writes for a single "chunk" of data (a row, or whatever), and you could interrupt it in the middle of that sequence and end up with partial records written.

Looking at the C source for the csv module, it assembles each row into a string buffer, then writes it using a single write() call. That should generally be safe: either the row is passed to the OS or it's not, and if it reaches the OS it is all going to be written or none of it is (barring, of course, things like hardware issues where part of it could land in a bad sector).
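
The same pattern, sketched in C (the record layout is made up; the point is the single write(2) call):

    #include <stdio.h>
    #include <unistd.h>

    /* Assemble the whole record in memory, then hand it to the OS at once,
     * so an interrupted process can't leave half a record behind. */
    void write_row(int fd, const char *name, const char *value)
    {
        char buf[256];
        int len = snprintf(buf, sizeof buf, "%s,%s\n", name, value);
        if (len > 0 && len < (int)sizeof buf)
            (void)write(fd, buf, (size_t)len);   /* one syscall for the full row */
    }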

The writer object is a Python object, and a custom writer could do something weird in its write() that could break this, but assuming it's a regular file object, it should be fine.

What's the best way to send a signal to all members of a process group?

You don't say if the tree you want to kill is a single process group. (This is often the case if the tree is the result of forking from a server start or a shell command line.) You can discover process groups using GNU ps as follows:

    ps x -o "%p %r %y %x %c "

If it is a process group you want to kill, just use the kill(1) command, but instead of giving it a process number, give it the negation of the group number. For example, to kill every process in group 5112, use kill -TERM -- -5112.
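
The same thing from C, as a sketch (5112 is the example group number from above):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        pid_t pgid = 5112;
        if (killpg(pgid, SIGTERM) < 0)   /* equivalent to kill(-5112, SIGTERM) */
            perror("killpg");
        return 0;
    }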

How to force a running program to flush the contents of its I/O buffers to disk with external means?

Can I just add some clarity? Obviously months have passed, and I imagine your program isn't running any more... but there's still some confusion here about buffering that is worth clearing up.

As soon as you use the stdio library and FILE *, you will by default have a fairly small buffer (implementation-dependent, but typically a few KB) inside your program which accumulates what you write and flushes it to the OS when it is full (or on file close). When you kill your process, it is this buffer that gets lost.
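
As an aside, you can control or even disable that in-program buffer with the standard setvbuf(3) call, provided you call it before the first I/O on the stream. A minimal sketch, with "out.log" as a placeholder name:

    #include <stdio.h>

    int main(void)
    {
        FILE *fp = fopen("out.log", "w");
        if (!fp)
            return 1;
        setvbuf(fp, NULL, _IONBF, 0);   /* unbuffered: every fprintf goes straight to the OS */
        fprintf(fp, "Hello world\n");
        fclose(fp);
        return 0;
    }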

If the data has been flushed to the OS, then it is kept in a kernel file buffer until the OS decides to persist it to disk (usually fairly soon), or until someone runs the sync command. If you kill the power on your computer, then this buffer gets lost as well. You probably don't care about this scenario, because you probably aren't planning to yank the power! But this is what @EJP was talking about (re "stuff that is in the OS buffer/disk cache won't be lost"): your problem is the stdio cache, not the OS.

In an ideal world, you'd write your app so it calls fflush() (or std::flush() in C++) at key points. In your example, you'd say:

    if (i == 0) {   // This case is interesting!
        fprintf(filept, "Hello world\n");
        fflush(filept);   // push the stdio buffer through to the OS
    }

which would cause the stdio buffer to flush to the OS. I imagine your real writer is more complex, and in that situation I would try to make the fflush happen "often but not too often": too rarely, and you lose data when you kill the process; too often, and you lose the performance benefit of buffering if you are writing a lot.
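
One common shape for that is a counter, sketched here (FLUSH_EVERY is an arbitrary illustrative value):

    #include <stdio.h>

    #define FLUSH_EVERY 100   /* tune to taste */

    static unsigned long rows_written = 0;

    /* Flush every FLUSH_EVERY records: frequent enough to bound data loss,
     * rare enough to keep most of the benefit of buffering. */
    void log_row(FILE *fp, const char *row)
    {
        fprintf(fp, "%s\n", row);
        if (++rows_written % FLUSH_EVERY == 0)
            fflush(fp);
    }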

In your described situation, where the program is already running and can't be stopped and rewritten, then your only hope, as you say, is to stop it in a debugger. The details of what you need to do depend on the implementation of the standard library, but you can usually look inside the FILE *filept object and start following pointers, messy though that is. @ivan_pozdeev's comment about executing std::flush() or fflush() within the debugger is helpful.
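
In practice that can be as simple as the following sketch of a gdb session (12345 stands for your process's PID; fflush(NULL), i.e. fflush(0), flushes every open output stream, so you don't even have to locate filept):

    $ gdb -p 12345
    (gdb) call (int) fflush(0)
    (gdb) detach
    (gdb) quit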

Can shutdown hooks be used for slightly longer tasks?

This design document does not say anything about how long a shutdown hook thread is allowed to run. But reading between the lines (the fact that Runtime.halt exists to force immediate termination), I think you are safe.

Now keep in mind that other frameworks you rely on may be shutting down concurrently, so your hook should not rely on anything other than your own code.

You may also find this other SO question of interest: How to stop java process gracefully?

Is it possible to create a real nondetached (joinable) thread in iPhone OS?

I am not sure if using a joinable thread allows you to defer application termination until your thread is done and you join it. Probably not. You can use POSIX threads in place of NSThread or NSOperation/NSOperationQueue, but you will still have to take into account the possibility of the user terminating the application.

Now, termination may happen in one of two possible ways:

1) the app receives a SIGTERM signal, which you can intercept through the applicationWillTerminate: method; this method effectively plays the role of the SIGTERM handler;

2) the app receives SIGKILL: in this case you can't catch the signal through a signal handler, and you are not allowed (of course) to ignore it by setting the signal disposition.

If SIGTERM is the only signal sent to terminate the application, then you should be able to continue the work in your thread to save the data and exit gracefully. However, it may be that after a timeout elapses (the 5 seconds you mentioned), the app also receives SIGKILL, which causes its immediate termination. This may be what actually happens when the user presses the home button: the iPhone OS sends SIGTERM to the app, starts a timer, and sends SIGKILL when the timeout occurs. But nothing in the documentation confirms this (or disproves it, to the best of my knowledge).

To manage this, you should do your best (it is application-dependent, of course) to save the app state atomically and as soon as possible, and you should be prepared to cancel your POSIX thread (when it reaches one of the possible cancellation points) and roll back as needed in order to clean up before exiting.

Is it expensive/efficient to send data between processes in Node?

The documentation is telling you that starting new node processes is (relatively) expensive. It is unwise to fork() every time you need to do work.

Instead, you should maintain a pool of long-running worker processes – much like a thread pool. Queue work requests in your main process and dispatch them to the next available worker when it goes idle.

This leaves us with a question about the performance profile of node's IPC mechanism. When you fork(), node automatically sets up a special file descriptor on the child process. It uses this to communicate between processes by reading and writing line-delimited JSON. Basically, when you process.send({ ... }), node JSON.stringifys it and writes the serialized string to the fd. The receiving process reads this data until hitting a line break, then JSON.parses it.

This necessarily means that performance will be highly dependent on the size of the data you send between processes.

I've roughed out some tests to get a better idea of what this performance looks like.

First, I sent a message of N bytes to the worker, which immediately responded with a message of the same length. I tried this with 1 to 8 concurrent workers on my quad-core hyper-threaded i7.

[graph: throughput vs. message size, echoing N bytes in both directions, 1–8 workers]

We can see that having at least 2 workers is beneficial for raw throughput, but more than 2 essentially doesn't matter.

Next, I sent an empty message to the worker, which immediately responded with a message of N bytes.

[graph: throughput vs. message size when only the worker's response is N bytes]

Surprisingly, this made no difference.

Finally, I tried sending a message of N bytes to the worker, which immediately responded with an empty message.

[graph: throughput vs. message size when only the request is N bytes]

Interesting — performance does not degrade as rapidly with larger messages.

Takeaways

  • Receiving large messages is slightly more expensive than sending them. For best throughput, your master process should not send messages larger than 1 kB and should not receive messages back larger than 128 bytes.

  • For small messages, the IPC overhead is about 0.02ms. This is small enough to be inconsequential in the real world.

It is important to realize that the serialization of the message is a synchronous, blocking call; if the overhead is too large, your entire node process will be frozen while the message is sent. This means I/O will be starved and you will be unable to process any other events (like incoming HTTP requests). So what is the maximum amount of data that can be sent over node IPC?

[graph: per-message IPC overhead vs. message size]

Things get really nasty over 32 kB. (These are per-message; double to get roundtrip overhead.)

The moral of the story is that you should:

  • If the input is larger than 32 kB, find a way to have your worker fetch the actual dataset. If you're pulling the data from a database or some other network location, do the request in the worker. Don't have the master fetch the data and then try to send it in a message. The message should contain only enough information for the worker to do its job. Think of messages like function parameters.

  • If the output is larger than 32 kB, find a way to have the worker deliver the result outside of a message. Write to disk or send the socket to the worker so that you can respond directly from the worker process.


