How to Kill a Zombie Process

How to kill a zombie process

A zombie is already dead, so you cannot kill it. To clean up a zombie, it must be waited on by its parent, so killing the parent should work to eliminate the zombie. (After the parent dies, the zombie will be inherited by pid 1, which will wait on it and clear its entry in the process table.) If your daemon is spawning children that become zombies, you have a bug. Your daemon should notice when its children die and wait on them to determine their exit status.

Here is an example of how you might send a signal to every process that is the parent of a zombie. Note that this is extremely crude and might kill processes that you do not intend to kill; I do not recommend using this sort of sledgehammer:

# Don't do this.  Incredibly risky sledge hammer!
kill $(ps -A -ostat,ppid | awk '/[zZ]/ && !a[$2]++ {print $2}')
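
If you would rather look at one particular zombie than signal every zombie's parent, a narrower approach on Linux is to read that process's state and parent pid from /proc/<pid>/stat. The following is only a sketch, assuming the Linux procfs line layout "pid (comm) state ppid ..."; the names parse_proc_stat and zombie_parent are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line in the Linux /proc/<pid>/stat layout:
 *     pid (comm) state ppid ...
 * comm may itself contain spaces or parentheses, so the fixed
 * fields are located relative to the LAST ')' on the line.
 * Returns 0 on success, -1 if the line does not have that shape.
 * (parse_proc_stat is an illustrative name, not a library call.) */
int parse_proc_stat(const char *line, char *state, int *ppid)
{
    const char *close_paren = strrchr(line, ')');

    if (close_paren == NULL)
        return -1;
    if (sscanf(close_paren + 1, " %c %d", state, ppid) != 2)
        return -1;
    return 0;
}

/* Returns the parent pid if the line describes a zombie (state 'Z'),
 * 0 if the process is not a zombie, and -1 on a malformed line. */
int zombie_parent(const char *line)
{
    char state;
    int ppid;

    if (parse_proc_stat(line, &state, &ppid) != 0)
        return -1;
    return state == 'Z' ? ppid : 0;
}
```

On a live system you would read the single line of /proc/<pid>/stat with fgets() and pass it to zombie_parent(); the result tells you which one parent to investigate, instead of signalling all of them.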

How do you kill a zombie process using wait()?


How do you know (and) where to put the "wait()" statement to kill zombie processes?

If your parent spawns only a small, fixed number of children; does not care when or whether they stop, resume, or finish; and itself exits quickly, then you do not need to use wait() or waitpid() to clean up the child processes. The init process (pid 1) takes responsibility for orphaned child processes, and will clean them up when they finish.

Under any other circumstances, however, you must wait() for child processes. Doing so frees up resources, ensures that the child has finished, and allows you to obtain the child's exit status. Via waitpid() you can also be notified when a child is stopped or resumed by a signal, if you so wish.
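
To make the "obtain the child's exit status" part concrete, here is a minimal sketch of a parent that forks, waits, and decodes the status. The function name run_child_and_collect is hypothetical, and the child does nothing except exit with the requested code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, wait for it, and return the
 * exit status the parent observed. Returns -1 if fork() or waitpid()
 * fails, or if the child did not exit normally.
 * (run_child_and_collect is an illustrative name.) */
int run_child_and_collect(int code)
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;              /* fork failed: nothing to wait for */
    if (child == 0)
        _exit(code);            /* child: terminate immediately */

    /* parent: this call is what removes the zombie entry */
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                  /* e.g. the child was killed by a signal */
}
```

Calling run_child_and_collect(7) from a parent should hand back 7, and once it returns there is no zombie left behind for that child.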

As for where to perform the wait,

  • You must ensure that only the parent wait()s.
  • You should wait at or before the earliest point where you need the child to have finished (but not before forking), OR
  • if you don't care when or whether the child finishes, but you need to clean up resources, then you can periodically call waitpid(-1, NULL, WNOHANG) to collect a zombie child if there is one, without blocking if there isn't any.
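
The last option, periodic non-blocking collection, might look roughly like this. It is a sketch only: reap_finished_children and spawn_and_reap are hypothetical names, and the 10 ms polling interval is an arbitrary choice for the demo.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Collect every child that has already terminated, without blocking.
 * Returns the number reaped, or -1 if there are no children at all. */
int reap_finished_children(void)
{
    int reaped = 0;
    pid_t pid;

    while ((pid = waitpid(-1, NULL, WNOHANG)) > 0)
        reaped++;
    if (pid < 0 && errno == ECHILD && reaped == 0)
        return -1;
    return reaped;
}

/* Demo driver: fork n short-lived children, then poll until every
 * child that was actually started has been collected.
 * Returns how many were reaped. */
int spawn_and_reap(int n)
{
    struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int spawned = 0, total = 0;

    for (int i = 0; i < n; i++) {
        pid_t child = fork();
        if (child < 0)
            break;              /* could not fork any more */
        if (child == 0)
            _exit(0);           /* child: finish right away */
        spawned++;
    }
    while (total < spawned) {
        int r = reap_finished_children();
        if (r < 0)
            break;              /* no children left to wait for */
        total += r;
        if (total < spawned)
            nanosleep(&tick, NULL);
    }
    return total;
}
```

In a real program the polling loop would be replaced by whatever periodic work the parent already does; the point is only that reap_finished_children() never blocks.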

In particular, you must not wait() (unconditionally) immediately after fork()ing because parent and child run the same code. You must use the return value of fork() to determine whether you are in the child (return value == 0), or in the parent (any other return value). Furthermore, the parent must wait() only if forking was successful, in which case fork() returns the child's pid, which is always greater than zero. A return value less than zero indicates failure to fork.

Your program doesn't really need to wait() because it spawns exactly four (not three) children, then exits. However, if you wanted the parent to have at most one live child at any time, then you could write it like this:

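#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the asker's about() helper, which is not shown here;
   assumed to print the caller's pid and parent pid. */
static void about(const char *who) {
    printf("%s: pid %d, ppid %d\n", who, (int)getpid(), (int)getppid());
}

int main(void) {
    pid_t child;
    int i;

    printf("-----------------------------------\n");
    about("Parent");

    for (i = 0; i < 3; i++) {
        printf("Now .. Forking !!\n");
        fflush(stdout);  /* keep buffered output from being duplicated in the child */
        child = fork();

        if (child < 0) {
            perror("Unable to fork");
            break;
        } else if (child == 0) {
            printf("In child #%d\n", (i + 1));
            about("Child");
            break;  /* the child must neither fork nor wait */
        } else {
            /* in parent: block until this child has finished */
            if (waitpid(child, NULL, 0) < 0) {
                perror("Failed to collect child process");
                break;
            }
        }
    }

    return 0;
}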

If the parent exits before one or more of its children, which can happen if it does not wait, then the child will thereafter see its parent process being pid 1.
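
You can watch the re-parenting happen with a double fork. This is a sketch under assumptions: observe_new_parent is a hypothetical name, and on Linux the adopting process is not necessarily pid 1, because a process can register itself as a "child subreaper" (some session managers do), in which case the orphan reports that pid instead.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a middle process that forks a grandchild and exits at once.
 * The grandchild waits until it has been re-parented, then reports
 * its new parent's pid through a pipe.
 * Returns that pid, or -1 on error.
 * (observe_new_parent is an illustrative name.) */
pid_t observe_new_parent(void)
{
    int fds[2];
    pid_t middle, new_parent = -1;

    if (pipe(fds) < 0)
        return -1;

    middle = fork();
    if (middle < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (middle == 0) {
        pid_t old_parent = getpid();    /* the grandchild's first parent */
        pid_t grandchild = fork();

        if (grandchild == 0) {
            struct timespec tick = { 0, 1000 * 1000 };  /* 1 ms */
            pid_t now;

            close(fds[0]);
            while ((now = getppid()) == old_parent)
                nanosleep(&tick, NULL);  /* first parent not gone yet */
            if (write(fds[1], &now, sizeof now) != (ssize_t)sizeof now)
                _exit(1);
            _exit(0);
        }
        _exit(grandchild < 0 ? 1 : 0);   /* middle process exits at once */
    }

    close(fds[1]);
    waitpid(middle, NULL, 0);            /* reap the middle process */
    if (read(fds[0], &new_parent, sizeof new_parent)
            != (ssize_t)sizeof new_parent)
        new_parent = -1;
    close(fds[0]);
    return new_parent;
}
```

Printing the return value from a small main() typically shows 1, or the pid of whichever subreaper adopted the orphan.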

Others have already answered how to get a zombie process list via the ps command. You may also be able to see zombies via top. With your original code you are unlikely to catch a glimpse of zombies, however, because the parent process exits very quickly, and init will then clean up the zombies it leaves behind.

How to kill zombie processes created by the multiprocessing module?

A couple of things:

  1. Make sure the parent joins its children, to avoid zombies. See Python Multiprocessing Kill Processes

  2. You can check whether a child is still running with the is_alive() member function. See http://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process

Creating A Zombie Process Using the kill Function

Here is a simple recipe which should create a zombie:

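#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        /* never pass -1 to kill(): it would signal every process we can reach */
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        /* child: sleep until killed */
        while (1) pause();
    } else {
        /* parent: kill the child but never wait() for it */
        sleep(1);
        kill(pid, SIGKILL);
        printf("pid %d should be a zombie\n", (int)pid);
        while (1) pause();
    }
}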

The key is that the parent -- i.e. this program -- keeps running but does not do a wait() on the dying child.

Zombies are dead children that have not been waited for. If this program waited for its dead child, the child would go away and no longer be a zombie. If this program exited, the zombie child would be inherited by another process (probably init), which would probably do the wait, and the child would likewise go away.

As far as I know, the whole reason for zombies is that the dead child exited with an exit status, which somebody might want. But where Unix stores the exit status is in the empty husk of the dead process, and how you fetch a dead child's exit status is by waiting for it. So Unix is keeping the zombie around just to keep its exit status around just in case the parent wants it but hasn't gotten around to calling wait yet.

So it's actually kind of poetic: Unix's philosophy here is basically that no child's death should go unnoticed.

Killing zombie processes in Linux

A zombie process consumes only an entry in the process table. The kernel keeps that entry so that the parent's wait(2) system call (and its relatives) can tell that there really is a child to wait for, instead of failing as though no child processes existed. These walking dead exist to keep kernel data consistent and, as such, you cannot kill them (even as root). The only way for a living parent to avoid accumulating zombies is to make one wait(2) call for each fork() it has performed (which your code never does).

Since in your code the thread is going to die just after closing the file descriptor, you have a chance to call waitpid(pid_of_child, ...); there, so that you wait for the right child. See waitpid(2) for more information about this system call. This approach has a non-obvious drawback: your thread will last until the child dies.

Skipping wait() normally works with plain processes because the parent itself exits. Here it does not: the parent process lives on after the thread dies, so the fork()/wait() relationship stays in place. When a parent does die, the kernel makes init (the process with id 1) the parent of its children, and init(8) continually calls wait(2) for the orphaned children in the system.

You can do this by adding a waitpid() call right after the parent closes the connected socket:

    ...
    close(connfd); /* parent closes connected socket */
    int retcode; /* return code of child process */
    waitpid(childpid, &retcode, 0);
} /* for loop */

or, since you are not going to check how the child terminated:

    ...
    close(connfd); /* parent closes connected socket */
    waitpid(childpid, NULL, 0);
} /* for loop */

This has another drawback: you wait for the child to terminate and do not get back to the accept(2) system call until it does, which may not be what you want. If you want to avoid creating zombie children, another alternative (with drawbacks of its own) is to ignore the SIGCHLD signal in the whole process, which makes the kernel not create those zombies. This is the legacy way to do it; there are other ways to avoid zombie children. Or you can have a separate thread that just makes the needed wait() calls and dispatches the values returned from the children to the proper place once they die.


