How to Kill Zombie Processes Created by Multiprocessing Module

How to kill zombie processes created by the multiprocessing module?

A couple of things:

  1. Make sure the parent joins its children to avoid zombies. See Python Multiprocessing Kill Processes below.

  2. You can check whether a child is still running with the is_alive() method. See http://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process. Both points are shown in the sketch after this list.
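
A minimal sketch of both points, assuming a placeholder worker function work():

import time
import multiprocessing as mp

def work():
    time.sleep(2)

if __name__ == '__main__':
    p = mp.Process(target=work)
    p.start()
    print(p.is_alive())   # True while the child is still running
    p.join()              # joining reaps the child so it never lingers as a zombie
    print(p.is_alive())   # False once the child has been joined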

Python Multiprocessing leading to many zombie processes

The most common problem is that a pool is created but never closed.

The best way I know to guarantee that the pool is closed is to use a try/finally block:

from multiprocessing import Pool

# Create the pool outside the try block so that a failure in Pool() does not
# trigger a NameError in the finally clause.
pool = Pool(ncores)
try:
    pool.map(yourfunction, arguments)
finally:
    pool.close()
    pool.join()
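
On Python 3.3 and later, the pool can also be used as a context manager. Note that leaving the with block calls terminate() rather than close(), so collect any results you need inside the block (ncores, yourfunction and arguments are the same placeholders as above):

from multiprocessing import Pool

with Pool(ncores) as pool:
    # __exit__ calls terminate(), not close(), so use the results here
    results = pool.map(yourfunction, arguments)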

If you don't want to struggle with multiprocessing, I wrote a simple package named parmap that wraps multiprocessing to make my life (and potentially yours) easier.

pip install parmap

import parmap
parmap.map(yourfunction, arguments)

From the parmap usage section:

  • Simple parallel example:

    import parmap
    y1 = [myfunction(x, argument1, argument2) for x in mylist]
    y2 = parmap.map(myfunction, mylist, argument1, argument2)
    y1 == y2
  • Iterating over a list of tuples:

    # You want to do:
    z = [myfunction(x, y, argument1, argument2) for (x, y) in mylist]
    # In parallel:
    z = parmap.starmap(myfunction, mylist, argument1, argument2)

    # You want to do:
    listx = [1, 2, 3, 4, 5, 6]
    listy = [2, 3, 4, 5, 6, 7]
    param1 = 3.14
    param2 = 42
    listz = []
    for (x, y) in zip(listx, listy):
        listz.append(myfunction(x, y, param1, param2))
    # In parallel:
    listz = parmap.starmap(myfunction, zip(listx, listy), param1, param2)

Zombie state multiprocessing library python3

Per this thread, Marko Rauhamaa writes:

If you don't care to know when child processes exit, you can simply ignore the SIGCHLD signal:

import signal
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

That will prevent zombies from appearing.

The wait(2) man page explains:

POSIX.1-2001 specifies that if the disposition of SIGCHLD is set to
SIG_IGN or the SA_NOCLDWAIT flag is set for SIGCHLD (see
sigaction(2)), then children that terminate do not become zombies and
a call to wait() or waitpid() will block until all children have
terminated, and then fail with errno set to ECHILD. (The original
POSIX standard left the behavior of setting SIGCHLD to SIG_IGN
unspecified. Note that even though the default disposition of
SIGCHLD is "ignore", explicitly setting the disposition to SIG_IGN
results in different treatment of zombie process children.)

Linux 2.6 conforms to the POSIX requirements. However, Linux 2.4
(and earlier) does not: if a wait() or waitpid() call is made while
SIGCHLD is being ignored, the call behaves just as though SIGCHLD
were not being ignored, that is, the call blocks until the next child
terminates and then returns the process ID and status of that child.

So if you are using Linux 2.6 or a POSIX-compliant OS, the code above will allow child processes to exit without becoming zombies. If you are not on a POSIX-compliant OS, the thread above offers a number of options. Below is one alternative, somewhat similar to Marko Rauhamaa's third suggestion.
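
Here is a minimal sketch of that approach, with a placeholder worker work() and arbitrary timings; ignoring SIGCHLD before starting the children lets the kernel reap them as soon as they exit:

import signal
import time
import multiprocessing as mp

def work():
    time.sleep(1)

if __name__ == '__main__':
    # Ignore SIGCHLD so exited children are reaped automatically
    # (POSIX-compliant systems / Linux 2.6+, as described above).
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    for _ in range(5):
        mp.Process(target=work).start()
    time.sleep(5)  # exited children should not show up as <defunct> in ps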


If for some reason you need to know when child processes exit and want to handle at least some of them differently, you can set up a queue that lets the children signal the main process when they are done. The main process can then call the appropriate join() in the order in which it receives items from the queue:

import time
import multiprocessing as mp

def exe(i, q):
    try:
        print(i)
        if i == 1:
            time.sleep(10)
        elif i == 10:
            raise Exception('I quit')
        else:
            time.sleep(3)
    finally:
        q.put(mp.current_process().name)

if __name__ == '__main__':
    procs = dict()
    q = mp.Queue()
    for i in range(1, 20):
        proc = mp.Process(target=exe, args=(i, q))
        proc.start()
        procs[proc.name] = proc

    while procs:
        name = q.get()
        proc = procs[name]
        print(proc)
        proc.join()
        del procs[name]

    print("finished")

yields a result like

...    
<Process(Process-10, stopped[1])> # <-- process with exception still gets joined
19
<Process(Process-2, started)>
<Process(Process-4, stopped)>
<Process(Process-6, started)>
<Process(Process-5, stopped)>
<Process(Process-3, stopped)>
<Process(Process-9, started)>
<Process(Process-7, stopped)>
<Process(Process-8, started)>
<Process(Process-13, started)>
<Process(Process-12, stopped)>
<Process(Process-11, stopped)>
<Process(Process-16, started)>
<Process(Process-15, stopped)>
<Process(Process-17, stopped)>
<Process(Process-14, stopped)>
<Process(Process-18, started)>
<Process(Process-19, stopped)>
<Process(Process-1, started)> # <-- Process-1 ends last
finished

Python Multiprocessing Kill Processes

You need to .join() your worker processes, which blocks the calling application until all of them have finished, or run them in daemon mode so they are killed whenever the parent is killed.

http://forums.xkcd.com/viewtopic.php?f=11&t=94726
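
A minimal sketch of both options (work() is a placeholder worker; the daemon keyword argument is Python 3 syntax):

import time
import multiprocessing as mp

def work(seconds):
    time.sleep(seconds)

if __name__ == '__main__':
    # Option 1: join the workers. The parent blocks until they all finish,
    # and joining reaps each child so none is left as a zombie.
    workers = [mp.Process(target=work, args=(1,)) for _ in range(3)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    # Option 2: daemon workers are terminated automatically when the parent
    # exits, so they can never outlive it (but they also cannot be left
    # running to completion).
    d = mp.Process(target=work, args=(60,), daemon=True)
    d.start()
    # The parent exits here and the daemon child is killed.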

end daemon processes with multiprocessing module

http://docs.python.org/2/library/multiprocessing.html#the-process-class

http://www.python.org/dev/peps/pep-3143/#correct-daemon-behaviour


