The most reliable way to terminate a family of processes
You may want to perform the killing (possibly via a script) in a different login shell to ensure you're not accidentally stopping/killing the very shell/script attempting to do the overall killing before it completes its job :)
The first key strategy is not to terminate a process directly, but to:
- just "freeze" it first (with kill -STOP <pid>) to prevent it from spawning other children (needed to reliably determine its children, otherwise you'll miss some as explained in this Q&A: https://superuser.com/questions/927836/how-to-deal-with-a-memory-leaking-fork-bomb-on-linux/927967#927967)
- add it to the list of processes to terminate (later)
- find the list of its children
- repeat the whole procedure on the children, rinse and repeat
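The freeze-then-enumerate loop above could be sketched in Python roughly as follows. This is only an illustration, not the answer's own script: the helper names `list_children` and `freeze_tree` are made up here, and the child lookup assumes a `ps` that accepts the POSIX `-o pid= -o ppid=` options.

```python
import os
import signal
import subprocess


def list_children(pid):
    """Return the PIDs whose parent is `pid` (assumes a POSIX-style ps)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid="],
        capture_output=True, text=True, check=True,
    ).stdout
    kids = []
    for line in out.splitlines():
        fields = line.split()
        if len(fields) == 2 and int(fields[1]) == pid:
            kids.append(int(fields[0]))
    return kids


def freeze_tree(root, children_of=list_children, send=os.kill):
    """Send SIGSTOP to `root` and every descendant; return the frozen PIDs.

    Each process is stopped BEFORE its children are listed, so it cannot
    fork new children while we are looking.
    """
    frozen = []
    pending = [root]
    while pending:
        pid = pending.pop()
        try:
            send(pid, signal.SIGSTOP)
        except ProcessLookupError:
            continue  # the process exited before we reached it
        frozen.append(pid)
        pending.extend(children_of(pid))
    return frozen
```

The `children_of` and `send` parameters exist only so the walk can be tried against a fake process table without stopping anything real.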
Once the entire ancestry tree based on ppid is frozen, you can start locating and freezing ancestries based on process groups. You can still determine these process groups reliably as long as the parents of the processes which changed their process group are still alive (since their ppid is not changed). Add these groups to a list of pgids to be nuked and freeze any new ppid-based process subtrees you may find in these groups as above:
- if their parents are still alive they should be frozen already, as they're in the frozen ppid-based ancestry tree
- if they're orphans they will be killed when the entire pgid is nuked
Related processes can be discovered by session ID in a manner very similar to the one based on group ID (except killing needs to be done by pid as the kill cmd supports a group ID but not a session ID).
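As a rough illustration of the group and session lookup, the sketch below collects the process group IDs and session IDs of a set of already-frozen PIDs using `os.getpgid` and `os.getsid`. The function name `groups_and_sessions` is hypothetical.

```python
import os


def groups_and_sessions(pids):
    """Return (pgids, sids) seen among the given PIDs."""
    pgids, sids = set(), set()
    for pid in pids:
        try:
            pgids.add(os.getpgid(pid))
            sids.add(os.getsid(pid))
        except (ProcessLookupError, PermissionError):
            continue  # process is gone or not inspectable by this user
    return pgids, sids
```

The resulting pgids can later be signalled as groups, while processes found only through a session ID still have to be signalled one PID at a time.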
Another way to find potentially related processes would be by their tty, if they have one. But take care: they might not be descendants of the process you want to kill but ancestors or siblings. You can still freeze the ppid-based subtrees and groups you find this way while you investigate; you can always "thaw" them later (with kill -CONT <pid>) if they don't need to be killed.
I don't know how to locate descendant process subtrees decoupled by processes declaring themselves session leaders (thus changing both their sid and pgid) if their parents died and they have no pty.
Once the entire list of subtrees is frozen, the processes can be killed (by pid or pgid as needed) or thawed to continue their work if desired.
What's the best way to send a signal to all members of a process group?
You don't say if the tree you want to kill is a single process group. (This is often the case if the tree is the result of forking from a server start or a shell command line.) You can discover process groups using GNU ps as follows:
ps x -o "%p %r %y %x %c "
If it is a process group you want to kill, just use the kill(1) command but instead of giving it a process number, give it the negation of the group number. For example, to kill every process in group 5112, use kill -TERM -- -5112.
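For comparison, the same group-wide signal can be sent from Python with `os.killpg`. The sketch below is only a demonstration under assumptions: it starts a throwaway `sleep 30` in its own session (and therefore its own process group) and then terminates that group; the names `kill_group` and `demo` are made up for this example.

```python
import os
import signal
import subprocess


def kill_group(pgid, sig=signal.SIGTERM):
    """Signal every process in group `pgid`, like `kill -TERM -- -<pgid>`."""
    os.killpg(pgid, sig)


def demo():
    # start_new_session=True makes the child a session and group leader,
    # so its pgid differs from ours and we do not signal ourselves.
    child = subprocess.Popen(["sleep", "30"], start_new_session=True)
    pgid = os.getpgid(child.pid)
    kill_group(pgid)
    # A negative return code means "terminated by that signal number".
    return child.wait(timeout=5)
```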
How to kill all child processes after parent process termination?
Here's a probably more portable solution.
The fork(2) system call returns the PID of each child process to the parent. You can store these PIDs and then use kill(2) to send a signal to the children and terminate them.
Notice that sending SIGKILL and SIGTERM may require some privileges of the parent process. If it doesn't have such privileges, you can send a SIGCONT to the child process, and modify the SIGCONT signal handler in your child process.
Warning: calling exit() from a signal handler is not safe. I've just checked the manual (man 7 signal) and found that it is not async-signal-safe. You can use _exit, _Exit or abort instead.
Some pseudo code:
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

#define NUM_CHILDREN 6

/* SIGCONT handler: _exit() is async-signal-safe, exit() is not. */
void handler(int sig)
{
    (void)sig;
    _exit(0);
}

int main(void)
{
    pid_t children[NUM_CHILDREN];

    for (int i = 0; i < NUM_CHILDREN; i++) {
        children[i] = fork();
        if (children[i] < 0) {
            perror("fork");
            exit(1);
        }
        if (children[i] == 0) {
            /* child: exit as soon as the parent sends SIGCONT */
            signal(SIGCONT, handler);
            printf("Started [son] pid %d from [parent] pid %d\n", (int)getpid(), (int)getppid());
            sleep(10); /* child waits 10 seconds, then exits */
            printf("Exited [son] pid %d from [parent] pid %d\n", (int)getpid(), (int)getppid());
            exit(0);
        }
    }

    /* parent */
    sleep(5); /* parent waits 5 seconds, then signals every child */
    for (int i = 0; i < NUM_CHILDREN; i++)
        kill(children[i], SIGCONT);
    printf("Parent terminated\n");
    exit(0);
}
How does Ctrl-C terminate a child process?
Signals by default are handled by the kernel. Old Unix systems had 15 signals; now they have more. You can check /usr/include/signal.h (or kill -l). CTRL+C sends the signal named SIGINT.
The default action for handling each signal is defined in the kernel too, and usually it terminates the process that received the signal.
All signals (except SIGKILL and SIGSTOP) can be handled by the program.
And this is what the shell does:
- When the shell is running in interactive mode, it has special signal handling for this mode.
- When you run a program, for example find, the shell:
  - forks itself
  - and for the child, sets the default signal handling
  - replaces the child with the given command (e.g. with find)
- When you press CTRL+C, the parent shell handles this signal but the child will receive it too, with the default action: terminate. (The child can implement signal handling too.)
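The fork, reset, exec sequence described in the list can be sketched in Python as below. This is a simplified illustration, not how any particular shell is implemented; `run_like_shell` is a hypothetical name, and job control and terminal handling are left out entirely.

```python
import os
import signal


def run_like_shell(argv):
    """Fork, restore default SIGINT handling in the child, exec the command.

    Returns the exit status, or the negated signal number if the command
    was terminated by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child: whatever the parent did with SIGINT, go back to the default.
        signal.signal(signal.SIGINT, signal.SIG_DFL)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: wait for the command to finish.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return os.WEXITSTATUS(status)
```

With this sketch, a command that receives SIGINT dies from it by default, while the parent keeps whatever SIGINT handling it had installed.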
You can trap signals in your shell script too... And you can set signal handling for your interactive shell too: try entering this at the top of your ~/.profile. (Ensure that you're already logged in and test it with another terminal, as you can lock yourself out.)
trap 'echo "Dont do this"' 2
Now, every time you press CTRL+C in your shell, it will print a message. Don't forget to remove the line!
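A program can install the same kind of handler for itself. As a loose Python analogue of the trap line (an illustration only, with the hypothetical helper name `install_sigint_message`), the sketch below records a message each time SIGINT arrives instead of letting the process terminate. It assumes it is called from the main thread, which is where Python allows signal handlers to be installed.

```python
import signal


def install_sigint_message(log):
    """Replace the SIGINT handler with one that appends a message to `log`.

    Returns the previous handler so the caller can restore it later.
    """
    def handler(signum, frame):
        log.append("Dont do this")

    return signal.signal(signal.SIGINT, handler)
```

Restoring the returned handler with `signal.signal(signal.SIGINT, previous)` plays the same role as removing the trap line again.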
If interested, you can check the plain old /bin/sh signal handling in the source code here.
There was some misinformation in the comments above (now deleted), so if anyone is interested, here is a very nice link on how signal handling works.
What exactly is Python multiprocessing Module's .join() Method Doing?
The join() method, when used with threading or multiprocessing, is not related to str.join(): it's not actually concatenating anything together. Rather, it just means "wait for this [thread/process] to complete". The name join is used because the multiprocessing module's API is meant to look similar to the threading module's API, and the threading module uses join for its Thread object. Using the term join to mean "wait for a thread to complete" is common across many programming languages, so Python just adopted it as well.
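A minimal sketch of join() as "wait for completion" follows. The body of `say_hello` is an assumption (the original function is not shown here), `run_and_join` is a hypothetical helper, and the "fork" start method is requested explicitly, so this only applies to Unix-like systems.

```python
import multiprocessing as mp
import time


def say_hello(delay=0.2):
    # Stand-in worker: the real say_hello is not shown, so this just sleeps.
    time.sleep(delay)


def run_and_join():
    ctx = mp.get_context("fork")  # assumption: a platform that supports fork
    p = ctx.Process(target=say_hello)
    p.start()
    alive_before = p.is_alive()   # the child is still sleeping at this point
    p.join()                      # block until the child has finished
    return alive_before, p.is_alive(), p.exitcode
```

After join() returns, the process is no longer alive and its exitcode is available, which is the whole effect of the call.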
Now, the reason you see the 20 second delay both with and without the call to join() is that by default, when the main process is ready to exit, it will implicitly call join() on all running multiprocessing.Process instances. This isn't as clearly stated in the multiprocessing docs as it should be, but it is mentioned in the Programming Guidelines section:
Remember also that non-daemonic processes will be automatically be
joined.
You can override this behavior by setting the daemon flag on the Process to True prior to starting the process:
p = Process(target=say_hello)
p.daemon = True
p.start()
# Both parent and child will exit here, since the main process has completed.
If you do that, the child process will be terminated as soon as the main process completes:
daemon
The process's daemon flag, a Boolean value. This must be set before start() is called. The initial value is inherited from the creating process. When a process exits, it attempts to terminate all of its daemonic child processes.
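To make the ordering concrete, here is a small sketch that sets the daemon flag before start(). The worker `idle` and the helper `start_daemon_child` are made up for illustration, and the "fork" start method is again an assumption that limits this to Unix-like systems.

```python
import multiprocessing as mp
import time


def idle():
    # Stand-in worker that would outlive the parent if it were not daemonic.
    time.sleep(60)


def start_daemon_child():
    ctx = mp.get_context("fork")  # assumption: a platform that supports fork
    p = ctx.Process(target=idle)
    p.daemon = True               # must happen before start()
    p.start()
    return p
```

A process started this way is not implicitly joined when the main process exits; multiprocessing attempts to terminate it instead, as the quoted documentation says.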