Bash concurrent jobs gets stuck
Here, we have up to 6 parallel bash processes calling download_data, each of which is passed up to 16 URLs per invocation. Adjust per your own tuning.
Note that this expects both bash (for exported function support) and GNU xargs.
#!/usr/bin/env bash
# ^^^^- not /bin/sh
download_data() {
  echo "link #$2 [$1]" # TODO: replace this with a job that actually takes some time
}
export -f download_data
<input.txt xargs -d $'\n' -P 6 -n 16 -- bash -c 'i=0; for arg; do download_data "$arg" "$((++i))"; done' _
Bash: limit the number of concurrent jobs?
If you have GNU Parallel (http://www.gnu.org/software/parallel/) installed you can do this:
parallel gzip ::: *.log
which will run one gzip per CPU core until all logfiles are gzipped.
If it is part of a larger loop, you can use sem instead:
for i in *.log ; do
  echo "$i"   # Do more stuff here
  sem -j+0 gzip "$i" ";" echo done
done
sem --wait
It will do the same, but give you a chance to do more stuff for each file.
If GNU Parallel is not packaged for your distribution, you can install it simply by:
$ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
fetch -o - http://pi.dk/3 ) > install.sh
$ sha1sum install.sh | grep 883c667e01eed62f975ad28b6d50e22a
12345678 883c667e 01eed62f 975ad28b 6d50e22a
$ md5sum install.sh | grep cc21b4c943fd03e93ae1ae49e28573c0
cc21b4c9 43fd03e9 3ae1ae49 e28573c0
$ sha512sum install.sh | grep da012ec113b49a54e705f86d51e784ebced224fdf
79945d9d 250b42a4 2067bb00 99da012e c113b49a 54e705f8 6d51e784 ebced224
fdff3f52 ca588d64 e75f6033 61bd543f d631f592 2f87ceb2 ab034149 6df84a35
$ bash install.sh
It will download, check signature, and do a personal installation if it cannot install globally.
Watch the intro videos for GNU Parallel to learn more:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
bash script to run a constant number of jobs in the background
With GNU xargs:
printf '%s\0' j{1..6} | xargs -0 -n1 -P3 sh -c './"$1"' _
With bash (4.x) builtins:
max_jobs=3; cur_jobs=0
for ((i=0; i<6; i++)); do
  # If we're already at the limit, wait for a background job to finish first.
  ((cur_jobs >= max_jobs)) && wait -n
  # Start the next job and increment the count of running jobs.
  ./j"$i" & ((++cur_jobs))
done
wait
Note that the approach relying on builtins has some corner cases: if multiple jobs exit at the exact same time, a single wait -n can reap several of them, effectively consuming multiple slots. If we wanted to be more robust, we might end up with something like the following:
max_jobs=3
declare -A cur_jobs=( ) # build an associative array w/ PIDs of jobs we started
for ((i=0; i<6; i++)); do
  if (( ${#cur_jobs[@]} >= max_jobs )); then
    wait -n # wait for at least one job to exit
    # ...and then remove any jobs that aren't running from the table
    for pid in "${!cur_jobs[@]}"; do
      kill -0 "$pid" 2>/dev/null || unset "cur_jobs[$pid]"
    done
  fi
  ./j"$i" & cur_jobs[$!]=1
done
wait
...which is obviously a lot of work, and still has a minor race. Consider using xargs -P instead. :)
How do you run multiple programs in parallel from a bash script?
To run multiple programs in parallel:
prog1 &
prog2 &
If you need your script to wait for the programs to finish, you can add:
wait
at the point where you want the script to wait for them.
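As a minimal, self-contained sketch (the sleep commands here are stand-ins for real programs):

```shell
#!/usr/bin/env bash
# "prog1"/"prog2" are simulated by sleep; swap in your real commands.
sleep 0.2 & pid1=$!
sleep 0.1 & pid2=$!
wait "$pid1" "$pid2"   # blocks until both background jobs have finished
echo "both done"
```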
Run no more than 4 processes at a time in parallel in shell
Install the moreutils package in Ubuntu, then use the parallel utility:
parallel -j 4 ./sim -r -- 1 2 3 4 5 6 7 8 ...
How do I kill background processes / jobs when my shell script exits?
To clean up some mess, trap can be used. It lets you provide a list of commands to be executed when a specific signal arrives:
trap "echo hello" SIGINT
but can also be used to execute something if the shell exits:
trap "killall background" EXIT
It's a builtin, so help trap will give you information (works with bash). If you only want to kill background jobs, you can do
trap 'kill $(jobs -p)' EXIT
Be careful to use single quotes ('), to prevent the shell from substituting the $() immediately.
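Putting it together, a minimal sketch (the sleep jobs are placeholders for real background work):

```shell
#!/usr/bin/env bash
# Kill any still-running background jobs when this script exits, for any reason.
trap 'kill $(jobs -p) 2>/dev/null' EXIT

sleep 60 &   # placeholder background job
sleep 60 &   # placeholder background job

echo "exiting now; the EXIT trap kills both sleeps"
```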
Wait for bash background jobs in script to be finished
There's a bash builtin command for that.
wait [n ...]
       Wait for each specified process and return its termination status.
       Each n may be a process ID or a job specification; if a job spec
       is given, all processes in that job's pipeline are waited for. If
       n is not given, all currently active child processes are waited
       for, and the return status is zero. If n specifies a non-existent
       process or job, the return status is 127. Otherwise, the return
       status is the exit status of the last process or job waited for.
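For example, wait reports a background job's own exit status through $?:

```shell
#!/usr/bin/env bash
(exit 3) &        # background job that exits with status 3
pid=$!
wait "$pid"
echo "status: $?" # prints: status: 3
```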
Exit a bash script if an error occurs in it or any of the background jobs it creates
Collect the PIDs of the background jobs; then use wait to collect the exit status of each, exiting the first time any of those waits returns a nonzero status.
install_pids=( )
for dir in ./projects/**/; do   # note: ** requires shopt -s globstar in bash
  (cd "$dir" && exec npm install) & install_pids+=( $! )
done
for pid in "${install_pids[@]}"; do
  wait "$pid" || exit
done
The above, while simple, has a caveat: If an item late in the list exits nonzero prior to items earlier in the list, this won't be observed until that point in the list is polled. To work around this caveat, you can repeatedly iterate through the entire list:
install_pids=( )
for dir in ./projects/**/; do   # note: ** requires shopt -s globstar in bash
  (cd "$dir" && exec npm install) & install_pids+=( $! )
done
while (( ${#install_pids[@]} )); do
  for pid_idx in "${!install_pids[@]}"; do
    pid=${install_pids[$pid_idx]}
    if ! kill -0 "$pid" 2>/dev/null; then # kill -0 checks for process existence
      # we know this pid has exited; retrieve its exit status
      wait "$pid" || exit
      unset "install_pids[$pid_idx]"
    fi
  done
  sleep 1 # in bash, consider a shorter non-integer interval, e.g. 0.2
done
However, because this polls, it incurs extra overhead. That can be avoided by trapping SIGCHLD and consulting jobs -n (which lists only jobs whose status changed since it was last checked) when the trap fires.