Multithreading semaphore for bash script (sub-processes)
Following the recommendation by @Mark Setchell, I used GNU Parallel to replace the loop (in a simulated cron environment; see https://stackoverflow.com/a/2546509/8236733) with
bcpexport() {
    filename=$1
    TO_SERVER_ODBCDSN=$2
    DB=$3
    TABLE=$4
    USER=$5
    PASSWORD=$6
    RECOMMEDED_IMPORT_MODE=$7
    DELIMITER=$8 # DO NOT use a format like "'\t'"; nested quotes seem to cause hard-to-catch errors
    <same code from original loop>
}
export -f bcpexport
parallel -j 10 bcpexport \
    ::: $DATAFILES/$TARGET_GLOB \
    ::: "$TO_SERVER_ODBCDSN" \
    ::: "$DB" \
    ::: "$TABLE" \
    ::: "$USER" \
    ::: "$PASSWORD" \
    ::: "$RECOMMEDED_IMPORT_MODE" \
    ::: "$DELIMITER"
to run at most 10 jobs at a time, where $DATAFILES/$TARGET_GLOB is a glob (e.g. "$storagedir/tsv/*.tsv") that expands to all of the files in the desired directory. Each of the remaining ::: inputs holds a single fixed value, so every file returned by the glob is paired with the same fixed arguments. (The $TO_SERVER_ODBCDSN variable is actually "-D -S <some ODBC DSN>", so it needs quotes to be passed as a single argument.) So if the $DATAFILES/$TARGET_GLOB glob returns files A, B, C, ..., we end up running the commands
bcpexport A "$TO_SERVER_ODBCDSN" $DB ...
bcpexport B "$TO_SERVER_ODBCDSN" $DB ...
bcpexport C "$TO_SERVER_ODBCDSN" $DB ...
...
in parallel. Another nice thing about using parallel is:
GNU parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially.
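To see how the ::: inputs combine, here is a minimal sketch that swaps the real bcpexport for plain echo and uses placeholder values (A, B, C, dsn and mydb are made up for the demo, not taken from the original setup). It assumes GNU Parallel is installed and skips itself otherwise; -k asks for the output to be kept in input order.

```shell
#!/bin/bash
# Sketch only: echo stands in for bcpexport; A, B, C, dsn and mydb are placeholders.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # One varying input (A B C) and two single-valued inputs:
    # parallel runs one job per combination, i.e. one job per "file".
    out=$(parallel -k -j 2 echo ::: A B C ::: dsn ::: mydb)
    printf '%s\n' "$out"
else
    out=skipped
    echo "GNU Parallel not found; skipping demo" >&2
fi
```

Each output line carries one of A, B, C followed by the same two fixed values, which mirrors how every data file gets the same DSN, database and so on.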
How to create Multiple Threads in Bash Shell Script
That's not a thread, but a background process. They are similar but:
So, effectively we can say that threads and light weight processes are same.
The main difference between a light weight process (LWP) and a normal process is that LWPs share same address space and other resources like open files etc. As some resources are shared so these processes are considered to be light weight as compared to other normal processes and hence the name light weight processes.
NB: Reordered for clarity
What are Linux Processes, Threads, Light Weight Processes, and Process State
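The practical consequence of that difference shows up in a short sketch: a background process gets its own copy of the shell's variables, so an assignment made there never reaches the parent. The variable name counter is made up for the demo.

```shell
#!/bin/bash
counter=0

# The braces run in a background subshell, i.e. a separate process
# with its own copy of the parent's variables.
{ counter=99; echo "child sees:  $counter"; } &
wait

echo "parent sees: $counter"   # still 0: the child's change is not shared
```

With real threads the write would have been visible to both sides, which is why results from background jobs usually travel back through files, pipes or exit statuses.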
You can see the running background process using the jobs command. E.g.:
nick@nick-lt:~/test/npm-test$ sleep 1000 &
[1] 23648
nick@nick-lt:~/test/npm-test$ jobs
[1]+  Running                 sleep 1000 &
You can bring them to the foreground using fg:
nick@nick-lt:~/test/npm-test$ fg 1
sleep 1000
where the cursor will wait until the sleep time has elapsed. You can pause the job when it's in the foreground (as in the scenario after fg 1) by pressing CTRL-Z (SIGTSTP), which gives something like this:
[1]+ Stopped sleep 1000
and resume it by typing:
bg 1 # Resumes in the background
fg 1 # Resumes in the foreground
and you can kill it by pressing CTRL-C (SIGINT) when it's in the foreground, which interrupts the process, or by using the kill command with the job ID prefixed by %:
kill %1 # Or kill <PID>
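The fg, bg and CTRL-Z commands above are meant for an interactive terminal. Inside a script, a rough equivalent (a sketch, not part of the original answer) is to remember the PID from $! and use kill and wait on it:

```shell
#!/bin/bash
sleep 1000 &
pid=$!                      # PID of the most recent background command

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then
    echo "job $pid is running"
fi

kill "$pid"                 # default signal is SIGTERM (15)
wait "$pid"                 # reap it and collect its exit status
status=$?
echo "exit status: $status" # 128 + 15 = 143 for a process ended by SIGTERM
```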
Onto your implementation:
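BROWSERS=
# Walk the arguments with shift so that -b is found wherever it appears
while [ $# -gt 0 ]; do
    case $1 in
        -b)
            shift
            BROWSERS="$1"
            ;;
        *)
            ;;
    esac
    shift
done
# Split the comma-separated list into an array
IFS=',' read -r -a SPLITBROWSERS <<< "$BROWSERS"
for browser in "${SPLITBROWSERS[@]}"
do
    echo "Running ${browser}..."
    $browser &
done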
Can be called as:
./runtests.sh -b firefox,chrome,ie
Tadaaa.
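One thing the loop above does not do is report which browser runs failed. A possible extension, sketched here with a hypothetical run_suite function standing in for the real browser commands, records each PID and checks the status that wait returns for it:

```shell
#!/bin/bash
# Hypothetical stand-in for a real browser test command:
# pretend every suite passes except "ie".
run_suite() { [ "$1" != "ie" ]; }

IFS=',' read -r -a browsers <<< "firefox,chrome,ie"

pids=()
for b in "${browsers[@]}"; do
    run_suite "$b" &
    pids+=("$!")            # remember the PID that belongs to this browser
done

failed=()
for i in "${!pids[@]}"; do
    # wait <pid> returns that job's exit status
    wait "${pids[$i]}" || failed+=("${browsers[$i]}")
done

echo "failed: ${failed[*]}"
```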
Multithreading in Bash
Sure, just add & after the command:
read_cfg cfgA &
read_cfg cfgB &
read_cfg cfgC &
wait
All those jobs will then run in the background simultaneously. The optional wait command will then wait for all the jobs to finish.
Each command will run in a separate process, so it's technically not "multithreading", but I believe it solves your problem.
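A quick way to confirm that the jobs really overlap is to time them. In this sketch read_cfg is a hypothetical stand-in that just sleeps for one second; the real function would do the actual config loading.

```shell
#!/bin/bash
# Hypothetical stand-in for the real read_cfg: pretend each config takes 1 second.
read_cfg() { sleep 1; echo "loaded $1"; }

SECONDS=0                   # bash's built-in elapsed-time counter
read_cfg cfgA &
read_cfg cfgB &
read_cfg cfgC &
wait                        # returns once all three have finished
elapsed=$SECONDS
echo "elapsed: ${elapsed}s" # roughly 1s rather than the 3s a sequential run needs
```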
How do I use parallel programming/multi threading in my bash script?
The simplest way is to execute the commands in the background, by adding & to the end of the command:
#!/bin/bash
# Script to loop through directories and merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest
for f in "$sourcedir"/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    # Each pipeline runs in the background, so R1 and R2 are merged concurrently
    zcat "$f"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz" &
    zcat "$f"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz" &
done
wait # block until every background merge has finished
From the bash manual:
If a command is terminated by the control operator ‘&’, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background. The shell does not wait for the command to finish, and the return status is 0 (true). When job control is not active (see Job Control), the standard input for asynchronous commands, in the absence of any explicit redirections, is redirected from /dev/null.
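The "return status is 0" part of that quote is easy to trip over: $? right after a backgrounded command says nothing about whether the command succeeded. A small sketch (the variable names are made up) shows that the real status only arrives through wait:

```shell
#!/bin/bash
false &                     # a command that always fails, started in the background
launch_status=$?            # status of launching it: always 0
wait "$!"                   # wait for that specific job...
real_status=$?              # ...to get its actual exit status: 1
echo "launch: $launch_status, real: $real_status"
```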
bash while loop threading
You can send tasks to the background with &. If you intend to wait for all of them to finish, you can use the wait command:
process_to_background &
echo Processing ...
wait
echo Done
You can get the pid of a given task started in the background if you want to wait for one (or a few) specific tasks:
important_process_to_background &
important_pid=$!
for i in {1..10}; do
    less_important_process_to_background "$i" &
done
wait $important_pid
echo Important task finished
wait
echo All tasks finished
One note though: the background processes can garble the output, since they run asynchronously and their lines may interleave. You might want to use a named pipe to collect the output from them.
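A named pipe is one option. A simpler alternative, sketched here with made-up task names, is to give each background task its own temporary file and print the files in order once everything has finished:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)         # private scratch directory for the per-task output

for i in 1 2 3; do
    # Each task writes only to its own file, so lines cannot interleave.
    { echo "task $i: start"; echo "task $i: end"; } > "$tmpdir/$i.out" &
done
wait

# Replay the output in task order, whatever order the tasks finished in.
combined=$(cat "$tmpdir/1.out" "$tmpdir/2.out" "$tmpdir/3.out")
printf '%s\n' "$combined"
rm -r "$tmpdir"
```

The trade-off is that nothing is shown until all tasks are done, whereas a pipe can stream results as they arrive.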
edit
As asked in the comments there might be a need for limiting the background processes forked. In this case you can keep track of how many background processes you've started and communicate with them through a named pipe.
mkfifo tmp # creating named pipe
exec 3<>tmp # keep it open read-write (works on Linux) so no 'done' token is lost
counter=0
while read -r ip
do
    if [ "$counter" -lt 10 ]; then # we are under the limit
        counter=$((counter + 1))
    else
        read -r -u 3 x # waiting for a process to finish
    fi
    { check "$ip"; echo 'done' >&3; } &
done
wait # let all the background processes end
exec 3>&- # close the pipe
rm tmp # remove fifo
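On bash 4.3 or newer, the same limit can be kept without a named pipe by using wait -n, which blocks until any one background job exits. This is a sketch under that version assumption; check here is a hypothetical stand-in, and the addresses and the limit of 3 are made up for the demo.

```shell
#!/bin/bash
# Sketch only: requires bash 4.3+ for `wait -n`.
max_jobs=3
log=$(mktemp)

# Hypothetical stand-in for the real check command.
check() { sleep 0.2; echo "checked $1" >> "$log"; }

running=0
for ip in 10.0.0.{1..8}; do
    if (( running >= max_jobs )); then
        wait -n             # block until any one background job exits
        running=$((running - 1))
    fi
    check "$ip" &
    running=$((running + 1))
done
wait                        # let the last jobs finish

checked=$(wc -l < "$log")
echo "checked $checked addresses"
rm -f "$log"
```

The counter only ever goes down after a job has actually been reaped, so the number of running jobs never exceeds max_jobs.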