Linux Batch Jobs in Parallel

linux batch jobs in parallel

You could check how many are currently running and start more if there are fewer than 7:

while true; do
    if [ "$(ps ax -o comm | grep process-name | wc -l)" -lt 7 ]; then
        process-name &
    fi
    sleep 1
done
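
A variant of the same polling idea, assuming pgrep is available (it usually is, as part of procps), keeps the count a little tidier:

while true; do
    # pgrep -x -c counts processes whose name matches exactly, without the
    # grep-pipeline caveats of the version above.
    if [ "$(pgrep -x -c process-name)" -lt 7 ]; then
        process-name &
    fi
    sleep 1
done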

How do you run multiple programs in parallel from a bash script?

To run multiple programs in parallel:

prog1 &
prog2 &

If you need your script to wait for the programs to finish, you can add:

wait

at the point where you want the script to wait for them.
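
For instance, a minimal sketch (prog1 and prog2 are placeholders for your own programs) that also records each program's exit status:

#!/bin/bash
prog1 &
pid1=$!
prog2 &
pid2=$!

# wait <pid> returns that process's exit status, so failures can be detected.
wait "$pid1"; status1=$?
wait "$pid2"; status2=$?
echo "prog1 exited with $status1, prog2 exited with $status2"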

Run several jobs in parallel, efficiently

As Mark Setchell says: GNU Parallel.

find scripts/ -type f | parallel

If you insist on keeping 8 CPUs free (with -j-8, GNU Parallel runs the number of cores minus 8 jobs, i.e. 40 at a time on this 48-core machine):

find scripts/ -type f | parallel -j-8

But usually it is more efficient simply to use nice as that will give you all 48 cores when no one else needs them:

find scripts/ -type f | nice -n 15 parallel
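
If you want an explicit job limit and a record of what ran, something along these lines also works (the -j8 limit, the run.log name, and running each file through bash are illustrative assumptions, not part of the original command):

# --joblog records start time, runtime and exit value for each job.
find scripts/ -type f | parallel -j8 --joblog run.log bash {}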

To learn more:

  • Watch the intro video for a quick introduction:
    https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
  • Walk through the tutorial (man parallel_tutorial). Your command line
    will love you for it.

parallel running of jobs in unix

The cd commands don't seem to be a good idea; you don't need to cd into a directory just to put files into it. You'll probably also want to append the date information to the output file, rather than always clobber it. It also seems more likely that you'd copy the map files to the directory you just created. So, you might write:

(mkdir flex01; cp *.map flex01; echo "Job 1: $(date)" >> out) &
(mkdir flex02; cp *.map flex02; echo "Job 2: $(date)" >> out) &
(mkdir flex03; cp *.map flex03; echo "Job 3: $(date)" >> out) &
(mkdir flex04; cp *.map flex04; echo "Job 4: $(date)" >> out) &

wait

This runs each sequence of commands as a separate background job, and then waits for them all to finish before proceeding. You could look at using a loop for this task, too.

for n in $(seq 1 4)
do
    (mkdir flex0$n; cp *.map flex0$n; echo "Job $n: $(date)" >> out) &
done

You could also consider using mkdir -p flex01 so you don't get error messages when trying to create a directory that already exists. (Or you could test for errors and not copy if it exists, or test for existence before running mkdir, or clean it out before copying if it already exists, or ...)
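
A sketch of how the -p suggestion fits into the loop above (same directory names and output file as before):

for n in $(seq 1 4)
do
    # mkdir -p stays quiet when flex0$n already exists; swap in
    # [ -d "flex0$n" ] || mkdir "flex0$n" if you'd rather skip existing dirs.
    (mkdir -p "flex0$n"; cp *.map "flex0$n"; echo "Job $n: $(date)" >> out) &
done
wait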

Parallelise a Shell script without waiting for a batch to finish

The desired queue behavior (though not necessarily the CPU assignment) can be achieved with command grouping, like so:

{ ./hello < params1.txt && ./hello < params3.txt ; } &
{ ./hello < params2.txt && ./hello < params4.txt ; }

Demo of something like the above:

{ { echo a && sleep 2 && echo b ; } & 
{ echo c && sleep 1 && echo d ; } } | tr '\n' ' ' ; echo

Output:

a c d b 
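
If you also want the CPU assignment that the grouping alone does not give you, one option (a sketch assuming Linux with the taskset utility; the core numbers are arbitrary) is to pin each queue to its own core:

{ taskset -c 0 ./hello < params1.txt && taskset -c 0 ./hello < params3.txt ; } &
{ taskset -c 1 ./hello < params2.txt && taskset -c 1 ./hello < params4.txt ; }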

Running parallel jobs in slurm

Try adding --exclusive to the srun command line:

srun --exclusive --ntasks=1 python FINAL_ARGPARSE_RUN.py --n_division 30 --start_num ${num} &

This will instruct srun to use a sub-allocation and work as you intended.

Note that the --exclusive option has a different meaning in this context than if used with sbatch.

Note also that the canonical way of doing this differs between Slurm versions, but using --exclusive should work across most of them.
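
Put together, a minimal sbatch script might look like the sketch below (the --ntasks=8 allocation and the range of --start_num values are assumptions, not taken from the question):

#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1

# One job step per start_num; --exclusive gives each step its own slice of
# the allocation, and wait blocks until every step has finished.
for num in $(seq 0 7); do
    srun --exclusive --ntasks=1 \
        python FINAL_ARGPARSE_RUN.py --n_division 30 --start_num ${num} &
done
wait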

Running shell script in parallel

Check out bash subshells; these can be used to run parts of a script in parallel.

I haven't tested this, but this could be a start:

#!/bin/bash
for i in $(seq 1 1000)
do
    # Example body: generate some random numbers, sort them, write to file$i.txt
    ( shuf -i 1-1000000 -n 1000 | sort -n > "file$i.txt" ) &
    if (( $i % 10 == 0 )); then wait; fi # Limit to 10 concurrent subshells.
done
wait
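
A variant of the same idea, assuming bash 4.3 or newer (for wait -n), keeps a rolling pool of at most 10 subshells instead of waiting for each batch of 10 to drain:

#!/bin/bash
for i in $(seq 1 1000)
do
    ( shuf -i 1-1000000 -n 1000 | sort -n > "file$i.txt" ) &
    # Block until one background job finishes whenever 10 are already running.
    while (( $(jobs -rp | wc -l) >= 10 )); do
        wait -n
    done
done
wait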

Bash: limit the number of concurrent jobs?

If you have GNU Parallel (http://www.gnu.org/software/parallel/) installed, you can do this:

parallel gzip ::: *.log

which will run one gzip per CPU core until all logfiles are gzipped.

If it is part of a larger loop you can use sem instead:

for i in *.log ; do
    echo "$i"            # do more stuff here for each file if you need to
    sem -j+0 gzip "$i" ";" echo done
done
sem --wait

It will do the same, but give you a chance to do more stuff for each file.
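
If one job per core is too many, sem also accepts an explicit limit; a small variant (the -j4 limit is just an example):

for i in *.log ; do
    sem -j4 gzip "$i"
done
sem --wait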

If GNU Parallel is not packaged for your distribution, you can install it simply by:

$ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
fetch -o - http://pi.dk/3 ) > install.sh
$ sha1sum install.sh | grep 883c667e01eed62f975ad28b6d50e22a
12345678 883c667e 01eed62f 975ad28b 6d50e22a
$ md5sum install.sh | grep cc21b4c943fd03e93ae1ae49e28573c0
cc21b4c9 43fd03e9 3ae1ae49 e28573c0
$ sha512sum install.sh | grep da012ec113b49a54e705f86d51e784ebced224fdf
79945d9d 250b42a4 2067bb00 99da012e c113b49a 54e705f8 6d51e784 ebced224
fdff3f52 ca588d64 e75f6033 61bd543f d631f592 2f87ceb2 ab034149 6df84a35
$ bash install.sh

It will download, check signature, and do a personal installation if it cannot install globally.

Watch the intro videos for GNU Parallel to learn more:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Run no more than 4 processes at a time in parallel in shell

Install the moreutils package in Ubuntu, then use the parallel utility:

parallel -j 4 ./sim -r -- 1 2 3 4 5 6 7 8 ...
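
For comparison, GNU Parallel (a different tool from the moreutils parallel, with different syntax) can impose the same 4-job limit; the ./sim invocation below is the same hypothetical one as above:

# {} is replaced by each argument in turn; the argument list is abbreviated here.
parallel -j4 ./sim -r {} ::: 1 2 3 4 5 6 7 8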

