What do the suffixes + and - after the job id of background jobs mean?
It's in the man page for jobs, under STDOUT:
> The character '+' identifies the job that would be used as a default for the fg or bg utilities; this job can also be specified using the job_id %+ or "%%". The character '-' identifies the job that would become the default if the current default job were to exit; this job can also be specified using the job_id %-.
So the job marked with '+' is the one that will be activated by 'fg'.
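You can see the markers directly in a shell; a minimal sketch (`set -m` enables job control, which is normally on only in interactive shells):

```shell
set -m          # enable job control in a non-interactive shell

sleep 30 &      # job [1]
sleep 30 &      # job [2] - the most recently backgrounded job gets the '+'

jobs            # [1] is marked '-', [2] is marked '+';
                # fg or bg with no argument would act on job [2]

kill %1 %2      # clean up
```

After the first job exits or is brought to the foreground, the `+` moves to the job that previously held the `-`.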
"+" and "-" output of the Jobs command [duplicate]
The character '+' identifies the job that would be used as default for the fg or bg utilities; this job can also be specified using the job_id %+ or "%%" . The character '-' identifies the job that would become the default if the current default job were to exit; this job can also be specified using the job_id %-.
From here: http://www.linuxquestions.org/questions/linux-software-2/output-of-jobs-command-563880/
How does Bash (and other shells) handle detaching STDIN when `bg`ing a stopped process?
My understanding:
- The shell is usually the session leader, which allocates the controlling terminal.
- Each job (which may contain multiple processes, e.g. `ls | wc -l`) is a process group.
- Only the foreground pgrp can read from the controlling terminal. (The foreground pgrp may contain multiple processes, and these processes CAN read from the controlling terminal at the same time.)
- The shell calls `tcsetpgrp()` to set the foreground pgrp (e.g. when we start a new job, or put a background job back into the foreground with `fg`).
- It's the kernel (the tty driver) that sends `SIGTTIN` to a background process which tries to read from the controlling terminal.
- The shell does not know when a process will read from the controlling terminal. Instead, the shell monitors the job's status changes. When a process is stopped by `SIGTTIN`, the shell receives `SIGCHLD` and can then call `waitpid()` to get more info.
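The last two points can be observed from the shell itself. A minimal sketch (using SIGSTOP as a stand-in for SIGTTIN, since a real SIGTTIN requires a background read on a controlling terminal): the shell learns about the stop only asynchronously, via SIGCHLD/waitpid(), and then updates its job table; `bg` then merely sends SIGCONT:

```shell
set -m              # enable job control

sleep 30 &          # start a background job
kill -STOP %1       # stop it (stand-in for the tty driver's SIGTTIN)
sleep 0.2           # give the shell time to reap the SIGCHLD
jobs                # the job is now reported as Stopped

bg %1               # bg just sends SIGCONT; stdin is not "detached" -
                    # the process simply isn't in the foreground pgrp
sleep 0.2
jobs                # reported as Running again
kill %1             # clean up
```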
Wait for all jobs of a user to finish before submitting subsequent jobs to a PBS cluster
Filling in the details of the solution suggested by Jonathan in the comments.
There are several resource managers based on the original Portable Batch System: OpenPBS, TORQUE and PBS Professional. These systems have diverged significantly and use different command syntax for newer features such as job arrays.
Job arrays are a convenient way to submit multiple similar jobs based on the same job script. Quoting from the manual:
Sometimes users will want to submit large numbers of jobs based on the
same job script. Rather than using a script to repeatedly call qsub, a
feature known as job arrays now exists to allow the creation of
multiple jobs with one qsub command.
To submit a job array, PBS provides the following syntax:
qsub -t 0-10,13,15 script.sh
This submits jobs with ids 0, 1, 2, ..., 10, 13 and 15.
Within the script, the variable `PBS_ARRAYID` carries the id of the job within the array and can be used to pick the necessary configuration.
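For example, a sketch of how `PBS_ARRAYID` might be used inside `script.sh` to pick a per-job parameter (the parameter list and the fallback default are made up for illustration; under PBS the variable is set automatically):

```shell
#!/bin/sh
# PBS sets PBS_ARRAYID for each member of the array;
# the default here is only so the sketch runs outside PBS
PBS_ARRAYID=${PBS_ARRAYID:-0}

# hypothetical list of per-job parameters, indexed by array id
set -- 0.1 0.5 1.0 2.0
shift "$PBS_ARRAYID"

echo "array job $PBS_ARRAYID runs with parameter $1"
```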
Job arrays have their own specific dependency options.
TORQUE
TORQUE is the resource manager that is probably used by the OP. It provides additional dependency options, shown in the following example:
$ qsub -t 1-1000 script.sh
1234[].pbsserver.domainname
$ qsub -t 1001-2000 -W depend=afterokarray:1234[] script.sh
1235[].pbsserver.domainname
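In a submission script you would capture qsub's output and strip the server suffix to build the depend string; a sketch (the job id string here is the example output above, hard-coded so the snippet runs without a PBS server):

```shell
# qsub prints e.g. "1234[].pbsserver.domainname"; the dependency wants "1234[]"
jobid='1234[].pbsserver.domainname'   # in a real script: jobid=$(qsub -t 1-1000 script.sh)
dep=${jobid%%.*}                      # strip everything after the first dot
echo "$dep"                           # -> 1234[]
# qsub -t 1001-2000 -W depend=afterokarray:"$dep" script.sh
```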
This will result in the following `qstat` output:
1234[] script.sh user 0 R queue
1235[] script.sh user 0 H queue
Tested on TORQUE version 3.0.4.
The full `afterokarray` syntax is in the `qsub(1)` manual.
PBS Professional
In PBS Professional, dependencies work uniformly on ordinary jobs and array jobs. Here is an example:
$ qsub -J 1-1000 -ry script.sh
1234[].pbsserver.domainname
$ qsub -J 1001-2000 -ry -W depend=afterok:1234[] script.sh
1235[].pbsserver.domainname
This will result in the following `qstat` output:
1234[] script.sh user 0 B queue
1235[] script.sh user 0 H queue
Update on TORQUE versions
Array dependencies became available in TORQUE in version 2.5.3. Job arrays in version 2.5 are not compatible with job arrays in versions 2.3 or 2.4. In particular, the `[]` syntax was introduced in version 2.5.
Update on using a delimiter job
For TORQUE versions prior to 2.5, a different solution may work, based on submitting dummy delimiter jobs between the batches of jobs to be separated.
It uses three dependency types: `on`, `beforeany`, and `after`.
Consider the following example:
$ DELIM=$(qsub -Wdepend=on:1000 dummy.sh)
$ qsub -Wdepend=beforeany:$DELIM script.sh
1001.pbsserver.domainname
... another 998 jobs ...
$ qsub -Wdepend=beforeany:$DELIM script.sh
2000.pbsserver.domainname
$ qsub -Wdepend=after:$DELIM script.sh
2001.pbsserver.domainname
...
This will result in a queue state like this:
1000 dummy.sh user 0 H queue
1001 script.sh user 0 R queue
...
2000 script.sh user 0 R queue
2001 script.sh user 0 H queue
...
That is, job #2001 will run only after the previous 1000 jobs terminate. The rudimentary job array facilities available in TORQUE 2.4 can probably be used as well to submit the script jobs.
This solution will also work for TORQUE version 2.5 and higher.