How to find from where a job is submitted in SLURM?
You can use the scontrol command to see the job details:
$ scontrol show job <jobid>
For example, for a running job on our SLURM cluster:
$ scontrol show job 1665191
JobId=1665191 Name=tasktest
...
Shared=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/lustre/work/.../slurm_test/task.submit
WorkDir=/lustre/work/.../slurm_test
You are looking for the last line, WorkDir.
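If you only want that one value rather than the whole record, something along these lines may help. It is a minimal sketch: extract_field is a hypothetical helper name (not a Slurm command), and it assumes the field you ask for starts its own line in the scontrol output, as WorkDir and Command do in the listing above.

```shell
#!/usr/bin/env bash
# extract_field KEY
# Reads `scontrol show job` output on stdin and prints the value of KEY.
# Hypothetical helper: it only matches a field that begins a line
# (after optional indentation), e.g. WorkDir= or Command=.
extract_field() {
    local key="$1"
    sed -n "s/^[[:space:]]*${key}=//p" | head -n 1
}

# Typical use on a machine where scontrol is available:
#   scontrol show job 1665191 | extract_field WorkDir
```

The function only parses text, so it can be tried on a saved copy of the scontrol output as well.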
How to get original location of script used for SLURM job?
You can get the initial (i.e. at submit time) location of the submission script from scontrol like this:
scontrol show job $SLURM_JOBID | awk -F= '/Command=/{print $2}'
So you can replace the realpath $0 part with the above. This will only work within a Slurm allocation, of course. If you want the script to work in any situation, you will need some logic like:
if [ -n "${SLURM_JOB_ID:-}" ] ; then
    THEPATH=$(scontrol show job "$SLURM_JOB_ID" | awk -F= '/Command=/{print $2}')
else
    THEPATH=$(realpath "$0")
fi
and then proceed with
SHARED_PATH=$(dirname "$(dirname "${THEPATH}")")
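One detail worth checking in a test like this is the quoting of the variable. With an unquoted, unset variable, `[ -n $VAR ]` collapses to `[ -n ]`, which is true, so the Slurm branch would be taken even outside an allocation. A small sketch that runs without Slurm; DEMO_VAR is just a stand-in name for illustration:

```shell
#!/usr/bin/env bash
# DEMO_VAR is a hypothetical stand-in for SLURM_JOB_ID outside a job.
unset DEMO_VAR

# Unquoted: the empty expansion disappears, leaving `[ -n ]`,
# a one-argument test that only checks that "-n" is a non-empty string.
if [ -n $DEMO_VAR ]; then unquoted=true; else unquoted=false; fi

# Quoted: the test receives an empty string and is false.
if [ -n "${DEMO_VAR:-}" ]; then quoted=true; else quoted=false; fi

echo "unquoted=$unquoted quoted=$quoted"
# prints: unquoted=true quoted=false
```

The `${VAR:-}` form also keeps the test safe if the script runs under `set -u`.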
How can I find out the command (batch script filename) of a finished SLURM job?
Indeed, Slurm does not store the command in the accounting database. Two workarounds:
For a single user: use the JobName or Comment to store the script name upon submission. These are stored in the database, but this approach is error-prone;
Cluster-wide: enable the job completion plugin for Elasticsearch, as this stores not only the script name but the whole contents of the script as well.
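The single-user workaround could look roughly like the following. This is a sketch under assumptions: submit_with_name and the SBATCH override variable are hypothetical names introduced here (the override just allows a dry run with echo), and whether the comment actually ends up in the accounting database depends on how your site has configured accounting.

```shell
#!/usr/bin/env bash
# submit_with_name SCRIPT [extra sbatch options...]
# Hypothetical wrapper: records the script's file name as the job name
# and its absolute path as the job comment at submission time.
submit_with_name() {
    local script="$1"
    shift
    local abs
    abs=$(realpath "$script")
    # SBATCH is an assumed override for dry runs, e.g. SBATCH=echo.
    ${SBATCH:-sbatch} --job-name="$(basename "$script")" --comment="$abs" "$@" "$script"
}

# After the job has finished, the stored values can be queried with e.g.:
#   sacct -j <jobid> --format=JobID,JobName%40,Comment%80
```

Running it as `SBATCH=echo submit_with_name myjob.sh` prints the sbatch arguments instead of submitting, which is a cheap way to check what would be recorded.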
Slurm job, knowing what node it is on
A simple, effective, and commonly used way to record in the job output which node the job ran on is to add
srun hostname
to the job script. The job ID is also available from within the job script through the environment variable SLURM_JOB_ID, so you can use
sstat -j $SLURM_JOB_ID
in your Slurm script to get the information you want.
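If you prefer a single line in the job output, a small helper such as the one below could be placed near the top of the job script. It is only a sketch: log_job_location is a hypothetical name, and it relies on the SLURM_JOB_ID and SLURM_JOB_NODELIST variables that Slurm sets inside a job, falling back to "none" elsewhere.

```shell
#!/usr/bin/env bash
# log_job_location
# Hypothetical helper: prints the job ID, the host this shell runs on,
# and the allocated node list, using "none" where Slurm has set nothing.
log_job_location() {
    local node
    node=$(hostname 2>/dev/null || uname -n)
    printf 'job=%s node=%s nodelist=%s\n' \
        "${SLURM_JOB_ID:-none}" "$node" "${SLURM_JOB_NODELIST:-none}"
}

log_job_location
```

Note that a plain hostname in a batch script reports the node running the script itself, whereas srun hostname runs once per task and so can list several nodes.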
Do submitted jobs take a copy of the source? Queued jobs?
The sbatch command creates a copy of the submission script and a snapshot of the environment and saves them in the directory listed as the StateSaveLocation configuration parameter. The script can therefore be changed after submission without any effect on the job.
But that is not the case for the files used in the submission script. If your submission script starts an executable, it will see the "version" of the executable at the time it starts.
Modifying the program before it starts will lead to the new version being run; modifying it during the run (i.e. once it has already been read from disk and loaded into memory) will lead to the old version being run.
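The difference can be imitated locally without Slurm. The sketch below is only a simulation: a plain cp stands in for the copy that sbatch keeps of the submission script, and all file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Local simulation, no Slurm involved; every file name here is hypothetical.
workdir=$(mktemp -d)

# A helper program and a "job script" that calls it by path.
printf 'echo "helper v1"\n' > "$workdir/helper.sh"
printf 'sh "%s/helper.sh"\n' "$workdir" > "$workdir/job.sh"

# Stand-in for submission: a private copy of the job script is kept.
cp "$workdir/job.sh" "$workdir/job.snapshot"

# Both files are edited while the "job" is still waiting in the queue.
printf 'echo "helper v2"\n' > "$workdir/helper.sh"
printf 'echo "edited job script"\n' > "$workdir/job.sh"

# The "job" starts: the snapshot of the script runs, but the helper is
# looked up on disk only now, so its new version is what gets executed.
result=$(sh "$workdir/job.snapshot")
echo "$result"
# prints: helper v2

rm -rf "$workdir"
```

The edit to job.sh has no effect on the run, while the edit to helper.sh does, which mirrors the behaviour described above.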
How can I get detailed job run info from SLURM (e.g. like that produced for standard output by LSF)?
At the end of each job script I usually insert
sstat -j $SLURM_JOB_ID.batch --format=JobID,MaxVMSize
to add memory usage (MaxVMSize, the maximum virtual memory size) to the standard output.
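If the same script is sometimes run outside Slurm, the call can be guarded so that it does not fail there. A minimal sketch, in which report_usage is a hypothetical function name and MaxRSS is added next to MaxVMSize on the assumption that resident memory is also of interest:

```shell
#!/usr/bin/env bash
# report_usage
# Hypothetical helper for the end of a job script: prints memory figures
# for the batch step when inside a Slurm job, and a note otherwise.
report_usage() {
    if [ -n "${SLURM_JOB_ID:-}" ] && command -v sstat >/dev/null 2>&1; then
        sstat -j "${SLURM_JOB_ID}.batch" --format=JobID,MaxRSS,MaxVMSize
    else
        echo "not inside a Slurm job; skipping usage report"
    fi
}

report_usage
```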