Submit a PBS job: qsub

When you are new to PBS, the place to start is the qsub command, which submits jobs to your HPC system. qsub only accepts scripts, so you'll need to package your tasks appropriately. Here is a simple example script (pbs_test.job):

#!/bin/bash
#PBS -N PBS_TEST
#PBS -o /data/PBS_TEST.out
#PBS -e /data/PBS_TEST.err
#PBS -q workq
#PBS -l mem=50mb
/usr/bin/hostname
/bin/sleep 10
The first line specifies the shell used to interpret the script, while the lines starting with #PBS are directives that are passed to PBS. The first directive names the job, the next two specify where standard output and standard error go, the next selects the queue to use, and the last requests a resource the job will need, in this case 50 MB of memory. The lines that follow the directives make up the actual job: print the hostname of the server running the job (which in this environment is slurmctld) and sleep for 10 seconds.
Once you have created the batch script for your job, the qsub command is used to submit it. We have kept the job submission script in a shared folder called /data. Let's submit the job and check its status with the qstat command:
qsub /data/pbs_test.job; qstat
Let's now look at the output:
cat /data/PBS_TEST.out
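Once the job has finished (it sleeps for 10 seconds before exiting), the output file should contain whatever the job wrote to standard output; in this environment that is the hostname noted above:

slurmctld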
qsub options

Some of the more commonly used qsub options are:

-q queue : Select the queue to run the job in. The queues you can use are listed by running qstat.

-l walltime=??:??:?? : The wall clock time limit for the job. Time is expressed in seconds as an integer, or in the form [[hours:]minutes:]seconds[.milliseconds].

-l vmem=???MB : The total (virtual) memory limit for the job. It can be specified with units of "MB" or "GB", but only integer values can be given. The default is small; since your job will only run when there is sufficient free memory, making a sensible memory request will allow your jobs to run sooner. A little trial and error may be required to find out how much memory your jobs are actually using.

-l ncpus=? : The number of CPUs required for the job to run on.

-l ncpus=N : If the number of CPUs requested, N, is small, the job will run within a single shared-memory node. If the number of CPUs specified is greater, the job will be distributed over multiple nodes.

-l ncpus=N:M : This form requests a total of N CPUs with (a multiple of) M CPUs per node.

Note that -l options may be combined as a comma-separated list with no spaces, e.g. -l vmem=512mb,walltime=10:00:00, as in the example below.
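For instance, assuming the same pbs_test.job script from above, a submission combining several of these options on the command line might look like this (the resource values are purely illustrative):

qsub -q workq -l vmem=512mb,walltime=10:00:00 /data/pbs_test.job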
The newer PBS (PBS Pro) introduces the concept of resource chunking via a select parameter. Let's look at another job submission script example:
#!/bin/bash
#PBS -P PBS_TEST
#PBS -q workq
#PBS -l select=1:ncpus=1:mpiprocs=1
#PBS -l place=scatter:excl
#PBS -l walltime=00:01:00
#PBS -o /data/PBS_TEST.out
#PBS -e /data/PBS_TEST.err
/usr/bin/hostname
Here, the line -l select=1:ncpus=1:mpiprocs=1 describes the processors required for the MPI job: select specifies the number of chunks (nodes) of resources required; ncpus indicates the number of CPUs required per chunk; and mpiprocs represents the number of MPI processes to run on the CPUs selected (normally ncpus would equal mpiprocs).
As this is not the most intuitive syntax, the following table is provided as guidance on how it works:
select | ncpus | mpiprocs | Description |
2 | 16 | 16 | 32 Processor job, using 2 nodes and 16 processors per node |
4 | 8 | 8 | 32 Processor job, using 4 nodes and 8 processors per node |
16 | 1 | 1 | 16 Processor job, using 16 nodes running 1 mpi process per processor and utilising 1 processor per node |
8 | 16 | 16 | 128 Processor job, using 8 nodes and 16 processors per node (each running an mpi process) |
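As an example, the 32-processor job in the first row of the table would be requested with a directive like the following (rest of the script omitted):

#PBS -l select=2:ncpus=16:mpiprocs=16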
PBS interactive jobs
Use of PBS is not limited to batch jobs. It also allows users to work on the compute nodes interactively when needed. For example, users can work with the development environments provided by MATLAB or R on the compute nodes and run their jobs there (until the walltime expires).
Instead of preparing a submission script, users pass the job requirements directly to the qsub command:
qsub -I -X -q workq -l nodes=7:ppn=4,walltime=15:00:00,mem=2gb
Here, -I ('I' as in India) stands for 'interactive' and -X allows for GUI applications. The PBS scheduler will allocate 7*4=28 cores to the user as soon as nodes with the given specifications become available, then automatically log the user into one of the compute nodes. From then on, the user can work interactively using these cores until the walltime expires. Note that there should be no spaces between the parameters passed to the -l ('L' as in Lima) flag, only commas!
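Note that nodes=7:ppn=4 is the older (Torque-style) way of requesting resources. On a PBS Pro system that uses the select syntax covered earlier, a roughly equivalent interactive request could be sketched as follows (illustrative only; here mem is requested per chunk rather than for the whole job):

qsub -I -X -q workq -l select=7:ncpus=4:mem=2gb -l walltime=15:00:00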