How to use Torque/PBS

How the queue structure works

There are 4 queues. The default is feed, which is a routing queue. The feed queue allocates each job to one of the other queues based on the resources requested.

The other 3 queues are called short, long and verylong.
Jobs are allocated to a queue on the basis of expected CPU time (cput) or wall clock time (walltime). The feed queue checks short, long and verylong in that order, and a job goes to the first queue that will accept it based on the resources requested. Each queue also has a maximum number of jobs that may run simultaneously. These limits are currently 120 for short, 40 for long and 10 for verylong. They will be adjusted from time to time to accommodate the mix of jobs being submitted.

  • The short queue has a maximum cput of 2 hours and a maximum walltime of 4 hours. It also has a default cput of 1 hour and a default walltime of 4 hours.
  • The long queue has a maximum cput of 30 hours and a maximum walltime of 50 hours. The defaults are 24 hours cput and 50 hours walltime.
  • The verylong queue is different: it has a minimum cput and walltime of 24 hours and 1 second and a default walltime of 288 hours (12 days).

All the queues have a default nodes value of 1, i.e. each job will use only one node. If you need more than one node, refer to the MPI section.
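
The routing order and limits described above can be sketched as a small shell function. This is only an illustrative model, not part of Torque: the name route_queue is hypothetical, times are given in whole hours, 0 stands for "not requested", and verylong is treated as the fallback queue without checking its minimums.

```shell
#!/bin/sh
# Hypothetical model of how the feed queue picks a destination.
# Usage: route_queue CPUT_HOURS WALLTIME_HOURS   (0 = not requested)
route_queue() {
    cput=$1
    walltime=$2
    # short accepts up to 2 hours cput and 4 hours walltime
    if [ "$cput" -le 2 ] && [ "$walltime" -le 4 ]; then
        echo short
    # long accepts up to 30 hours cput and 50 hours walltime
    elif [ "$cput" -le 30 ] && [ "$walltime" -le 50 ]; then
        echo long
    # assumption: anything larger falls through to verylong
    else
        echo verylong
    fi
}

route_queue 1 2      # prints: short
route_queue 10 0     # prints: long
route_queue 0 100    # prints: verylong
```

The queues are tested in the same order that feed checks them, so a job always lands in the smallest queue whose limits it fits.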

Examples

The following cases show what happens when you submit a job:

You submit a job with no request for resources. It will enter the feed queue, where it will be routed to the short queue. The short queue will then apply its defaults of cput = 1 hour, walltime = 4 hours and nodes = 1. The job will then be allocated to a node and run until the job finishes OR it reaches 1 hour of CPU time OR it reaches 4 hours of real time.

You submit a job with a request for 3 hours walltime. It will enter the feed queue, where it will be routed to the short queue. The short queue will then apply its defaults of cput = 1 hour and nodes = 1. The job will retain its requested 3 hours walltime. The job will then run until the job finishes OR it reaches 1 hour of CPU time OR it reaches 3 hours of real time.

You submit a job with a request for 3 hours cput. It will enter the feed queue, where it will be routed to the long queue. The long queue will then apply its defaults of walltime = 50 hours and nodes = 1. The job will retain its requested 3 hours cput. The job will then run until the job finishes OR it reaches 3 hours of CPU time OR it reaches 50 hours of real time.

You submit a job with a request for 3 hours cput and 200 hours walltime. It will enter the feed queue, where it will be routed to the verylong queue. The verylong queue will then apply its default of nodes = 1. The job will retain its requested 3 hours cput and 200 hours walltime. The job will then run until the job finishes OR it reaches 3 hours of CPU time OR it reaches 200 hours of real time.
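
The four cases above correspond to the qsub commands sketched below. The job script name script.sh is a placeholder, and the QSUB variable is an assumption added here so the snippet is safe to try: by default it only prints each command, and setting QSUB=qsub in the environment submits the jobs for real.

```shell
#!/bin/sh
# Dry run by default: print each qsub command instead of submitting it.
# Set QSUB=qsub in the environment to actually submit.
QSUB=${QSUB:-echo qsub}

# No resources requested: routed to short, all defaults applied.
$QSUB script.sh

# 3 hours walltime: routed to short, default cput of 1 hour applied.
$QSUB -l walltime=03:00:00 script.sh

# 3 hours cput: routed to long, default walltime of 50 hours applied.
$QSUB -l cput=03:00:00 script.sh

# 3 hours cput and 200 hours walltime: routed to verylong.
$QSUB -l cput=03:00:00,walltime=200:00:00 script.sh
```

Times are written as HH:MM:SS, and the hours field may exceed 24.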

How to request resources

On the command line:

qsub -l nodes=1,walltime=00:05:00,cput=00:01:00 script.sh

This command submits the job script.sh with a request for 1 node, 1 minute of CPU time and 5 minutes of real time.

Alternatively, you can put the request in the script itself, as #PBS comment lines at the beginning of the script:

#!/bin/sh
#PBS -l walltime=00:05:00
#PBS -l cput=00:01:00
#PBS -l nodes=1
#PBS -m abe
...

These requests are passed to PBS when it accepts the job. Any qsub option can be given in this way, as the mail option -m abe above shows (mail is sent when the job aborts, begins and ends).
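
Because the #PBS lines are ordinary shell comments, a job script can also be run by hand to test it before submitting. The following is a minimal sketch of a complete script, assuming the standard Torque environment variables PBS_O_WORKDIR and PBS_JOBID are set when the job runs under PBS. The job name example_job and the fallback values are placeholders chosen for illustration.

```shell
#!/bin/sh
#PBS -l walltime=00:05:00
#PBS -l cput=00:01:00
#PBS -l nodes=1
#PBS -N example_job

# Under PBS, PBS_O_WORKDIR holds the directory qsub was run from.
# Fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# PBS_JOBID is set by PBS; use a placeholder outside the batch system.
job_id=${PBS_JOBID:-not-under-pbs}
echo "job $job_id running on $(uname -n) in $(pwd)"
```

Running the script directly with sh ignores the #PBS lines, so the resource limits only take effect when it is submitted with qsub.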

More information