How to run a job

From hpcwiki

Revision as of 13:22, 15 September 2016

First steps

To run a parallel job on the worker nodes, you have to prepare a job script. This script tells the queue manager what you want to do, and it has to be submitted with the qsub command. A typical job script looks like this:

#
#PBS -l nodes=1:ppn=1
#
sleep 120
echo "Hello world!"

This job does nothing for two minutes and then prints "Hello world!". The line starting with #PBS is a directive for Torque (the resource manager). In this example it tells the queue manager to use one CPU (ppn=1) on one node (nodes=1) for this job. A script can contain more than one directive, but they should always be placed at the start of the script. The remaining lines in this example are just commands that you could also type on the command line.
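As a sketch of a script with more than one directive, the example below adds a job name, a wall-time limit, and merged output to the resource request. The options -N, -l walltime and -j oe are standard Torque options, but the values chosen here (the name hello, a five-minute limit) are illustrative assumptions; check the limits that apply on your cluster. PBS_O_WORKDIR is set by Torque to the directory from which the job was submitted; the fallback to the current directory is only there so the script also runs outside the queue.

```shell
#!/bin/bash
# All directives are grouped at the top, before the first command.
#PBS -N hello
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:05:00
#PBS -j oe

# Go to the directory the job was submitted from
# (falls back to the current directory when run outside Torque).
cd "${PBS_O_WORKDIR:-.}"

message="Hello world!"
echo "$message"
```

To the shell, the #PBS lines are ordinary comments, so the same file can be tested on the command line before it is submitted.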

Assuming that the name of the script is job1, it can be submitted like this:

qsub job1

The qsub command responds with a job id, which looks like this: 24.hpc10.hpc. The part before the first dot is a system-wide unique number that is increased by 1 for every new job that is submitted.
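If you need the numeric part of the job id in a script, it can be cut off with shell parameter expansion. This is a minimal sketch: the id is hard-coded to the sample value from the text so that the snippet runs anywhere, and the variable names jobid and jobnum are made up for illustration. On the cluster you would capture the real id with jobid=$(qsub job1) instead.

```shell
#!/bin/bash
# Sample job id as printed by qsub; on the cluster use: jobid=$(qsub job1)
jobid="24.hpc10.hpc"

# Remove everything from the first dot onwards, leaving the job number.
jobnum="${jobid%%.*}"

echo "Job number: $jobnum"
```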

You can check whether your job is running with the qstat command. Run by itself, this command lists all jobs in the queue; the job you just submitted will probably be one of the last in the list.
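On a busy system the full list can be long. One way to narrow it down, sketched here under the assumption that you only care about your own jobs, is the -u option of qstat, which restricts the list to the given user. The check for qstat and the status variable are only there so the snippet does not fail on a machine without Torque.

```shell
#!/bin/bash
# Determine the current user name (falls back to id if USER is unset).
me="${USER:-$(id -un)}"

if command -v qstat >/dev/null 2>&1; then
    # Show only the jobs that belong to the current user.
    qstat -u "$me"
    status="checked"
else
    echo "qstat not found; run this on the cluster login node." >&2
    status="skipped"
fi
```

To look at a single job, you can also pass its job id to qstat, for example qstat 24.hpc10.hpc.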