Getting Started in the Terminal
Logging in
These instructions cover accessing Tempest and submitting jobs through the Terminal.
If you would prefer to use a GUI, follow the instructions for getting started online.
To log in to your Tempest account through the Terminal, connect to the MSU VPN, then use the following command:
ssh <net_id>@tempest-login.msu.montana.edu
If this is your first time using SSH to access Tempest, follow these instructions to set up an SSH key.
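The linked instructions are the authoritative reference for key setup. As a minimal sketch only, assuming an OpenSSH client on your local machine, key generation and copying typically look like the commands below; some setups instead require adding the public key through a web portal, so defer to the linked steps if they differ:
ssh-keygen -t ed25519                                 # generate a key pair; accept the default path and set a passphrase
ssh-copy-id <net_id>@tempest-login.msu.montana.edu    # append your public key to ~/.ssh/authorized_keys on Tempest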
Submitting sbatch scripts
Tempest uses a cluster management platform called Slurm for managing jobs.
All significant computational workloads must be submitted as jobs. Jobs can be submitted using sbatch scripts, which specify the computational resources and software a job will use.
To submit a job, use the sbatch command. For example, to run the job defined in the example.sbatch script in ~/slurm-examples/, you would use the following command:
sbatch slurm-examples/example.sbatch
To check the status of your jobs:
sacct
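For example, to list only your queued and running jobs, or to see accounting details for a specific job, the standard Slurm commands below should work (<job_id> is the ID printed when you ran sbatch):
squeue -u <net_id>    # jobs that are still pending or running
sacct -j <job_id> --format=JobID,JobName,Partition,State,Elapsed,MaxRSS    # details for one finished or running job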
Below are template submission scripts that can be modified to submit a variety of jobs to CPU and GPU resources:
CPU sbatch scripts
For submission to priority resources:
#!/bin/bash
##
## example-array.slurm.sh: submit an array of jobs with a varying parameter
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --account=priority-<group_name> #specify the account to use
#SBATCH --job-name=<your_job_name> # job name
#SBATCH --partition=priority # queue partition to run the job in
#SBATCH --nodes=1 # number of nodes to allocate
#SBATCH --ntasks-per-node=1 # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=2 # number of cores to allocate
#SBATCH --mem=2G # 2 GB of memory allocated; set --mem with care
#SBATCH --time=0-00:00:01 # Maximum job run time (d-hh:mm:ss)
##SBATCH --array=1-3 # Number of jobs in array
#SBATCH --output=<your_job_name>-%j.out
#SBATCH --error=<your_job_name>-%j.err
## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
date # print out the date
hostname -s # print a message from the compute node
date # print the date again
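If you uncomment the --array line above, Slurm runs one task per index and sets SLURM_ARRAY_TASK_ID in each task. A minimal sketch of using it to vary an input, assuming a hypothetical program my_analysis and input files named input_1.txt through input_3.txt (for array jobs, also consider --output=<your_job_name>-%A_%a.out so each task writes its own output file):
echo "Running array task ${SLURM_ARRAY_TASK_ID}"    # 1, 2, or 3 for --array=1-3
./my_analysis input_${SLURM_ARRAY_TASK_ID}.txt      # hypothetical program and input files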
For submission to the unsafe partition:
#!/bin/bash
##
## example-array.slurm.sh: submit an array of jobs with a varying parameter
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name> # job name
#SBATCH --partition=unsafe # queue partition to run the job in
#SBATCH --nodes=1 # number of nodes to allocate
#SBATCH --ntasks-per-node=1 # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=2 # number of cores to allocate
#SBATCH --mem=2G # 2 GB of memory allocated; set --mem with care
#SBATCH --time=0-00:00:01 # Maximum job run time (d-hh:mm:ss)
##SBATCH --array=1-3 # Number of jobs in array
#SBATCH --output=<your_job_name>-%j.out
#SBATCH --error=<your_job_name>-%j.err
## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
date # print out the date
hostname -s # print a message from the compute node
date # print the date again
For submission to the nextgen partition:
#!/bin/bash
##
## example-array.slurm.sh: submit an array of jobs with a varying parameter
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name> # job name
#SBATCH --partition=nextgen # use nextgen-long for jobs greater than 3 days
#SBATCH --nodes=1 # number of nodes to allocate
#SBATCH --ntasks-per-node=1 # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=2 # number of cores to allocate
#SBATCH --mem=2G # 2 GB of memory allocated; set --mem with care
#SBATCH --time=0-00:00:01 # Maximum job run time (d-hh:mm:ss)
##SBATCH --array=1-3 # Number of jobs in array
#SBATCH --output=<your_job_name>-%j.out
#SBATCH --error=<your_job_name>-%j.err
## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
date # print out the date
hostname -s # print a message from the compute node
date # print the date again
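The #SBATCH lines in these templates are defaults baked into the script; most can also be overridden on the sbatch command line at submission time without editing the file (command-line options take precedence over #SBATCH directives). For example, with illustrative values:
sbatch --cpus-per-task=4 --time=0-01:00:00 slurm-examples/example.sbatch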
GPU sbatch scripts
For submission to the priority GPU partition:
#!/bin/bash
##
## gpuexample.sbatch: submit a job using a GPU
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --account=priority-<group_name> # priority account to use
#SBATCH --job-name=<your_job_name> # job name
#SBATCH --partition=gpupriority # queue partition to run the job in
#SBATCH --nodes=1 # number of nodes to allocate
#SBATCH --ntasks-per-node=1 # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=8 # number of cores to allocate - do not allocate more than 16 cores per GPU
#SBATCH --gpus-per-task=1 # number of GPUs to allocate - all GPUs are currently A40 model
#SBATCH --mem=2000 # 2000 MB of memory allocated - do not allocate more than 128000 MB of memory per GPU
#SBATCH --time=1-00:10:00 # Maximum job run time (d-hh:mm:ss)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err # standard error file
## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
module load CUDA/11.1.1-GCC-10.2.0
echo "You are using CUDA version: "
nvcc --version
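Before running your own GPU software, it can help to confirm that a GPU was actually allocated. A minimal check that could replace the placeholder commands above (nvidia-smi is the standard NVIDIA utility; the module name is copied from the template and may differ from what is currently installed):
module load CUDA/11.1.1-GCC-10.2.0    # CUDA toolchain module from the template above
nvidia-smi                            # list the GPU(s) assigned to this job
nvcc --version                        # confirm the CUDA compiler version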
For submission to the unsafe GPU partition:
#!/bin/bash
##
## gpuexample.sbatch: submit a job using a GPU
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name> # job name
#SBATCH --partition=gpuunsafe # queue partition to run the job in
#SBATCH --nodes=1 # number of nodes to allocate
#SBATCH --ntasks-per-node=1 # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=8 # number of cores to allocate - do not allocate more than 16 cores per GPU
#SBATCH --gpus-per-task=1 # number of GPUs to allocate - all GPUs are currently A40 model
#SBATCH --mem=2000 # 2000 MB of memory allocated - do not allocate more than 128000 MB of memory per GPU
#SBATCH --time=1-00:10:00 # Maximum job run time (d-hh:mm:ss)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err # standard error file
## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
module load CUDA/11.1.1-GCC-10.2.0
echo "You are using CUDA version: "
nvcc --version
For submission to the nextgen GPU partition:
#!/bin/bash
##
## gpuexample.sbatch: submit a job using a GPU
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
#SBATCH --job-name=<your_job_name> # job name
#SBATCH --partition=nextgen-gpu # use nextgen-gpu-long for jobs greater than 3 days
#SBATCH --nodes=1 # number of nodes to allocate
#SBATCH --ntasks-per-node=1 # number of discrete tasks - keep at one except for MPI
#SBATCH --cpus-per-task=8 # number of cores to allocate - do not allocate more than 16 cores per GPU
#SBATCH --gpus-per-task=1 # number of GPUs to allocate - all GPUs are currently A40 model
#SBATCH --mem=2000 # 2000 MB of memory allocated - do not allocate more than 128000 MB of memory per GPU
#SBATCH --time=1-00:10:00 # Maximum job run time (d-hh:mm:ss)
#SBATCH --output=<your_job_name>-%j.out # standard output file (%j = jobid)
#SBATCH --error=<your_job_name>-%j.err # standard error file
## Run 'man sbatch' for more information on the options above.
### Replace the below with modules and commands
module load CUDA/11.1.1-GCC-10.2.0
echo "You are using CUDA version: "
nvcc --version
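If a job was submitted by mistake or is no longer needed, it can be cancelled with the standard Slurm scancel command (the job ID is the one reported by sbatch and shown by squeue or sacct):
scancel <job_id>      # cancel a single job
scancel -u <net_id>   # cancel all of your jobs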