Cluster Documentation

Introduction

The I2S Research Cluster is located in the ACF and provides HPC resources to members of the center. The cluster uses the Slurm workload manager, an open-source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters. The cluster is composed of a variety of hardware types, with core counts ranging from 16 to 32 cores per node. In addition, there is specialized hardware, including Nvidia graphics cards for GPU computing, InfiniBand for low-latency/high-throughput parallel computing, and large-memory systems with up to 512 GB of RAM.

Getting Help

You must have an ITTC/I2S Account and be part of the cluster group. If you have any questions about the I2S Research Cluster, feel free to email clusterhelp@ittc.ku.edu for assistance.

Connecting

Open OnDemand

For new users, we recommend visiting https://ondemand.ittc.ku.edu and logging in with your I2S credentials. Open OnDemand is a project from Ohio State University that provides a user-friendly interface to an HPC cluster. You can upload/download files, submit batch jobs, view running jobs, view cluster state, and access Interactive Sessions on compute nodes, all in your web browser.

For more information on Interactive Sessions, scroll down to #Open_OnDemand_Desktop. For more information on the Job Composer, scroll down to #Submitting_Batch_Jobs_from_OnDemand.

SSH

This connection method gives you a command-line interface in a terminal window that accepts only keyboard input. It is the simplest way to use the cluster if you are comfortable with the command line.

From Linux or macOS, you can simply run this in a terminal window:

ssh <your i2s username here>@front1.ittc.ku.edu

On Windows, you can either use PuTTY or enter the command above in Windows Terminal.

The example above uses the front1 front end, but we recommend also using front2 to balance usage across the two front ends.

You'll be left with a shell on the front-end node. Do not run your computationally expensive tasks directly on these nodes, since they are shared by all cluster users. Instead, create an allocation on a compute node, as detailed in the Job Submission Guide.

Available front-end node hostnames

front1.ittc.ku.edu
front2.ittc.ku.edu

Campus

If you are connecting from any of the University of Kansas campuses, you may connect to either host directly.

Off-Campus

If you wish to connect to any of the I2S clusters from off campus, you must connect to KU Anywhere (VPN) before attempting to connect to the cluster. More information is available at KU Anywhere.

Job Submission Guide

Submitting Batch Jobs from OnDemand

Click on the "Jobs" tab, then click on "Job Composer." This should take you to the Job Composer page. To create a new job, click on "New Job," then click on "From Default Template." This will give you a simple batch job which you can extend. You can edit this job using the editor. Once you are ready to launch it, click "Submit" and the job will be queued to run.

After the job runs, you should see a Slurm output file in the Folder Contents of the job. You can click on it to open the editor and view your job's output.

PBS/Torque and Slurm

A translation of common PBS/Torque commands to Slurm commands can be found here. This provides a quick guide for those who are familiar with PBS/Torque but new to the Slurm scheduler.

Submitting Jobs

To submit jobs to the cluster, you can either write a script and submit it using sbatch:

[username@front1 ~]$ sbatch script.sh

Or, you can submit jobs interactively from the command line using srun:

[username@front1 ~]$ srun echo Hello World!

Note: Once you have a job running on a node, you can SSH directly to it from either of the front nodes. There is no need to start multiple jobs to get multiple shells on a particular node.

Job scripts use parameters (denoted by #SBATCH) in the script file to request job resources, while interactive jobs request resources with command line parameters. When no resources are requested, a default set is automatically allocated for the job.

This default resource set includes:
  • The job's name is set to the script file name or, if the job was started with srun, to the first command (in the example above, the name would be 'echo').
  • The job is scheduled in the default intel queue.
  • The job is allocated 1 core on 1 node with 2GB of memory.
  • The job is allocated 1 day to run.
  • The job redirects stdout and stderr to the same output file if the job is submitted with sbatch. If srun is used, then both are printed to the screen.
  • The job's output file name takes the form "slurm-<jobid>.out" and is created in the same directory as the job script.
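
If you want to confirm what the scheduler actually allocated to a job (for example, to check these defaults), you can inspect the job with scontrol after submission. A minimal sketch, assuming a job id of 12345; the grep pattern is only illustrative and simply picks out a few relevant fields:

[username@front1 ~]$ scontrol show job 12345 | grep -E "JobName|Partition|NumNodes|NumCPUs|TimeLimit|mem"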

srun

srun can be used to run any single task on a cluster node, but it is most useful for launching interactive GUI or bash sessions. Here is an srun example run on front1:

[username@front1 ~]$ srun -p intel -N 1 -n 1 -c 4 --mem 4G --pty /bin/bash
[username@n097 ~]$ 
The options used in this example are all detailed below:
-p
Specifies a partition, or queue to create the job in. The current cluster partitions available are intel, amd, bigm, and gpu. For more information on the cluster queues, see the partitions section below.
-N
This sets the number of requested nodes for the interactive session.
-n
Specifies the total number of tasks (processes) to run in the job.
-c
Sets the number of requested CPUs per task.
--mem
This specifies the requested memory per node. Memory amounts can be given in Kilobytes (K), Megabytes (M), and Gigabytes (G).
--pty
This option puts the srun session in pseudo-terminal mode. It is recommended to use this option if you are running an interactive shell session.
/bin/bash
The last option in an srun invocation is the program that srun will execute on the requested node. In this case, bash is specified to start an interactive shell session.

srun is used to submit both interactive and non-interactive jobs. When it is run directly on the command line as shown above, an interactive session is started on a cluster node. When it is used in a job submission script, it starts a non-interactive session.

sbatch

sbatch is used to submit jobs to the cluster using a script file. Below is an example job submission script:

#!/bin/bash
#SBATCH -p intel
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 1
#SBATCH --mem=1GB
#SBATCH -t 00:20:00 
#SBATCH -J test_job
#SBATCH -o slurm-%j.out
 
echo "Job ${SLURM_JOB_ID} ran on ${HOSTNAME}"

Example output:

[username@front1 ~]$ sbatch test_job.sh
[username@front1 ~]$ cat slurm-47491.out
Job 47491 ran on n097
[username@front1 ~]$

This script requests one node with one core, and 1GB of memory. -J is used to specify the job name that appears in the job queue, while -o specifies the log file name for the job. %j in the job output file name is replaced with the Slurm job id when the scheduler processes the script. The variable SLURM_JOB_ID used in the example output is an environment variable set by the Slurm scheduler for each job.

To run this example script, copy its contents into a file in your home directory (test_job.sh for example). Log in to either front1.ittc.ku.edu or front2.ittc.ku.edu with your ITTC credentials, and run the command sbatch test_job.sh. The job output log will be saved in the same directory as the job submission script, and should contain similar output to the example above.

sbatch job scripts can run programs directly, as shown above, but it is also possible to use srun within job submission scripts to run programs. Using srun in a job script allows for fine-grained resource control over parallel tasks. An example is shown below:

#!/bin/bash
#SBATCH -p intel
#SBATCH -N 1
#SBATCH -n 2
#SBATCH -c 1
#SBATCH --mem=2GB
#SBATCH -t 00:20:00 
#SBATCH -J test_job
#SBATCH -o slurm-%j.out

srun -n 1 --mem=1G echo "Task 1 ran" &
srun -n 1 --mem=1G echo "Task 2 ran" &

wait

When the sbatch script is submitted, both srun invocations will run at the same time, splitting the resources requested at the top of the script file. This method is useful for launching a small number of related jobs at once from the same script, but does not scale well with a large number of jobs. The Job Array section below goes into more depth on running large numbers of parallel jobs on the cluster.

When using srun within a job submission script, you need to specify what portion of the resources each srun invocation is allocated. If more resources are requested by srun than are made available by the #SBATCH parameters, then some jobs may wait to run, or attempt to share resources with already running jobs. In the example above, two tasks and 2GB of memory are requested. In the srun commands below the resource request, we specify how much memory and how many tasks are allocated to each job.

The sbatch options shown in these example scripts are just the tip of the iceberg in terms of what is available. For the full listing of sbatch parameters, see the official Slurm sbatch documentation.

Here is a brief list of other common options that may be useful; a short example combining a few of them follows the list:
-C
Specifies a node constraint. This can be used to specify CPU architecture and instruction set.
-D
Specifies the path to the log file destination directory. This can be an absolute path, or a relative path from the job submission script directory.
--gres
Used to request GPU resources. See this example for more information on running GPU jobs.
--cores-per-socket
Sets the requested number of cores per CPU socket.
--mem-per-cpu
This specifies the memory allocated to each CPU in the job. It has the same memory specification syntax as --mem.
--mail-type
Sets when the user is to be mailed job notifications. NONE, BEGIN, END, FAIL, REQUEUE, TIME_LIMIT, TIME_LIMIT_90, TIME_LIMIT_80, and TIME_LIMIT_50 are all valid options.
--mail-user
Specifies the user account to email when job notification emails are sent.
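
Here is a rough sketch combining a few of these options in a script. The -D directory, the "avx" constraint (taken from the Job Constraints section below), and the mail address are placeholders; the logs directory must already exist before the job is submitted:

#!/bin/bash
#SBATCH -p intel
#SBATCH -n 1
#SBATCH --mem=1G
#SBATCH -t 00:10:00
#SBATCH -J options_demo
#SBATCH -D logs
#SBATCH -o slurm-%j.out
#SBATCH -C "avx"
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=username@ku.edu

echo "Job ${SLURM_JOB_ID} ran on ${HOSTNAME}"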

Job Arrays

There are two general approaches to submitting a large number of cluster jobs at once. The first is to submit jobs to the scheduler using srun in a loop on the command line. The preferable, and more powerful, approach uses job arrays to submit large blocks of jobs all at once with the sbatch command.

The --array parameter for sbatch allows the scheduler to queue up hundreds to thousands of jobs with the same resource requests. This method is much less taxing on the cluster scheduler and simplifies the process of submitting a large number of jobs all at once. These arrays usually consist of the same program fed different parameters dictated by the job array indices.

An example job array script is shown below:

#!/bin/bash
#SBATCH -p intel
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 1
#SBATCH --mem=1G
#SBATCH -t 00:20:00
#SBATCH -J test_job
#SBATCH -o logs/%A_%a.out
#SBATCH --array=1-4

echo Job ${SLURM_ARRAY_TASK_ID} used $(awk "NR == ${SLURM_ARRAY_TASK_ID} {print \$0}" ${SLURM_SUBMIT_DIR}/parameters)

Example output:

[username@front1 ~]$ sbatch array_test.sh
[username@front1 ~]$ cd logs/
[username@front1 logs]$ ls
49219_1.out  49219_2.out  49219_3.out  49219_4.out
[username@front1 logs]$ cat *
Job 1 used line 1 parameters
Job 2 used line 2 parameters
Job 3 used line 3 parameters
Job 4 used line 4 parameters
[username@front1 logs]$

Parameters file:

line 1 parameters
line 2 parameters
line 3 parameters
line 4 parameters

In this example, the %A and %a symbols in the job log file path are replaced by the scheduler with the job array ID and job array index, respectively, for each job in the array. The --array option specifies the creation of a job array which consists of four identical jobs with indices ranging from 1 to 4. Each job in the array is created with the same resource request at the top of the file, and runs the same bash command at the bottom of the script file. The echo command prints out the SLURM_ARRAY_TASK_ID (job array index) environment variable of each job, along with one line from a file called "parameters". The awk command within the echo selects the line in the parameters file whose line number matches the job array index value. This technique can be used to feed specific parameters to different jobs within a job array.
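
One practical note: the scheduler does not create the logs directory or the parameters file for you, so set them up before submitting. A quick sketch of the preparation steps assumed by this example:

[username@front1 ~]$ mkdir -p logs
[username@front1 ~]$ printf 'line %d parameters\n' 1 2 3 4 > parameters
[username@front1 ~]$ sbatch array_test.sh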

Another way of generating program parameters for job arrays is through arithmetic. For example, if you wanted to define minimum and maximum values for a job to loop through based on its index value, you might include something like this in your job script (with array index 3, this gives MIN=2000 and MAX=3000, so the loop runs from 2000 through 2999):

MAX=$(echo "${SLURM_ARRAY_TASK_ID} * 1000" | bc)
MIN=$(echo "(${SLURM_ARRAY_TASK_ID} - 1) * 1000" | bc)

for (( i=$MIN; i<$MAX; i++ )); do
  # Perform calculations...
done

Cluster Partitions

Cluster partitions, or queues, are sets of nodes in the cluster grouped by their features. Currently, there are four partitions in the ITTC cluster: intel, amd, bigm, and gpu. The intel and amd partitions are made up of nodes that contain exclusively Intel and AMD CPUs, respectively. The bigm queue is made up of nodes with 256 to 500GB of RAM, and the gpu partition contains nodes with Nvidia GPU co-processors. Partitions can be specified in a job script with the -p option:

#SBATCH -p intel

They can also be specified in interactive sessions:

srun -p intel -N 1 -n 1 --pty /bin/bash

Partitions allow for high-level constraints on job hardware, but lack fine-grained control over things like cpu and gpu architecture.
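
To see which nodes make up a partition and their current state, you can query the scheduler directly; for example, for the intel partition:

[username@front1 ~]$ sinfo -p intel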

Job Constraints

Job constraints allow precise specification of what hardware a job should run on. CPU architectures and instruction sets can be requested, as well as the networking type, node manufacturer, and memory. Specifying hardware constraints is done with the -C option:

#SBATCH -C "intel"

Multiple constraints can also be specified at once:

srun -C "intel&ib" --pty /bin/bash

In this example, the & symbol between the two constraints specifies that both should be fulfilled for the job to run. The | symbol can also be used to specify that either one or the other constraint can be fulfilled. Additionally, square-brackets can be used to group together constraints. Here is an example combining all three:

#SBATCH -C "[intel&ib]|[amd&eth_10g]"

Available constraints:

    Instruction Set
    • sse3
    • sse4_1
    • sse4_2
    • sse4a
    • avx
    CPU Brand/Cores
    • intel
    • amd
    • intel8
    • amd8
    • intel12
    • intel16
    • intel20
    Networking
    • ib
    • ib_ddr
    • ib_qdr
    • noib
    • eth_10g
    Manufacturer/CPU Brand/Cores/Memory
    • del_int_8_16
    • del_int_8_24
    • del_int_12_24
    • asu_int_12_32
    • sup_int_12_32
    • asu_int_12_128
    • del_int_16_64
    • del_int_16_256
    • del_int_20_256
    • del_int_16_512
    • del_int_20_128
    • del_amd_8_16
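
To check which of these constraint features each node actually advertises, you can ask sinfo to print per-node feature lists. A minimal sketch; the format string simply selects node names and their available features:

[username@front1 ~]$ sinfo -N -o "%N %f"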

GPU Jobs

Instead of using hardware constraints, GPUs are specified with Generic Resource (gres) requests. Below is an example of an interactive GPU job request:

srun -p gpu --gres="gpu:k20:2" --pty /bin/bash

This request specifies two Nvidia K20 GPUs in the GPU queue for the interactive session, along with the default job resources. The --gres option allows the specification of the GPU model and count through a colon-delimited list. Below is a job script example:

#SBATCH -p gpu
#SBATCH --gres="gpu:k40:1"

The GPU partition must be specified when requesting GPUs, otherwise the scheduler will reject the job. Whenever a job is started on a GPU node, the environment variable CUDA_VISIBLE_DEVICES is set to contain a comma-delimited list of the GPUs allocated to the current job. Information about these GPUs can be viewed by running nvidia-smi.

Here is example output from the srun example above:

[username@front1 ~]$ srun -p gpu --gres="gpu:k20:2" --pty /bin/bash
[username@g002 ~]$ echo $CUDA_VISIBLE_DEVICES
1,2
[username@g002 ~]$ nvidia-smi
Fri Jan 20 16:23:01 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20m          Off  | 0000:02:00.0     Off |                    0 |
| N/A   30C    P0    47W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20m          Off  | 0000:03:00.0     Off |                    0 |
| N/A   29C    P0    47W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K20m          Off  | 0000:83:00.0     Off |                    0 |
| N/A   28C    P0    48W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K20m          Off  | 0000:84:00.0     Off |                    0 |
| N/A   28C    P0    51W / 225W |      0MiB /  4742MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
[username@g002 ~]$ 

Currently, there are eight different GPU models available in the cluster:

    Gpu Models:
    • k20
    • k40
    • k80
    • titanxp
    • p100
    • v100s
    • titanrtx
    • a100
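
Putting this together, a minimal batch script requesting a single GPU in the gpu partition might look like the sketch below. The k40 model is taken from the list above; adjust the model, core count, memory, and time limit to your needs:

#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres=gpu:k40:1
#SBATCH -c 4
#SBATCH --mem=8G
#SBATCH -t 01:00:00
#SBATCH -J gpu_test
#SBATCH -o slurm-%j.out

# CUDA_VISIBLE_DEVICES is set by the scheduler to the GPUs allocated to this job
echo "Allocated GPUs: ${CUDA_VISIBLE_DEVICES}"
nvidia-smi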

MPI Jobs

If you want to use multiple processors with MPI, you need to request multiple tasks with the -n option. You will also need to load a version of MPI (for example, OpenMPI). An example MPI job script is shown below:

#!/bin/bash

#SBATCH -p intel
#SBATCH -n 4
#SBATCH --mem=1GB
#SBATCH -t 00:05:00
#SBATCH -J mpi_example
#SBATCH -o slurm-%j.out

module load openmpi
mpirun $HOME/helloworld

The script above will launch four tasks of helloworld. Below is the example output:

Hello world from processor n097.local, rank 0 out of 4 processors
Hello world from processor n097.local, rank 1 out of 4 processors
Hello world from processor n097.local, rank 2 out of 4 processors
Hello world from processor n097.local, rank 3 out of 4 processors
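
If you still need to build the helloworld example, a typical workflow is to load the same MPI module and compile with the MPI wrapper compiler. A sketch, assuming a C source file named helloworld.c in your home directory:

[username@front1 ~]$ module load openmpi
[username@front1 ~]$ mpicc $HOME/helloworld.c -o $HOME/helloworld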

GUI Access

Open OnDemand Desktop

This is the new, recommended way of getting GUI access to the I2S cluster. Go to https://ondemand.ittc.ku.edu in your web browser, and sign in with your I2S credentials. To get GUI access on a node, click on "I2S Cluster Desktop" and fill in the form with the resources you need. The "Account" field is not currently active, so you can leave it blank. The "Partition" field defaults to the intel partition.

If you want to use a GPU node, you will need to change the partition to "gpu" (without quotes) and put your desired GPU in the gres field. For example, if you wanted one Titan XP, you would put gpu:titanxp:1 in the gres field.

Once you are satisfied, hit the "Launch" button to queue your request. You can monitor the state of your request under the "My Interactive Sessions" tab. Once your session is running, you can click "Launch I2S Cluster Desktop" to view an XFCE desktop on the node, in your browser. From here, you can open the terminal and launch whatever graphical application you like.

Open OnDemand has a few other Interactive Applications set up, besides the I2S Cluster Desktop. "Code Server" launches a VSCode-like interface in your browser from a cluster node. The "MATLAB" application will automatically launch MATLAB in a minimal desktop environment on a cluster node. If you want to run a GUI application that is not VSCode or MATLAB, you can launch it inside the generic I2S Cluster Desktop.

X11 forwarding

Access to a GUI running on the cluster may be accomplished with X11 forwarding. Data from the remote application is sent over ssh to an X server running locally. Each additional ssh connection between the local machine and the cluster must be started with X11 forwarding enabled. To request an interactive shell with X11 forwarding, you can use the "--x11" option. The following steps assume that the local machine has an X server running.

  1. Login via ssh to front1 or front2. Make sure your local ssh client has X11 forwarding enabled. If you are using ssh on the command line, add the "-X" flag to your ssh command.
  2. Start an interactive session with X11 forwarding. Be sure to request the number of cores, amount of memory, and walltime to complete your job. Syntax:
    srun --x11 -N 1 -n 2 --mem=4096mb -t 8:00:00 --pty /bin/bash
  3. After starting an interactive session with X11 forwarding, you can now launch graphical programs from the terminal.
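
Once the interactive session is running, a quick way to verify that forwarding works is to launch a simple X client, assuming one such as xclock is installed on the node:

[username@n097 ~]$ xclock &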

RDP

Remote Desktop Protocol provides a user with a graphical interface to connect to another computer over a network connection. You will need an RDP client installed and will need to be connected to the KU Anywhere VPN. You can RDP to either front1.ittc.ku.edu or front2.ittc.ku.edu.

General Cluster Information

Software Environment

All cluster nodes run Rocky Linux 9.4 with GCC version 11.4.1. Cluster applications are installed as modules in /moosefs/apps/9/arch/generic/spack/modulefiles/linux-rocky9-x86_64.

Environment Modules

Cluster software is made available through environment modules. A list of available modules can be viewed by running:

module avail

Modules shown in the list can be loaded with the following command:

module load module_name

To persist loaded modules between interactive sessions, add module load commands for the applications you want to your ~/.bash_profile file if you are using bash, or to ~/.cshrc if you are using tcsh or csh.
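
For example, to have a module loaded automatically in every new bash login shell, you could append a line like the following to ~/.bash_profile (openmpi is just an example; use any module name shown by module avail):

# ~/.bash_profile (excerpt)
module load openmpi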

To view all loaded modules in your current shell session, use the module list command. To unload all currently loaded modules, you can use the module purge command. For more information on the module command and its options, see the documentation for further detail.

Installing Your Own Software

If you need a specific version of a particular package, or you need to install lots of your own Python packages, we would recommend the use of miniconda, spack, or Charliecloud. For users who do not have strict compiler/mpi requirements, miniconda is probably the easiest option.

To ensure your independence from our modules system, we would strongly recommend installing your choice of package manager to your /home or /scratch folder.

If you're looking to use a package manager like apt, yum, or dnf to install packages, we recommend using Charliecloud to build out your own containers. We do not give out root privileges on the cluster, but Charliecloud uses user namespaces and seccomp to allow you to use package managers as if you had root privileges.

Installing miniconda

Miniconda is a miniature installation of Anaconda that only contains conda, python, their dependencies, and a few other small packages.

You can see if the package you wish to use is available here: https://anaconda.org

You can download the installer into your scratch folder with the following command:

$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

From here, you can install miniconda using:

$ bash Miniconda3-latest-Linux-x86_64.sh

Note that by default this installs Miniconda3 in your home directory, not your scratch folder.
You will be prompted for the install location if you wish to change it.

Now you can configure this base install of conda using the following string of commands:

$ source [PATH_TO_MINICONDA]/etc/profile.d/conda.sh
$ conda config --system --set default_threads 4
$ conda config --system --prepend channels conda-forge
$ conda config --system --set channel_priority strict
$ conda config --system --set auto_update_conda False

With conda configured, you can now create an environment using the following:

$ conda create -n [Environment Name]

In order to install a package in this environment, you can activate it, then install, e.g.:

$ conda activate [Environment Name]
$ conda install [Package Name]
$ conda deactivate

Optionally, you can specify a package to install at the same time as creating the environment:

$ conda create -n [Environment Name] [package name found on anaconda site]

The python version and package version can also be specified, e.g.:

$ conda create -n myenv python=3.13 scipy=1.15.1

You can now use the packages in this environment by activating it:

$ conda activate [Environment Name]

You may also want to look into creating an environment.yml file to describe your conda environment; this helps with reproducibility.
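
A minimal sketch of that workflow, assuming an existing environment named myenv: export the environment to a file, then recreate it later (or on another machine) from that file.

$ conda activate myenv
$ conda env export > environment.yml
$ conda env create -f environment.yml -n myenv-copy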


Charliecloud Quickstart

In this quickstart, we will use Charliecloud in an ad-hoc way to try to closely resemble the experience on a typical Ubuntu desktop. However, it is often preferred to write a Dockerfile and point ch-image towards it, for reproducibility's sake.

First, load the Charliecloud module:

module load charliecloud

Pull the ubuntu image:

ch-image pull ubuntu
ch-image list

Now, you can copy the container to your current directory.

ch-convert ubuntu ./ubuntu

We do this because Charliecloud puts newly-pulled container images in /var/tmp. We do not want to keep our image here since /var/tmp will be wiped out by the end of our jobs. This command will copy the "ch-image" container in /var/tmp into a "dir" container in our current working directory.

To get a root shell inside the ./ubuntu dir container:

ch-run -w --uid 0 --seccomp ./ubuntu -- /bin/bash

The --seccomp option is important if you're wanting to install anything with apt or dnf.

Once you're inside the container, you can run the usual commands to update the package index and then upgrade. For demonstration's sake, we'll then install python3.

apt update
apt upgrade
apt install python3

Now, python3 has been installed into this container. You can continue to use the "dir" container with ch-run, but for performance reasons it is recommended to pack your container into a read-only squashfs image once you have what you need installed. You can use ch-convert to do this.

ch-convert ./ubuntu ubuntu.sqfs

This will put ubuntu.sqfs in your current directory. Now, you can run /bin/bash inside your newly created squashfs container, bind mounting /home. If you would like to bind /scratch to /scratch inside the container, you will need to create the mountpoint before you convert the container into a squashfs image.

ch-run ubuntu.sqfs -b /home:/home -- /bin/bash

And now you have a shell inside a custom, read-only Ubuntu container!
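
For example, to make /scratch available inside the container as well, you would create the mountpoint in the dir container before the ch-convert step above, then run with an extra bind. A sketch:

mkdir -p ./ubuntu/scratch           # create the /scratch mountpoint inside the dir container
ch-convert ./ubuntu ubuntu.sqfs     # then convert as above
ch-run ubuntu.sqfs -b /home:/home -b /scratch:/scratch -- /bin/bash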

Copying NVIDIA libraries into Charliecloud container

If you need to make use of GPUs on the host from inside the container, you'll need to copy the nvidia libraries into it. Charliecloud comes with the ch-fromhost tool to do this.

ch-fromhost --nvidia <container dir, for example, ./ubuntu>

Once this finishes, you should be able to run nvidia-smi inside the container.

Filesystems

Below is a list of filesystems available on the cluster:

/home
Personal storage assigned to every user. Default quota: 50GB
/scratch
Private working storage to run cluster jobs. Default quota: 1TB
/work
Shared group storage. Default quota: 1TB
/users
Stores private home directories. Avoid running cluster jobs out of this directory. Default quota: 5GB
/oldscratch
/scratch from the old cluster, mounted read-only for easy transfer of files. Default quota: 1TB
/oldwork
/work from the old cluster, mounted read-only for easy transfer of files. Default quota: 1TB
/tmp
Local storage on cluster nodes, purged at the end of each job. No quota.

Debugging

The cluster has a number of tools at your disposal for debugging submitted Slurm jobs. The most basic debugging information available is from the log files generated by running your job, which contain the STDERR and STDOUT output from the job. Log files are located within the submit directory with the filename slurm-<job id>.out, such as slurm-49321.out.

You can retrieve detailed job information using the command scontrol show jobid -dd <jobid>. Likewise, if you want to view a job step's output while the job is running, add the --output option to srun in your job batch file. For an unbuffered stream of STDOUT, which is quite useful for debugging, add the -u or --unbuffered option to srun in your job batch file.
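
For example, inside a job batch file you might wrap a long-running step like this (my_program is a placeholder), writing its unbuffered output to a per-job file that you can watch while it runs:

srun -u --output=step-%j.out ./my_program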


Helpful Commands

The Slurm scheduler has a number of utilities for finding information on the status of your jobs. Below are listed a few of the most useful commands and options for quickly finding this information.

Useful Slurm commands:
sacct
Lists information on finished and currently running jobs, including job status and exit codes.
sacct -u <username>
Lists information on currently running and recently finished jobs for the specified user.
sacct -S <start-date> -s <state>
Lists all jobs that started after the specified date or time and are in the specified state.
scancel -u <username> -t <state>
Cancels all of the jobs for the specific user that are in the specified state.
scontrol hold <jobid>
Holds the specified job by putting it in a 'HOLD' state.
scontrol resume <jobid>
Resumes the specified job from the 'HOLD' state.
scontrol show job <jobid>
Shows detailed queue and resource allocation information for the specified job.
sinfo
Displays information on all of the cluster partitions, including the nodes available in them.
sinfo -T
Shows information on cluster node reservations, including reservation period, name, and reserved nodes.
squeue
Displays the short-form information for all currently running and queued jobs.
squeue -u <username> -l
Lists the long-form information about currently running jobs for a specific user.
squeue -u <username> -t <state>
Lists information about a specific user's jobs that are in the specified state.
sview
If X11 forwarding is enabled, this command launches a graphical interface for viewing cluster information.

MMICC NVIDIA L40S GPU node

Requirement

You must have an ITTC/I2S Account and be part of the i2s-mmicc-gpu group. Dr. Suzanne Shontz manages this group.

Submitting Jobs on MMICC NVIDIA L40S GPU node

Instead of using hardware constraints, L40S GPUs are specified with Generic Resource (gres) requests. Below is an example of an interactive GPU job request on the mmicc partition with one L40S GPU:

srun -p mmicc -c 8 --mem 16G --gres="gpu:l40s:1" --pty /bin/bash
The options used in this example are all detailed below:
-p
Specifies a partition.
-c
Sets the number of requested CPUs per task.
--mem
This specifies the requested memory per node. Memory amounts can be given in Kilobytes (K), Megabytes (M), and Gigabytes (G).
--gres
Allows the specification of the GPU model and count through a colon-delimited list.
--pty
This option puts the srun session in pseudo-terminal mode. It is recommended to use this option if you are running an interactive shell session.
/bin/bash
The last option in an srun invocation is the program that srun will execute on the requested node. In this case, bash is specified to start an interactive shell session.

Note that the default time limit for a job in the MMICC partition is 3 hours. Jobs that run past their allotted time will be stopped. To request more time, you'll need to use the -t parameter. This command will create a job in the mmicc partition with an 8 hour time limit:

srun -c 8 --gres=gpu:l40s:1 --mem=8G -p mmicc -t 08:00:00 --pty /bin/bash

You can verify your new time limit like this:

[username@g030 ~]$ squeue --long
Thu Oct 17 15:42:06 2024
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
             46093     mmicc     bash username  RUNNING       1:34   8:00:00      1 g030


A GPU partition (mmicc in this case) must be specified when requesting GPUs, otherwise the scheduler will reject the job. Whenever a job is started on a GPU node, the environment variable CUDA_VISIBLE_DEVICES is set to contain a comma-delimited list of the GPUs allocated to the current job. Information about these GPUs can be viewed by running nvidia-smi. Here is example output from the srun example above:

[username@front1 scratch]$ srun -p mmicc -c 8 --mem 16G --gres="gpu:l40s:2" --pty /bin/bash
[username@g030 scratch]$ echo $CUDA_VISIBLE_DEVICES
0,1
[username@g030 scratch]$ nvidia-smi
Tue Oct 15 22:33:50 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03              Driver Version: 560.28.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40S                    On  |   00000000:0D:00.0 Off |                    0 |
| N/A   24C    P8             22W /  350W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L40S                    On  |   00000000:B5:00.0 Off |                    0 |
| N/A   26C    P8             24W /  350W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

sbatch on MMICC NVIDIA L40S GPU node

sbatch is used to submit jobs to the cluster using a script file. The --gres option allows the specification of the L40S GPU model and count through a colon-delimited list. Below is a job script example:

#!/bin/bash
#SBATCH -p mmicc
#SBATCH --gres=gpu:l40s:1
#SBATCH -c 4
#SBATCH --mem=8GB
#SBATCH -t 00:20:00 
#SBATCH -J test_job
#SBATCH -o slurm-%j.out
 
echo "Job ${SLURM_JOB_ID} ran on ${HOSTNAME}"

Example Output

[username@front1 ~]$ sbatch test_job.sh
[username@front1 ~]$ cat slurm-47491.out
Job 47491 ran on g030

Cluster Hardware

Visit the Cluster Hardware page for a complete listing of all of the nodes in the cluster and their hardware configurations.