TUNI TCSC Cluster
This guide is based on a similar one written for the previous TUNI cluster Narvi, which can be found in the X Files of this Wiki (TUNI Narvi Cluster).
This document describes how to use the TUNI TCSC computing cluster.
What is TCSC?
- TCSC is the SLURM cluster that replaced the old Narvi cluster in 2025 (which in turn had replaced the even older Merope cluster in 2017).
- The cluster is used by researchers, faculty members, and students at Tampere University.
- There are powerful CPU-only nodes (e.g. Xeon) and GPU nodes (e.g. 4xV100 and 8xH200) - details about the available resources can be found on the TCSC web page.
How to Get an Account?
User rights are granted per research team and maintained by the team leader (PI). Details about obtaining the rights are found on the TUNI intranet, and the process can take several days.
Once you have the account ready, you can access the computing resources through a dedicated web interface or the old-fashioned way, using a terminal with ssh. Note that outside the campus network a VPN connection may be needed.
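For example, assuming the login host is xxxx.tuni.fi (a placeholder - use the actual TCSC login address given in the intranet instructions):
ssh <username>@xxxx.tuni.fi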
How to Check the Queue
To see the status of the queue, type
squeue
# for a specific partition (e.g. `test`)
squeue -p test
# for a specific user
squeue -u <user>
Available job queues are listed in the internal TCSC wiki (access is limited).
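You can also list the partitions and node states directly on the cluster with the standard SLURM command sinfo:
# show partitions, their time limits, and node states
sinfo
# a more compact per-partition summary
sinfo -s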
How to Run a Job
Remember: do not use the login node for computation – it is slow for that purpose and you will degrade the performance of the login node for other users!
There are two common ways to run a job on a Slurm cluster: srun and sbatch.
The main difference is that srun is interactive, which means the terminal is attached to the current session. The experience is just like running any other command in your terminal. Note that when the queue is full you will have to wait until you get the resources.
If you use sbatch, you submit your job to the slurm queue and get your terminal back; you can disconnect, kill your terminal, etc. with no consequence. In the case of srun, killing the terminal would kill the job. Hence, sbatch is recommended.
Here is an example srun command which asks the cluster to start an interactive shell with 1 GPU (--gres), 10 CPUs (--cpus-per-task), and 10 GB of RAM in total (--mem-per-cpu 1G × 10 CPUs), available to you for 30 minutes (--time):
srun \
--pty \
--job-name pepe_run \
--partition gpu \
--gres gpu:1 \
--mem-per-cpu 1G \
--ntasks 1 \
--cpus-per-task 10 \
--time 00:30:00 \
/bin/bash -i
And this is an example sbatch command which asks the cluster to run my_script.sh with 1 GPU, 10 CPUs, and 10 GB of RAM for at most 30 minutes (the job ends as soon as the script finishes). The output and error logs will be saved to log_JOBID.txt (--output, --error):
sbatch \
--job-name pepe_run \
--partition gpu \
--gres gpu:1 \
--mem-per-cpu 1G \
--ntasks 1 \
--cpus-per-task 10 \
--time 00:30:00 \
--output log_%j.txt \
--error log_%j.txt \
my_script.sh
You can also add --constraint='kepler|pascal|volta' in order to select a specific GPU architecture.
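For instance, to restrict the sbatch example above to one architecture only (the available architecture names are cluster-specific, so check the TCSC wiki for what actually exists):
# request a Volta GPU only; other flags as in the sbatch example above
sbatch --partition gpu --gres gpu:1 --constraint='volta' my_script.sh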
Instead of specifying the resources and other information as command-line arguments, you may find it useful to list them inside of my_script.sh and then just use sbatch my_script.sh:
#!/bin/bash
#SBATCH --job-name=pepe_run
#SBATCH --gres=gpu:1
#SBATCH --time=00:30:00
# and so on. To comment out an SBATCH entry use `##SBATCH --arg ...`
# here starts your script
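For reference, here is a rough sketch of a complete batch script equivalent to the sbatch example above; the final command is just a placeholder for your own program:
#!/bin/bash
#SBATCH --job-name=pepe_run
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem-per-cpu=1G
#SBATCH --time=00:30:00
#SBATCH --output=log_%j.txt
#SBATCH --error=log_%j.txt
# load the modules you need and run your computation,
# e.g. (placeholder program name)
./my_program
Then simply submit it with sbatch my_script.sh.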
To learn more sbatch tricks, the reader is also referred to this nice tutorial.
How to Cancel My Job
To cancel a specific job you are running, use
scancel <JobID>
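If needed, you can also cancel all of your own jobs at once (scancel supports filtering by user):
scancel -u <user>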
How to Transfer Data?
The simplest way is to use the scp command
scp -r ./folder user@xxxx.tuni.fi:/my/path/
-r means to copy the folder with all files in it.
However, if the internet connection is interrupted you will need to start all over again. To be able to resume the data transfer, try rsync instead
rsync -ahP ./folder user@xxxx.tuni.fi:/my/path/
-a preserves permissions, symlinks, etc. as in the original folder, -h makes the output "human-readable", and -P shows progress and allows resuming the data transfer (only the missing files are sent to the target path 🤓).
A trailing / in rsync makes a difference:
- rsync /dir1/dir2/ /home/dir3 – copies the contents of /dir1/dir2 but not the dir2 folder itself.
- rsync /dir1/dir2 /home/dir3 – copies the folder dir2 along with all its contents.
If you would like to browse the files of the remote machine, you may mount the folder locally. On Ubuntu/Debian, install sshfs and run this
mkdir my_folder
sshfs user@xxxx.tuni.fi:/my/folder/ ./my_folder
/my/folder will be shown in ./my_folder. Mind that changes in either folder will be reflected in the other one.
To unmount the folder use
umount ./my_folder
How Do I Install My Software
Slurm modules (best option)
Before you install your own software, check whether the software you need is already installed by the admins (e.g. matlab, cuda, and gcc). These are set up using the module functionality. You can load a module by adding module load <mod> to your script. To see all available modules, run module avail.
Examples are given in the TCSC internal Wiki, and a good tutorial is provided by CSC in their Slurm Modules tutorial slide set.
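A typical workflow might look like this (the module name and version below are only placeholders; run module avail to see what actually exists on TCSC):
# list all available modules
module avail
# load one of them (placeholder name/version)
module load gcc/13.2.0
# show the currently loaded modules
module list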
If you are not satisfied with the selection, you can install your own software. Below are several options for doing that.
Using Conda (avoid if you can)
Conda allows you to
- Fully control what software packages are installed
- Have own environments for different projects
- Copy environments from your computer to a TCSC computing node
However, using conda is not recommended since
- it can be problematic to use together with Slurm modules
- sbatch startup can be slow - read more in the CSC best practices for Conda
conda Has Many Linux Tools
Besides a ton of Python packages, conda has surprisingly many common Linux tools, e.g. tmux, htop, ffmpeg, vim, and more. This is especially useful if you would like to install them but do not have sudo rights.
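For example, once a conda environment is active (see the next section), such tools can typically be installed from the conda-forge channel:
conda install --channel=conda-forge tmux htop ffmpeg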
Creating a conda Environment
A minimal Conda is available in the miniforge package. Log in to the TCSC computer and load the module
module load miniforge3/24.9.0
Let's start by creating an empty conda environment
conda create --name my_env
Activate it (meaning that all binaries installed in this environment will be used instead of the system-wide packages)
conda activate my_env
# if it didn't work try `source activate my_env`
Afterward, you can install conda packages
conda install python pip matplotlib scikit-learn
If the default conda channels don't have the package you need, you can search other conda channels:
conda install dlib --channel=menpo
If your favorite package is not available in any conda channel OR you would like to install OpenCV, try installing it via pip:
# check if you are using the `pip` from your `conda` env
which pip
pip install opencv-python
conda vs pip inside of conda env?
According to official anaconda documentation, you should install as many requirements as possible with conda, then use pip.
Another problem with pip packages inside a conda environment is poor dependency handling and a generally bad experience when trying to replicate the same environment on another machine.
Alternatively, if you already have a Conda environment for your code on your own machine, it can be exported there and recreated on the cluster
conda activate my_env
conda env export > my_env.yml
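# copy my_env.yml to the cluster (e.g. with scp or rsync), then on the cluster: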
module load miniforge3/24.9.0
conda env create -f my_env.yml
# This can take a long time
conda activate my_env
Several guides could be useful:
- Problems running Conda with sbatch - Activating Conda Environments from Scripts: A Guide for Data Scientists
- Conda best practices for CSC supercomputers
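As a rough sketch (the exact recipe depends on the caveats in the guides above), an sbatch script that uses a conda environment might look like this; train.py is a placeholder for your own code:
#!/bin/bash
#SBATCH --job-name=pepe_run
#SBATCH --time=00:30:00
# load the conda module and activate the environment
module load miniforge3/24.9.0
source activate my_env  # `conda activate my_env` may not work in non-interactive shells
# run your code (placeholder)
python train.py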
Using Conda and existing modules (Better than only Conda)
Someone must write
Using Python virtual environment
Someone must write
Using containers
Someone must write