This is the web page for Introduction to Data Science at the University of Florida.
Before you can use HiPerGator, you need to have an account on the HiPerGator cluster. If you don’t have an account, you can request one here.
As part of the class, you have been given access to the HiPerGator cluster. This document points you to resources that will help you understand the cluster and get started.
Please take the time to read all of this document. If you would like more information, you can sign up for training through the New User Training page.
HiPerGator is a high-performance computing cluster at the University of Florida. It is a shared resource that is available to all UF faculty, staff, and students. The cluster is managed by the Research Computing department at UF.
To use HiPerGator, you will log in with your GatorLink credentials. There are two ways to access HiPerGator: (1) the web interface, or (2) SSH.
The HiPerGator web interface is called Open OnDemand. You can access Open OnDemand by going to https://ood.rc.ufl.edu and logging in with your GatorLink credentials.
Open OnDemand provides web interfaces for SSH, Jupyter Notebook, and many other tools.
Open OnDemand allows you to host Jupyter Notebooks on the cluster. To start one, select Interactive Apps, choose the Jupyter Notebook app, select the resources you need, and start the notebook.
For more information, visit the help wiki.
You can also use SSH to connect to HiPerGator from a terminal, logging in with your GatorLink credentials.
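A terminal connection might be opened as sketched below. The hostname hpg.rc.ufl.edu is an assumption here, taken from the general UF Research Computing documentation rather than from this page, and gatorlink is a placeholder for your own username.

```shell
GATORLINK="gatorlink"        # placeholder: replace with your GatorLink username
HPG_HOST="hpg.rc.ufl.edu"    # assumed login hostname; confirm it in the UFRC docs

# Show the exact command first so you can check the username and host.
echo "ssh ${GATORLINK}@${HPG_HOST}"

# When it looks right, remove the leading '#' to open the session:
# ssh "${GATORLINK}@${HPG_HOST}"
```

You will be prompted for your GatorLink password once the connection is made.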
If you use VS Code for development, you can create a remote tunnel. Do not use the VS Code Remote extension to connect to HiPerGator. For details on using VS Code with HiPerGator, see the Research Computing wiki page.
The HiPerGator supercomputer is a shared resource. When you use SSH, you connect to a login node. You can run commands on the login node, but you should not run long-running jobs there. Instead, you should submit your jobs to the scheduler.
HiPerGator uses a scheduler called Slurm. You can submit jobs to the scheduler using the sbatch or srun command. You can monitor your jobs using the squeue command.
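The three commands fit together roughly as in the sketch below. The script name my_job.sh is hypothetical, and the guard is only there so the snippet does nothing harmful on a machine without Slurm.

```shell
JOB_SCRIPT="my_job.sh"   # hypothetical batch script; use your own file name

if command -v sbatch >/dev/null 2>&1 && [ -f "$JOB_SCRIPT" ]; then
    sbatch "$JOB_SCRIPT"    # queue the script and return immediately
    squeue -u "$USER"       # list your own pending and running jobs
else
    echo "Skipped: needs Slurm and ${JOB_SCRIPT}; run this on a login node."
fi

# srun runs a single command through the scheduler and waits for it, e.g.:
# srun hostname
```

Use sbatch for work that can run unattended, and srun when you want to wait for the result in your terminal.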
Slurm is a job scheduler that is used to manage the resources on the cluster. Below is a video that can help you understand how to use Slurm for your workloads.
For CIS 6930 Data Engineering in Spring 2025, we should use the following resources.
Your username is the same as your GatorLink name.
You are also part of the group cap5771.
You will need this group ID to run Slurm commands, by adding the option --qos=cap5771.
You can check your group by running the id command:
id gatorlink
Add the line below to your sbatch scripts:
#SBATCH --qos=cap5771
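Put together, a small batch script for the class might look like the sketch below. Only the --qos line comes from this page; the job name, resource requests, time limit, and file names are illustrative assumptions to adjust for your own work.

```shell
# Write a minimal job script to hello_job.sh (hypothetical file name).
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello          # name shown by squeue (assumed)
#SBATCH --qos=cap5771             # class QOS, as given above
#SBATCH --ntasks=1                # one task
#SBATCH --cpus-per-task=1         # one CPU core for that task
#SBATCH --mem=1G                  # 1 GB of memory (assumed request)
#SBATCH --time=00:05:00           # five-minute wall-clock limit (assumed)
#SBATCH --output=hello_%j.log     # %j expands to the job ID

echo "Running on $(hostname)"
EOF

# Submit it from a HiPerGator login node with:
# sbatch hello_job.sh
```

Keeping the requests small while you test means the job uses little of the allocation shared by the class.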
Your home directory will be /blue/cap5771/gatorlink, where gatorlink is your GatorLink username.
You also have access to a class folder in case there is a need to share resources.
The class share folder is /blue/cap5771/share.
This folder, all subfolders, the account (if this is your only account), and all data they contain will be deleted shortly after the Date of Commencement, 04 May 2025, with no further warning. Any data you wish to save must be downloaded before that date.
The group has access to 64 cores, 500 GB of memory, and 4 TB of storage in total. Each home directory has a limit of 40 GB. If you require more space, contact the course staff.
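To see roughly how close you are to the 40 GB limit, something like the sketch below can help. It assumes the limit is measured the way du counts disk usage, which this page does not state, and gatorlink is a placeholder for your own username.

```shell
CLASS_DIR="/blue/cap5771/gatorlink"   # replace gatorlink with your username
LIMIT_GB=40                           # per-directory limit stated above

if [ -d "$CLASS_DIR" ]; then
    # du -sk prints the total size in 1024-byte blocks, then the path.
    used_kb=$(du -sk "$CLASS_DIR" | cut -f1)
    used_gb=$(( used_kb / 1024 / 1024 ))   # integer division rounds down
    echo "Using about ${used_gb} GB of ${LIMIT_GB} GB in ${CLASS_DIR}"
else
    echo "${CLASS_DIR} not found: run this on HiPerGator."
fi
```

On directories with many files, du can take a while to finish.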
You must agree to the following acceptable use policy to use the resources on HiPerGator.
ACCEPTABLE USE
I acknowledge that the access to the HPC resources operated by UF Research Computing is subject to the UF Acceptable Use Policy at https://it.ufl.edu/policies/acceptable-use/acceptable-use-policy/ and the Research Computing policies at https://www.rc.ufl.edu/documentation/policies/ and that I am responsible for following these policies.
RESTRICTED DATA
I also certify that using restricted data and software on the HPC resources requires extra steps described at UFRC Policies and at UFRC Export Policies, and that I will notify both my account sponsor and the Office of Research (Research Compliance) and Research Computing at