Commit 1e9c4681 authored by Sören Wacker

add consistent structure and exercises to all tutorials

parent a8e2a0b2
+97 −5
@@ -10,9 +10,101 @@ menu:
    weight: 2
---

In-depth tutorials for learning DAIC:
## Learn DAIC from the ground up

- [Bash Basics](/tutorials/bash/) - Essential command-line skills
- [Slurm Basics](/tutorials/slurm/) - Understanding the job scheduler
- [Apptainer Containers](/tutorials/apptainer/) - Containerize your software environment
- [Vim Basics](/tutorials/vim/) - Edit files efficiently
These tutorials take you from first login to running GPU workloads. Each tutorial builds on the previous one, so we recommend following them in order.

```
┌─────────────────────────────────────────────────────────────────────────┐
│                           YOUR COMPUTER                                  │
│    You write code, prepare data, connect via SSH                        │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │ SSH
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           LOGIN NODE                                     │
│    daic01.hpc.tudelft.nl                                                │
│    - Prepare scripts                                                    │
│    - Submit jobs (sbatch)                                               │
│    - Monitor jobs (squeue)                                              │
│    - Edit files (vim, nano)                                             │
│    - Transfer data (scp, rsync)                                         │
│                                                                         │
│    DO NOT run computations here!                                        │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │ Slurm
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         COMPUTE NODES                                    │
│    gpu01, gpu02, ... gpu45                                              │
│    - Run your training scripts                                          │
│    - Access GPUs (L40, A40, RTX Pro 6000)                              │
│    - Process large datasets                                             │
│                                                                         │
│    Managed by Slurm - request resources with sbatch/salloc              │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           STORAGE                                        │
│    /home/<netid>              - 5 MB, config only                       │
│    ~/linuxhome                - 8 GB, personal files                    │
│    /tudelft.net/staff-umbrella/<project>  - Project data                │
│    /tudelft.net/staff-bulk/<project>      - Large datasets              │
└─────────────────────────────────────────────────────────────────────────┘
```

## The learning path

| Tutorial | Time | What you'll learn |
|----------|------|-------------------|
| [Bash Basics](/tutorials/bash/) | 30 min | Navigate the filesystem, manage files, write scripts |
| [Slurm Basics](/tutorials/slurm/) | 45 min | Submit jobs, request GPUs, monitor your work |
| [Apptainer](/tutorials/apptainer/) | 45 min | Package your environment in containers |
| [Vim](/tutorials/vim/) | 30 min | Edit files efficiently on the cluster |

## Which tutorial do I need?

**I just got access to DAIC**
→ Start with [Bash Basics](/tutorials/bash/), then [Slurm Basics](/tutorials/slurm/)

**I know Linux but not clusters**
→ Start with [Slurm Basics](/tutorials/slurm/)

**My code needs specific packages/versions**
→ Read [Apptainer](/tutorials/apptainer/) to containerize your environment

**I need to edit files on the cluster**
→ Learn [Vim](/tutorials/vim/) for efficient editing over SSH

## What you'll be able to do

After completing these tutorials, you'll be able to:

1. Log into DAIC and navigate the filesystem
2. Organize your projects with proper directory structures
3. Transfer data between your computer and the cluster
4. Submit batch jobs that run overnight
5. Request GPUs for deep learning training
6. Run parameter sweeps with job arrays
7. Package complex environments in containers
8. Edit files directly on the cluster

## Getting help

- **Stuck on a command?** Try `man command` or `command --help`
- **Cluster-specific questions?** See our [FAQs](/support/faqs/)
- **Something broken?** Contact [support](/support/contact/)

## Tutorial format

Each tutorial follows the same structure:

- **What you'll learn** - Clear objectives
- **Prerequisites** - What you need to know first
- **Time** - Approximate duration
- **Hands-on exercises** - Practice as you learn
- **Summary** - Key takeaways
- **What's next** - Where to go from here

Now let's get started with [Bash Basics](/tutorials/bash/).
+96 −0
@@ -5,6 +5,19 @@ description: >
  Using Apptainer to containerize environments.
---

## What you'll learn

- Understand why containers are useful for HPC workloads
- Pull prebuilt images from Docker Hub and NVIDIA NGC
- Build custom container images from definition files
- Run containerized applications on DAIC with GPU support
- Manage bind mounts and cache directories

**Prerequisites:** [Slurm Basics](/tutorials/slurm/) (submitting jobs, requesting GPUs)

**Time:** 45 minutes

---

## What is containerization, and why use it?

@@ -554,3 +567,86 @@ To make this permanent, add to your shell profile:
```
echo 'export PATH=$PATH:/path/to/software/bin' >> ~/.bashrc
```
{{% /alert %}}

---

## Exercises

Practice what you've learned with these hands-on exercises.

### Exercise 1: Pull and explore an image

Pull the `python:3.11-slim` image from Docker Hub and explore it:

1. Use `apptainer pull` to download the image
2. Use `apptainer shell` to open an interactive session
3. Check the Python version inside the container
4. List the contents of `/usr/local/lib/python3.11/`
5. Exit the container

### Exercise 2: Run a command in a container

Using the Python image from Exercise 1:

1. Create a simple Python script `hello.py` that prints "Hello from Apptainer!"
2. Use `apptainer exec` to run the script inside the container
3. Try running it with the `-C` flag - what happens to your script?
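
One possible way to approach this exercise is sketched below. It assumes the image from Exercise 1 was saved as `python_3.11-slim.sif`; adjust the `IMAGE` variable if your file has a different name. The guard only skips the container steps on a machine without Apptainer, so the snippet can be tried anywhere.

```shell
# Step 1: create the script.
cat > hello.py <<'EOF'
print("Hello from Apptainer!")
EOF

# Assumed image file name from Exercise 1; change it if yours differs.
IMAGE="python_3.11-slim.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # Step 2: run the script with the container's Python interpreter.
    apptainer exec "$IMAGE" python hello.py

    # Step 3: same command with -C; compare the result with step 2.
    apptainer exec -C "$IMAGE" python hello.py \
        || echo "The -C run failed: check which host directories are visible."
else
    echo "Apptainer or $IMAGE not found; run this on DAIC after Exercise 1."
fi
```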

### Exercise 3: Build a custom image

Create a definition file for a container with your favorite tools:

1. Start from `ubuntu:22.04`
2. Install at least two packages (e.g., `curl` and `jq`)
3. Add a `%runscript` that displays a welcome message
4. Build the image and test it with `apptainer run`
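
As a starting point, the steps above could look roughly like the sketch below. The file names `tools.def` and `tools.sif` and the welcome text are arbitrary choices for illustration, and depending on how Apptainer is set up on your system the build step may need extra options that are not shown here.

```shell
# Write a minimal definition file (tools.def is a hypothetical name).
cat > tools.def <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Runs once at build time, inside the image.
    apt-get update
    apt-get install -y curl jq
    rm -rf /var/lib/apt/lists/*

%runscript
    # Runs each time the image is started with `apptainer run`.
    echo "Welcome! This container provides curl and jq."
EOF

if command -v apptainer >/dev/null 2>&1; then
    # Build the image, then test the runscript.
    apptainer build tools.sif tools.def && apptainer run tools.sif
else
    echo "Apptainer not found; build tools.def on a machine that has it."
fi
```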

### Exercise 4: GPU container on DAIC

Test GPU access with a prebuilt image:

1. Request an interactive GPU session with `salloc`
2. Pull or use an existing PyTorch NGC image
3. Run a Python command that checks `torch.cuda.is_available()`
4. Verify the GPU is detected with `nvidia-smi` inside the container

### Exercise 5: Bind mounts

Practice data isolation:

1. Create a directory with a test file
2. Run a container with `-C` (isolated) and `--bind` to mount only that directory
3. Inside the container, verify you can access the test file but not your home directory
4. Try mounting the directory as read-only with `--mount`
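
A hedged sketch of these steps follows. The directory name `bind_demo`, the mount point `/data`, and the image name are assumptions made for illustration, and the `--mount` option string follows the `type=bind,source=...,destination=...,ro` form described in the Apptainer user guide; check `apptainer exec --help` on your system if it is rejected.

```shell
# Step 1: a directory with a test file (names are arbitrary).
mkdir -p bind_demo
echo "visible inside the container" > bind_demo/test.txt

# Assumed image file name from Exercise 1; change it if yours differs.
IMAGE="python_3.11-slim.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # Steps 2-3: isolate with -C and expose only bind_demo as /data.
    apptainer exec -C --bind "$PWD/bind_demo:/data" "$IMAGE" cat /data/test.txt

    # The home directory inside the container should show none of your host files.
    apptainer exec -C --bind "$PWD/bind_demo:/data" "$IMAGE" ls -A "$HOME"

    # Step 4: read-only mount; the write attempt is expected to fail.
    apptainer exec -C \
        --mount "type=bind,source=$PWD/bind_demo,destination=/data,ro" \
        "$IMAGE" touch /data/new.txt \
        || echo "Write refused: /data is mounted read-only."
else
    echo "Apptainer or $IMAGE not found; run this on DAIC after Exercise 1."
fi
```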

---

## Summary

You learned how to:

- **Pull images** from Docker Hub and NVIDIA NGC
- **Build images** from definition files with `%post` and `%runscript` sections
- **Run containers** with `shell`, `exec`, and `run` commands
- **Enable GPU access** with the `--nv` flag
- **Isolate filesystems** with `-C` and selectively expose directories with `--bind`
- **Manage cache** by setting `APPTAINER_CACHEDIR`

### Key commands

| Command | Purpose |
|---------|---------|
| `apptainer pull docker://image:tag` | Download image from registry |
| `apptainer build image.sif recipe.def` | Build image from definition file |
| `apptainer shell image.sif` | Interactive shell in container |
| `apptainer exec image.sif command` | Run single command in container |
| `apptainer run image.sif` | Execute container's runscript |
| `--nv` | Enable GPU passthrough |
| `-C` | Isolate the container from the host (filesystem, PID, IPC, environment) |
| `--bind host:container` | Mount host directory in container |

### What's next?

- Learn [Vim](/tutorials/vim/) for editing files directly on the cluster
- See [Container GPU Jobs](/howto/container-gpu-job/) for batch job examples
- Explore [Apptainer documentation](https://apptainer.org/docs/user/main/) for advanced features
+53 −3
@@ -5,6 +5,21 @@ description: >
  Understanding the job scheduler on DAIC.
---

## What you'll learn

By the end of this tutorial, you'll be able to:
- Submit batch jobs that run on compute nodes
- Request CPUs, memory, and GPUs for your jobs
- Monitor job status and troubleshoot failures
- Use interactive sessions for testing
- Run parameter sweeps with job arrays

**Time**: About 45 minutes

**Prerequisites**: Complete the [Bash Basics](/tutorials/bash/) tutorial first, or be comfortable with the Linux command line.

---

## What is Slurm?

When you log into DAIC, you land on a **login node**. This is a shared computer where users prepare their work - but you shouldn't run computations here. The actual computing happens on **compute nodes**, powerful machines with GPUs and lots of memory.
@@ -723,8 +738,43 @@ After your first successful run, check `seff` and adjust requests.
| `--output` | `log_%j.out` | Output file |
| `--array` | `1-10` | Job array |

## Summary

You've learned:

| Concept | Key Commands |
|---------|--------------|
| Submit a batch job | `sbatch script.sh` |
| Request interactive session | `salloc --time=1:00:00 --gres=gpu:1 ...` |
| Run on allocated node | `srun python train.py` |
| Check job status | `squeue -u $USER` |
| Cancel a job | `scancel <jobid>` |
| View job history | `sacct -u $USER` |
| Check efficiency | `seff <jobid>` |
| Run parameter sweep | `#SBATCH --array=1-10` |
| Chain jobs | `--dependency=afterok:<jobid>` |

## Exercises

Try these on your own to solidify your understanding:

### Exercise 1: Basic job submission
Create and submit a job that prints your username, hostname, and current date. Check the output.
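
A minimal sketch of such a job is shown below. The script name `hello_job.sh` and the resource values are illustrative assumptions; DAIC may require further options (for example a partition or account) that are not shown in this section.

```shell
# Write the batch script (hello_job.sh is a hypothetical name).
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --time=00:05:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --output=log_%j.out

echo "User: $(whoami)"
echo "Host: $(hostname)"
echo "Date: $(date)"
EOF

# Quick local check: to bash, the #SBATCH lines are ordinary comments.
bash hello_job.sh

# On DAIC, submit and inspect the result with:
#   sbatch hello_job.sh
#   squeue -u $USER
#   cat log_<jobid>.out
```

Running the script with `bash` first catches typos before the job waits in the queue; the hostname printed by the real job should be a compute node, not the login node.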

### Exercise 2: GPU job
Modify the basic job to request a GPU. Add `nvidia-smi` to verify the GPU is available.
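
The modified script might look roughly like the sketch below. The script name and resource values are assumptions for illustration, and `--gres=gpu:1` requests one GPU of any type; DAIC-specific options such as a partition or account are not shown here.

```shell
# Write the GPU batch script (gpu_job.sh is a hypothetical name).
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=gpu-check
#SBATCH --time=00:05:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
#SBATCH --gres=gpu:1
#SBATCH --output=log_%j.out

echo "Host: $(hostname)"
# Lists the GPUs visible to this job; it should report the one you requested.
nvidia-smi
EOF

# On DAIC, submit with:
#   sbatch gpu_job.sh
```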

### Exercise 3: Resource tuning
Submit a job, then use `seff` to check its efficiency. Was your resource request appropriate?

### Exercise 4: Job array
Create a job array that runs 5 tasks. Each task should print its array task ID.
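
One way to sketch this is shown below. Slurm sets `SLURM_ARRAY_TASK_ID` for each task of an array job; the script name and the output file pattern (`%A` for the array job ID, `%a` for the task index) are illustrative choices.

```shell
# Write the array job script (array_job.sh is a hypothetical name).
cat > array_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=array-demo
#SBATCH --time=00:05:00
#SBATCH --array=1-5
#SBATCH --output=log_%A_%a.out

echo "Array task ID: ${SLURM_ARRAY_TASK_ID}"
EOF

# Local check of a single task: set the variable by hand, as Slurm would.
SLURM_ARRAY_TASK_ID=3 bash array_job.sh

# On DAIC, submit all 5 tasks with:
#   sbatch array_job.sh
```

In a real parameter sweep, the task ID is typically used to pick an input file or a hyperparameter value for that task.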

### Exercise 5: Dependencies
Submit two jobs where the second depends on the first completing successfully.
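
A possible sketch, assuming two small scripts named `first.sh` and `second.sh` (hypothetical names): `sbatch --parsable` prints only the job ID, which is then passed to `--dependency=afterok:<jobid>` for the second submission. The guard lets the snippet run harmlessly on a machine without Slurm.

```shell
# Two tiny jobs (file names are arbitrary).
cat > first.sh <<'EOF'
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --output=first_%j.out
echo "first job done"
EOF

cat > second.sh <<'EOF'
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --output=second_%j.out
echo "second job ran after the first succeeded"
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints "jobid" (or "jobid;cluster"); keep only the ID part.
    first_id=$(sbatch --parsable first.sh)
    first_id=${first_id%%;*}

    # The second job stays pending until the first finishes with exit code 0.
    sbatch --dependency=afterok:"${first_id}" second.sh
else
    echo "sbatch not found; run this on the DAIC login node."
fi
```

While the first job is still running, `squeue -u $USER` should list the second job as pending with a dependency reason.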

## Next steps

- [First Job](/quickstart/first-job/) - Hands-on first submission
- [Modules](/docs/software/modules/) - Loading software
- [Apptainer Tutorial](/tutorials/apptainer/) - Using containers
- [Apptainer Tutorial](/tutorials/apptainer/) - Package your environment in containers
- [Vim Tutorial](/tutorials/vim/) - Edit files efficiently on the cluster
- [Modules](/docs/software/modules/) - Load pre-installed software
+64 −3
@@ -5,6 +5,21 @@ description: >
  Learn the Vim text editor for efficient file editing on DAIC.
---

## What you'll learn

By the end of this tutorial, you'll be able to:
- Open, edit, save, and quit files in Vim
- Navigate efficiently without touching the mouse
- Delete, copy, and paste text
- Search and replace
- Edit Slurm scripts and Python code on the cluster

**Time**: About 30 minutes

**Prerequisites**: Basic familiarity with the command line. Complete [Bash Basics](/tutorials/bash/) first if you're new to Linux.

---

## Why learn Vim?

When working on DAIC, you'll often need to edit files directly on the cluster - tweaking a batch script, fixing a bug in your code, or checking a configuration file. Since DAIC is accessed via SSH (no graphical interface), you need a terminal-based text editor.
@@ -724,8 +739,54 @@ Then gradually add new commands as the basic ones become automatic.
| `cw` | Change word |
| `.` | Repeat last change |

## Summary

You've learned the essential Vim workflow:

| Task | Commands |
|------|----------|
| Open a file | `vim filename` |
| Enter insert mode | `i`, `a`, `o` |
| Return to normal mode | `Esc` |
| Save | `:w` |
| Quit | `:q` or `:wq` |
| Navigate | `hjkl`, `w`, `b`, `gg`, `G` |
| Delete | `x`, `dd`, `dw` |
| Copy/paste | `yy`, `p` |
| Undo/redo | `u`, `Ctrl+r` |
| Search | `/pattern`, `n`, `N` |
| Replace | `:%s/old/new/g` |
| Select lines | `V` + movement |

## Exercises

Practice these tasks to build muscle memory:

### Exercise 1: Basic editing
Create a new file, add three lines of text, save and quit. Then reopen it and verify your changes.

### Exercise 2: Navigation
Open a Python file and practice: go to the end (`G`), go to the beginning (`gg`), jump by words (`w`, `b`), and go to a specific line (`10G`).

### Exercise 3: Delete and undo
Open a file, delete a line (`dd`), undo (`u`), delete a word (`dw`), undo again.

### Exercise 4: Copy and paste
Copy a line (`yy`), move to a new location, paste it (`p`). Then try with multiple lines using `V`.

### Exercise 5: Search and replace
Open a file and search for a word (`/word`). Then replace all occurrences of one word with another (`:%s/old/new/g`).

### Exercise 6: Real task
Edit a Slurm batch script: change the time limit, add a new `#SBATCH` directive, and save.

## Keep learning

- Run `vimtutor` for a 30-minute interactive tutorial
- Practice daily - even small edits help build muscle memory
- Add one new command to your repertoire each week

## Next steps

- Practice with `vimtutor`
- [Slurm Tutorial](/tutorials/slurm/) - Submit jobs
- [Apptainer Tutorial](/tutorials/apptainer/) - Use containers
- [Slurm Tutorial](/tutorials/slurm/) - Submit jobs to the cluster
- [Apptainer Tutorial](/tutorials/apptainer/) - Package your environment