Commit d568eba7 authored by Sören Wacker

add I/O redirection section, remove file transfer (already in docs)

parent dfe99d19
+63 −79
@@ -10,8 +10,8 @@ description: >
By the end of this tutorial, you'll be able to:
- Navigate the DAIC filesystem confidently
- Create, copy, move, and delete files and directories
- Redirect command output (stdout/stderr) to files
- Find files and search their contents
- Transfer data to and from the cluster
- Write simple shell scripts to automate tasks

**Time**: About 30 minutes
@@ -22,9 +22,9 @@ By the end of this tutorial, you'll be able to:

You're a researcher who just got access to DAIC. You need to:
1. Set up a project directory
2. Upload your code and data
3. Organize your files
4. Find things when you forget where you put them
2. Organize your files
3. Find things when you forget where you put them
4. Automate repetitive tasks with scripts

Let's learn the commands you need by actually doing these tasks.

@@ -184,15 +184,58 @@ $ cat README.md
# ML Experiment
```

The `>` operator writes output to a file (overwriting if it exists). Use `>>` to append instead:
The `>` operator writes output to a file, **overwriting** any existing content.

### Output redirection

Every command has two output channels:
- **Standard output (stdout)** - normal output (file descriptor 1)
- **Standard error (stderr)** - error messages (file descriptor 2)

By default, both print to your terminal. Redirection lets you send them elsewhere.

**Redirect stdout to a file:**
```shell-session
$ echo "Started: $(date)" >> README.md
$ cat README.md
# ML Experiment
Started: Fri Mar 20 10:30:00 CET 2026
$ echo "Hello" > output.txt       # Overwrite file
$ echo "World" >> output.txt      # Append to file
$ cat output.txt
Hello
World
```

**Redirect stderr to a file:**
```shell-session
$ ls /nonexistent 2> errors.txt   # Errors go to file
$ cat errors.txt
ls: cannot access '/nonexistent': No such file or directory
```

**Redirect both stdout and stderr:**
```shell-session
$ python train.py > output.txt 2>&1    # Both to same file
$ python train.py &> output.txt        # Bash shorthand (not POSIX sh)
```

The `2>&1` syntax means "redirect file descriptor 2 (stderr) to wherever file descriptor 1 (stdout) is going."
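
Redirections are processed left to right, so the position of `2>&1` matters. The following is a minimal sketch rather than part of the original tutorial; the function `emit` and the file names `both.txt` and `only_stdout.txt` are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical helper that writes one line to each output channel.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout is pointed at the file first, then stderr follows it:
# both lines land in both.txt.
emit > both.txt 2>&1

# stderr is pointed at the current stdout (the terminal) first,
# then only stdout is moved to the file.
emit 2>&1 > only_stdout.txt
```

After running this, `both.txt` should hold both lines, while `only_stdout.txt` holds only `to stdout`; the stderr line from the second call still appears on the terminal.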

**Separate files for stdout and stderr:**
```shell-session
$ python train.py > results.txt 2> errors.txt
```

**Discard output entirely:**
```shell-session
$ command > /dev/null 2>&1        # Discard everything
$ command 2> /dev/null            # Discard only errors
```
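
Discarding output is most useful when only a command's exit status matters. A minimal sketch, assuming a hypothetical log file `train.log` that is created here just for the example:

```shell
#!/bin/bash
# Create a small hypothetical log file to search.
printf 'epoch 1 ok\nepoch 2 ok\n' > train.log

# Silence grep's output and errors; only its exit status is used.
if grep "ERROR" train.log > /dev/null 2>&1; then
    status="errors found"
else
    status="no errors"
fi
echo "$status"
```

Because the sample log contains no `ERROR` line, `grep` exits with a non-zero status and the script reports `no errors`.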

{{% alert title="Why this matters for HPC" color="info" %}}
Slurm jobs capture stdout and stderr to files. Understanding redirection helps you:
- Debug failed jobs by checking error output
- Keep logs clean by separating normal output from errors
- Combine outputs when needed with `2>&1`
{{% /alert %}}
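
To illustrate the points above, a batch script might send the two streams to separate files. This is a sketch only: `--output`, `--error`, and the `%j` job-ID placeholder are standard `sbatch` features, but the job name, the file names, and the `run_step` function are hypothetical, and nothing here reflects DAIC-specific settings.

```shell
#!/bin/bash
#SBATCH --job-name=redirect-demo     # hypothetical job name
#SBATCH --output=demo_%j.out         # stdout of the job; %j expands to the job ID
#SBATCH --error=demo_%j.err          # stderr of the job

# Hypothetical stand-in for a real workload such as `python train.py`.
run_step() {
    echo "step finished"
    echo "warning: example message" >&2
}

# Inside the job, individual commands can still be redirected as usual.
run_step > step.log 2>&1
```

The `#SBATCH` lines are ordinary comments to bash, so the script also runs outside Slurm, which can be handy for checking the redirection logic before submitting.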

### Exercise 2: Build your own structure

Create a directory structure for a different project:
@@ -450,68 +493,7 @@ $ grep -l "import" src/*.py # Just show filenames
The `find . -mtime -1` command should list files you recently created. The `grep -n` command shows line numbers where "print" appears. The directory search should show `./data` (and any other data directories you created).
{{% /alert %}}

## Part 7: Transferring files

You'll often need to move data between your local computer and DAIC.

### Uploading to DAIC (from your local machine)

Open a terminal on your local computer:

```shell-session
$ scp mydata.csv netid01@daic01.hpc.tudelft.nl:/tudelft.net/staff-umbrella/myproject/ml-experiment/data/raw/
```

Upload an entire directory:

```shell-session
$ scp -r ./local_data/ netid01@daic01.hpc.tudelft.nl:/tudelft.net/staff-umbrella/myproject/ml-experiment/data/
```

### Downloading from DAIC (to your local machine)

```shell-session
$ scp netid01@daic01.hpc.tudelft.nl:/tudelft.net/staff-umbrella/myproject/ml-experiment/results/output.csv ./
```

### Using rsync for large transfers

For large files or directories, `rsync` is better - it only copies changed files and, with the `--partial` option, can resume interrupted transfers:

```shell-session
$ rsync -avz --progress ./local_data/ netid01@daic01.hpc.tudelft.nl:/tudelft.net/staff-umbrella/myproject/data/
```

Options explained:
- `-a` = archive mode (preserves permissions, timestamps)
- `-v` = verbose (show what's happening)
- `-z` = compress during transfer
- `--progress` = show progress bar

### Exercise 5: Practice transfer

If you have a small file on your local machine, try uploading it:

```shell-session
$ echo "test data" > /tmp/test.txt
$ scp /tmp/test.txt netid01@daic01.hpc.tudelft.nl:~/linuxhome/
```

Then on DAIC:
```shell-session
$ cat ~/linuxhome/test.txt
test data
```

{{% alert title="Check your work" color="info" %}}
After the `scp` command, you should see:
```
test.txt                      100%   10     0.0KB/s   00:00
```
On DAIC, `cat ~/linuxhome/test.txt` should display "test data".
{{% /alert %}}

## Part 8: Automating with scripts
## Part 7: Automating with scripts

When you find yourself typing the same commands repeatedly, it's time to write a script.

@@ -617,7 +599,7 @@ TODAY=$(date +%Y-%m-%d)
echo "Running on $TODAY"
```

### Exercise 6: Write a cleanup script
### Exercise 5: Write a cleanup script

Create a script that removes old log files:

@@ -649,7 +631,7 @@ $ ls -l cleanup_logs.sh
The `x` in the permissions confirms it's executable. When run, it prints "Cleaning logs in logs/" and "Done!" (plus any files it removes).
{{% /alert %}}

## Part 9: Useful shortcuts and tips
## Part 8: Useful shortcuts and tips

### Tab completion

@@ -712,24 +694,26 @@ You've learned to:
| List files | `ls -la` |
| Change directory | `cd path` |
| Create directory | `mkdir -p path` |
| Create file | `echo "text" > file` |
| Create/overwrite file | `echo "text" > file` |
| Append to file | `echo "text" >> file` |
| Redirect stderr | `command 2> errors.txt` |
| Redirect both | `command > out.txt 2>&1` |
| View file | `cat file` or `less file` |
| Copy | `cp source dest` |
| Move/rename | `mv source dest` |
| Delete | `rm file` or `rm -r dir` |
| Find files | `find . -name "*.py"` |
| Search contents | `grep "pattern" file` |
| Upload to DAIC | `scp file user@daic01:path` |
| Download from DAIC | `scp user@daic01:path file` |
| Make script executable | `chmod +x script.sh` |

## What's next?

Now that you're comfortable with the command line:

1. [Slurm Tutorial](/tutorials/slurm/) - Learn to submit jobs to the cluster
2. [Vim Tutorial](/tutorials/vim/) - Edit files more efficiently
3. [Shell Setup](/quickstart/shell-setup/) - Configure your environment
1. [Data Transfer](/howto/data-transfer-quick/) - Move data to and from DAIC
2. [Slurm Tutorial](/tutorials/slurm/) - Learn to submit jobs to the cluster
3. [Vim Tutorial](/tutorials/vim/) - Edit files more efficiently
4. [Shell Setup](/quickstart/shell-setup/) - Configure your environment

## Quick reference