Bash basics

Essential command-line skills for working on DAIC.

What you’ll learn

By the end of this tutorial, you’ll be able to:

  • Navigate the DAIC filesystem confidently
  • Create, copy, move, and delete files and directories
  • Redirect command output (stdout/stderr) to files
  • Find files and search their contents
  • Write simple shell scripts to automate tasks

Time: About 30 minutes

Prerequisites: You should have logged into DAIC at least once.

The scenario

You’re a researcher who just got access to DAIC. You need to:

  1. Set up a project directory
  2. Organize your files
  3. Find things when you forget where you put them
  4. Automate repetitive tasks with scripts

Let’s learn the commands you need by actually doing these tasks.

Part 1: Finding your way around

When you log into DAIC, you arrive at your home directory. But where exactly are you, and what’s here?

Where am I?

The pwd command (print working directory) shows your current location:

$ pwd
/home/netid01

You’re in your home directory. On DAIC, this is a small space (5 MB) meant only for configuration files - not for your actual work.

What’s here?

The ls command lists what’s in the current directory:

$ ls
linuxhome

Not much! Let’s see more detail with ls -la:

$ ls -la
total 12
drwxr-xr-x   3 netid01 netid01 4096 Mar 20 09:00 .
drwxr-xr-x 100 root    root    4096 Mar 20 08:00 ..
-rw-r--r--   1 netid01 netid01  220 Mar 20 09:00 .bashrc
lrwxrwxrwx   1 netid01 netid01   45 Mar 20 09:00 linuxhome -> /tudelft.net/staff-homes-linux/n/netid01

Now we see hidden files (starting with .) and details about each file. The linuxhome entry has an arrow - it’s a symbolic link pointing to your larger personal storage.

Moving around

The cd command (change directory) moves you to a different location:

$ cd linuxhome
$ pwd
/home/netid01/linuxhome

Some useful shortcuts:

$ cd ..        # Go up one level
$ cd ~         # Go to home directory
$ cd -         # Go back to previous directory
$ cd           # Also goes to home directory
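
The shortcuts above can be tried safely in a scratch directory. The following is a minimal sketch; the directory names (project, data) and the use of mktemp are assumptions for illustration, not DAIC paths:

```shell
# Work in a throwaway directory so nothing real is touched.
start_dir=$(mktemp -d)
mkdir -p "$start_dir/project/data"

cd "$start_dir/project"        # first location
cd data                        # second location
cd - > /dev/null               # back to .../project (cd - prints the path; silenced here)
back_in=$(pwd)

cd ..                          # up one level, to the scratch root
up_one=$(pwd)

echo "After 'cd -':  $back_in"
echo "After 'cd ..': $up_one"
```

Note that cd - only remembers one previous location, so repeating it toggles between the same two directories.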

Exercise 1: Explore the filesystem

Try these commands and observe what happens:

$ cd /tudelft.net/staff-umbrella
$ ls
$ cd ~
$ pwd

Part 2: Understanding DAIC storage

Before we create files, let’s understand where to put them. DAIC has several storage locations:

Location                                 Purpose                      Size
/home/<netid>                            Config files only            5 MB
~/linuxhome                              Personal files, code         ~8 GB
/tudelft.net/staff-umbrella/<project>    Project data and datasets    Varies

Rule of thumb:

  • Code and small files → linuxhome or umbrella
  • Large datasets → umbrella
  • Never put large files in /home
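
To see how much space a directory takes up, du can total its size. The sketch below runs on a throwaway directory (created with mktemp, an assumption for illustration); on DAIC you would point du at a path such as ~ or ~/linuxhome instead:

```shell
scratch=$(mktemp -d)
head -c 1048576 /dev/zero > "$scratch/sample.bin"   # create a 1 MiB sample file

# -s prints one total for the directory, -h uses human-readable units
usage=$(du -sh "$scratch" | cut -f1)
echo "Scratch directory uses: $usage"
```

The exact figure du reports depends on the filesystem, so treat it as an estimate rather than an exact byte count.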

Let’s navigate to where you’ll do most of your work:

$ cd /tudelft.net/staff-umbrella
$ ls

You should see one or more project directories. For this tutorial, let’s assume you have access to a project called myproject:

$ cd myproject
$ pwd
/tudelft.net/staff-umbrella/myproject

Part 3: Creating a project structure

Now let’s set up a workspace for a machine learning project.

Creating directories

The mkdir command creates directories:

$ mkdir ml-experiment
$ cd ml-experiment
$ pwd
/tudelft.net/staff-umbrella/myproject/ml-experiment

mkdir accepts several paths at once. The -p option creates any missing parent directories (here, data) and does not complain if a directory already exists:

$ mkdir -p data/raw data/processed models results logs
$ ls
data  logs  models  results
$ ls data
processed  raw

We’ve created this structure:

ml-experiment/
├── data/
│   ├── raw/
│   └── processed/
├── models/
├── results/
└── logs/

Creating files

Create a simple file with echo and redirection:

$ echo "# ML Experiment" > README.md
$ cat README.md
# ML Experiment

The > operator writes output to a file, overwriting any existing content.

Output redirection

Every command has two output channels:

  • Standard output (stdout) - normal output (file descriptor 1)
  • Standard error (stderr) - error messages (file descriptor 2)

By default, both print to your terminal. Redirection lets you send them elsewhere.

Redirect stdout to a file:

$ echo "Hello" > output.txt       # Overwrite file
$ echo "World" >> output.txt      # Append to file
$ cat output.txt
Hello
World

Redirect stderr to a file:

$ ls /nonexistent 2> errors.txt   # Errors go to file
$ cat errors.txt
ls: cannot access '/nonexistent': No such file or directory

Redirect both stdout and stderr:

$ python train.py > output.txt 2>&1    # Both to same file
$ python train.py &> output.txt        # Shorthand (bash-specific, not POSIX sh)

The 2>&1 syntax means “redirect file descriptor 2 (stderr) to wherever file descriptor 1 (stdout) is going.”
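
Because redirections are processed left to right, the position of 2>&1 matters. The sketch below illustrates this with a hypothetical helper function, emit, that writes one line to each channel; the file names are also just for illustration:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A small function that writes one line to each channel.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout is pointed at the file first, then stderr follows it: both land in the file.
emit > both.txt 2>&1

# Reversed order: stderr is pointed at wherever stdout currently goes (the terminal),
# and only afterwards is stdout sent to the file. The file receives stdout only.
emit 2>&1 > stdout_only.txt

wc -l both.txt stdout_only.txt
```

If a log file is unexpectedly missing error messages, a misplaced 2>&1 is a likely cause.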

Separate files for stdout and stderr:

$ python train.py > results.txt 2> errors.txt

Discard output entirely:

$ command > /dev/null 2>&1        # Discard everything
$ command 2> /dev/null            # Discard only errors

Exercise 2: Build your own structure

Create a directory structure for a different project:

$ cd /tudelft.net/staff-umbrella/myproject
$ mkdir -p nlp-project/{data,src,notebooks,outputs}
$ ls nlp-project

Then create a README:

$ echo "# NLP Project" > nlp-project/README.md
$ echo "Author: $(whoami)" >> nlp-project/README.md
$ cat nlp-project/README.md
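
The {data,src,notebooks,outputs} syntax in the exercise is bash brace expansion: the shell rewrites it into separate words before mkdir runs. A quick way to see what a command will receive is to put echo in front of it. The run_ names below are hypothetical:

```shell
# Brace expansion happens before the command runs; echo shows the resulting words.
expanded=$(echo nlp-project/{data,src,notebooks,outputs})
echo "$expanded"

# Numeric sequences work too.
runs=$(echo run_{1..3})
echo "$runs"
```

This prints nlp-project/data nlp-project/src nlp-project/notebooks nlp-project/outputs, followed by run_1 run_2 run_3. There must be no spaces inside the braces, otherwise bash treats them as ordinary text.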

Part 4: Working with files

Let’s create some actual code to work with.

Creating a Python script

We’ll use cat with a “here document” to create a multi-line file:

$ cd /tudelft.net/staff-umbrella/myproject/ml-experiment
$ cat > train.py << 'EOF'
#!/usr/bin/env python3
"""Simple training script."""

import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--lr', type=float, default=0.001)
    args = parser.parse_args()

    print(f"Training for {args.epochs} epochs with lr={args.lr}")
    for epoch in range(args.epochs):
        print(f"Epoch {epoch+1}/{args.epochs}")
    print("Done!")

if __name__ == '__main__':
    main()
EOF

Verify the file was created:

$ cat train.py
$ ls -l train.py
-rw-r--r-- 1 netid01 netid01 423 Mar 20 10:35 train.py
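
The quotes around 'EOF' in the here document above matter: they tell bash to write the body literally. Without them, the shell would expand $variables and $(commands) inside the body. A minimal sketch of the difference, using a hypothetical variable called name:

```shell
scratch=$(mktemp -d)
name="experiment1"    # hypothetical variable for illustration

# Quoted delimiter: the body is written literally, $name is NOT expanded.
cat > "$scratch/literal.txt" << 'EOF'
Name: $name
EOF

# Unquoted delimiter: the shell expands $name before writing.
cat > "$scratch/expanded.txt" << EOF
Name: $name
EOF

cat "$scratch/literal.txt" "$scratch/expanded.txt"
```

The first file contains the text Name: $name, the second contains Name: experiment1. Quote the delimiter when writing source code (such as Python with f-strings) so nothing is altered by the shell.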

Copying files

The cp command copies files:

$ cp train.py train_backup.py
$ ls
data  logs  models  README.md  results  train_backup.py  train.py

Copy entire directories with -r (recursive):

$ cp -r data data_backup
$ ls
data  data_backup  logs  models  README.md  results  train_backup.py  train.py

Moving and renaming

The mv command moves files. It’s also how you rename:

$ mv train_backup.py old_train.py      # Rename
$ mv old_train.py models/              # Move to models directory
$ ls models
old_train.py

Deleting files

The rm command removes files:

$ rm models/old_train.py
$ ls models

Delete directories with -r:

$ rm -r data_backup
$ ls
data  logs  models  README.md  results  train.py
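
rm does not move files to a trash folder, so it is worth checking what a wildcard matches before deleting. One possible habit, sketched here in a scratch directory with made-up file names, is to preview the command with echo first:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch run1.tmp run2.tmp notes.txt

# Preview: echo prints the expanded command without deleting anything.
echo rm -- *.tmp

# Once the list looks right, run the real command.
rm -- *.tmp
ls
```

The -- marks the end of options, which protects against file names that begin with a dash. For interactive use, rm -i asks for confirmation before each deletion.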

Exercise 3: File operations

Practice by doing the following:

  1. Copy train.py to evaluate.py
  2. Create a src directory
  3. Move both Python files into src
  4. Verify with ls src

One way to do it:

$ cp train.py evaluate.py
$ mkdir src
$ mv train.py evaluate.py src/
$ ls src
evaluate.py  train.py

Part 5: Viewing and editing files

Viewing file contents

Several commands let you view files:

$ cat src/train.py              # Print entire file
$ head -n 5 src/train.py        # First 5 lines
$ tail -n 5 src/train.py        # Last 5 lines
$ less src/train.py             # Scrollable viewer (q to quit)

For log files that are being written, tail -f shows new lines as they appear:

$ tail -f logs/training.log     # Watch live (Ctrl+C to stop)

Counting lines

$ wc -l src/train.py
18 src/train.py

Editing files

For quick edits, use nano (beginner-friendly):

$ nano src/train.py

  • Type to insert text
  • Ctrl+O to save
  • Ctrl+X to exit

For more power, use vim (see our Vim tutorial):

$ vim src/train.py

Part 6: Finding things

As your project grows, you’ll need to find files and search their contents.

Finding files by name

The find command searches for files:

$ find . -name "*.py"
./src/train.py
./src/evaluate.py

The . means “start from current directory”. Common options:

$ find . -name "*.py"                    # Files matching pattern
$ find . -type d -name "data*"           # Directories only
$ find . -type f -mtime -7               # Files modified in last 7 days
$ find . -size +100M                     # Files larger than 100MB
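
The find options can be combined. Tests written side by side are ANDed, and -o expresses OR. The sketch below uses a scratch directory with hypothetical file names to show both:

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/src" "$scratch/data"
touch "$scratch/src/train.py" "$scratch/src/notes.txt" "$scratch/data/raw.csv"

# Tests are ANDed by default: regular files AND name ending in .py
py_files=$(find "$scratch" -type f -name "*.py")
echo "$py_files"

# -o means OR; the escaped parentheses group the two name tests
code_or_data=$(find "$scratch" -type f \( -name "*.py" -o -name "*.csv" \) | sort)
echo "$code_or_data"
```

Always quote the pattern ("*.py"), otherwise the shell may expand it against the current directory before find sees it.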

Searching inside files

The grep command searches file contents:

$ grep "epochs" src/train.py
    parser.add_argument('--epochs', type=int, default=10)
    print(f"Training for {args.epochs} epochs with lr={args.lr}")
    for epoch in range(args.epochs):

Search every file under a directory recursively:

$ grep -r "import" src/
src/train.py:import argparse
src/evaluate.py:import argparse

Useful options:

$ grep -n "epochs" src/train.py    # Show line numbers
$ grep -i "EPOCH" src/train.py     # Case-insensitive
$ grep -l "import" src/*.py        # Just show filenames
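
Two more grep options that are often useful are -c (count matching lines) and -v (invert the match). A minimal sketch on a small made-up file, demo.py:

```shell
scratch=$(mktemp -d)
printf 'import argparse\nprint("start")\nprint("done")\n' > "$scratch/demo.py"

# -c counts matching lines instead of printing them
count=$(grep -c "print" "$scratch/demo.py")
echo "Lines containing print: $count"

# -v inverts the match: show lines that do NOT contain the pattern
others=$(grep -v "print" "$scratch/demo.py")
echo "$others"
```

Here the count is 2 and the only non-matching line is import argparse.
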

Exercise 4: Find and search

Practice by doing the following:

  1. Find all files modified in the last day:

    $ find . -mtime -1

  2. Search for all occurrences of “print” in your Python files:

    $ grep -n "print" src/*.py

  3. Find all directories named “data”:

    $ find . -type d -name "data"

Part 7: Automating with scripts

When you find yourself typing the same commands repeatedly, it’s time to write a script.

Your first script

Create a script that sets up a new experiment:

$ cat > setup_experiment.sh << 'EOF'
#!/bin/bash
# Setup script for new experiments

# Check if experiment name was provided
if [ -z "$1" ]; then
    echo "Usage: ./setup_experiment.sh <experiment_name>"
    exit 1
fi

EXPERIMENT_NAME=$1
BASE_DIR="/tudelft.net/staff-umbrella/myproject"

echo "Creating experiment: $EXPERIMENT_NAME"

# Create directory structure
mkdir -p "$BASE_DIR/$EXPERIMENT_NAME"/{data,models,results,logs}

# Create a README
cat > "$BASE_DIR/$EXPERIMENT_NAME/README.md" << README
# $EXPERIMENT_NAME

Created: $(date)
Author: $(whoami)

## Description
TODO: Add description

## Results
TODO: Add results
README

echo "Done! Experiment created at $BASE_DIR/$EXPERIMENT_NAME"
ls -l "$BASE_DIR/$EXPERIMENT_NAME"
EOF

Make it executable

Before you can run a script, you need to make it executable:

$ chmod +x setup_experiment.sh
$ ls -l setup_experiment.sh
-rwxr-xr-x 1 netid01 netid01 612 Mar 20 11:00 setup_experiment.sh

The x in the permissions means “executable”.
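
A script can also check the executable bit itself with the [ -x FILE ] test. The sketch below creates a hypothetical hello.sh in a scratch directory and checks it before and after chmod:

```shell
scratch=$(mktemp -d)
script="$scratch/hello.sh"    # hypothetical script for illustration
printf '#!/bin/bash\necho "hello from script"\n' > "$script"

# [ -x FILE ] is true only if the file is executable by the current user
if [ -x "$script" ]; then before="yes"; else before="no"; fi

chmod u+x "$script"           # u+x: add execute permission for the owner only

if [ -x "$script" ]; then after="yes"; else after="no"; fi
echo "Executable before chmod: $before, after: $after"

output=$("$script")           # run the script and capture what it prints
echo "$output"
```

chmod +x grants execute permission to owner, group, and others (subject to your umask), while chmod u+x limits it to you.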

Run the script

$ ./setup_experiment.sh bert-finetuning
Creating experiment: bert-finetuning
Done! Experiment created at /tudelft.net/staff-umbrella/myproject/bert-finetuning
total 4
drwxr-xr-x 2 netid01 netid01 4096 Mar 20 11:00 data
drwxr-xr-x 2 netid01 netid01 4096 Mar 20 11:00 logs
drwxr-xr-x 2 netid01 netid01 4096 Mar 20 11:00 models
-rw-r--r-- 1 netid01 netid01  142 Mar 20 11:00 README.md
drwxr-xr-x 2 netid01 netid01 4096 Mar 20 11:00 results

Script building blocks

Here are patterns you’ll use often:

Variables:

NAME="experiment1"
echo "Working on $NAME"

Conditionals:

if [ -f "data.csv" ]; then
    echo "Data file exists"
else
    echo "Data file not found!"
    exit 1
fi

Loops:

for file in data/*.csv; do
    echo "Processing $file"
    python process.py "$file"
done
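
One pitfall with this loop: if no file matches, bash leaves the pattern unexpanded and the loop body runs once with the literal text data/*.csv. A common guard, sketched here with an intentionally empty scratch directory, is to skip anything that does not exist:

```shell
scratch=$(mktemp -d)
mkdir "$scratch/data"          # empty on purpose: no .csv files yet

processed=0
for file in "$scratch"/data/*.csv; do
    # With no matches, $file is the literal string ".../data/*.csv".
    # Skip anything that is not an existing file.
    [ -e "$file" ] || continue
    echo "Processing $file"
    processed=$((processed + 1))
done
echo "Files processed: $processed"
```

With the guard in place the loop reports 0 files instead of passing a nonexistent path to the processing command.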

Command substitution:

TODAY=$(date +%Y-%m-%d)
echo "Running on $TODAY"
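
Command substitution combines well with redirection, for example to give each day's log its own file. This is a sketch only: the train_ naming scheme is an assumption, and the braces stand in for a real training command:

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/logs"

TODAY=$(date +%Y-%m-%d)
LOG_FILE="$scratch/logs/train_$TODAY.log"    # hypothetical naming scheme

# Both stdout and stderr of the grouped commands go to the dated log file
{ echo "Training started"; echo "a warning" >&2; } > "$LOG_FILE" 2>&1

cat "$LOG_FILE"
```

Using > starts a fresh log on each run; switch to >> if several runs on the same day should share one file.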

Exercise 5: Write a cleanup script

Create a script that removes old log files:

$ cat > cleanup_logs.sh << 'EOF'
#!/bin/bash
# Remove log files older than 7 days

LOG_DIR="${1:-.}"  # Use first argument, or current directory

echo "Cleaning logs in $LOG_DIR"

# Find and remove old logs
find "$LOG_DIR" -name "*.log" -mtime +7 -exec rm -v {} \;

echo "Done!"
EOF

$ chmod +x cleanup_logs.sh
$ ./cleanup_logs.sh logs/
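
Before letting a script like this delete anything, it can help to do a dry run that only lists the matches. The sketch below works in a scratch directory and uses touch -t to give one hypothetical log file an old timestamp:

```shell
scratch=$(mktemp -d)
touch "$scratch/new.log"
# touch -t sets an explicit timestamp (here 1 Jan 2020), making this file "old"
touch -t 202001010000 "$scratch/old.log"

# Dry run: -print only lists what would be removed
would_delete=$(find "$scratch" -name "*.log" -mtime +7 -print)
echo "Would delete: $would_delete"

# Real run, once the list looks right
find "$scratch" -name "*.log" -mtime +7 -exec rm -v {} \;
ls "$scratch"
```

Only old.log is listed and removed; new.log is younger than 7 days and stays in place.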

Part 8: Useful shortcuts and tips

Tab completion

Press Tab to autocomplete:

  • Filenames
  • Directory names
  • Commands

$ cd /tudelft.net/staff-umb<TAB>
$ cd /tudelft.net/staff-umbrella/

Command history

$ history              # Show recent commands
$ !42                  # Run command number 42
$ !!                   # Run the last command
$ !grep                # Run the last command starting with "grep"

Press Ctrl+R to search history interactively.

Keyboard shortcuts

Shortcut    Action
Ctrl+C      Cancel current command
Ctrl+D      Exit shell / end input
Ctrl+L      Clear screen
Ctrl+A      Move to start of line
Ctrl+E      Move to end of line
Ctrl+U      Delete to start of line
Ctrl+K      Delete to end of line

Aliases

Create shortcuts for common commands. Add to ~/.bashrc:

alias ll='ls -lah'
alias umbrella='cd /tudelft.net/staff-umbrella/myproject'
alias myjobs='squeue -u $USER'    # not "jobs", which is a bash builtin

Then reload:

$ source ~/.bashrc
$ umbrella    # Now this works!
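
Aliases are plain text substitutions, so they cannot place an argument in the middle of a command. When a shortcut needs an argument, a shell function is the usual alternative. The helper below, mkcd, is a hypothetical example that could also live in ~/.bashrc:

```shell
# Hypothetical helper: create a directory (and any parents), then move into it.
mkcd() {
    mkdir -p "$1" && cd "$1"
}

# Try it in a scratch directory.
scratch=$(mktemp -d)
mkcd "$scratch/new-experiment/data"
pwd
```

Inside the function, $1 is the first argument, just as in a script. The && ensures cd only runs if mkdir succeeded.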

Summary

You’ve learned to:

Task                      Command
See current location      pwd
List files                ls -la
Change directory          cd path
Create directory          mkdir -p path
Create/overwrite file     echo "text" > file
Append to file            echo "text" >> file
Redirect stderr           command 2> errors.txt
Redirect both             command > out.txt 2>&1
View file                 cat file or less file
Copy                      cp source dest
Move/rename               mv source dest
Delete                    rm file or rm -r dir
Find files                find . -name "*.py"
Search contents           grep "pattern" file
Make script executable    chmod +x script.sh

What’s next?

Now that you’re comfortable with the command line:

  1. Data Transfer - Move data to and from DAIC
  2. Slurm Tutorial - Learn to submit jobs to the cluster
  3. Vim Tutorial - Edit files more efficiently
  4. Shell Setup - Configure your environment

Quick reference

For more advanced shell customization, see Shell Setup.