5 Makefile Essentials

This chapter introduces Make, a lightweight automation tool used to define and run repeatable tasks.
Makefiles help streamline workflows by turning multi-step processes into simple, declarative commands such as make data or make book. This approach improves reproducibility, reduces manual errors, and keeps your project organized.

5.1 A Simple QA (Quality Assurance) Target

This example Makefile target performs a quick data check by reporting the number of rows in a processed CSV and previewing its contents.

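# Quick data check: line count and a short preview of the processed CSV.
# Recipe lines must start with a tab character, not spaces.
.PHONY: qa
qa:
	@echo "Rows:" && wc -l data/processed/prices_with_vol.csv
	@echo "Sample:" && head -n 5 data/processed/prices_with_vol.csv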

Run it from the terminal:

make qa
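
A variation on the target above, offered as a minimal sketch rather than the course's own version: it fails early when the file is missing and reports the row count without the header line. The variable name CSV and the target name qa-strict are hypothetical, and the sketch assumes the CSV has exactly one header line.

```makefile
# Hypothetical variable; the path is the one used by the qa target.
CSV := data/processed/prices_with_vol.csv

.PHONY: qa-strict
qa-strict:
	@# Stop with an error if the processed file has not been built yet.
	@test -f $(CSV) || { echo "Missing: $(CSV)"; exit 1; }
	@# tail -n +2 skips the first line, so the header is not counted.
	@echo "Data rows (header excluded):" && tail -n +2 $(CSV) | wc -l
	@echo "Sample:" && head -n 5 $(CSV)
```

Keeping the path in one variable means that a renamed file only has to be updated in one place.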

5.2 Explanation

This QA target demonstrates key Makefile concepts:

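.PHONY declares qa as a task name rather than a file, so the recipe runs even if a file named qa exists.
wc -l counts the lines in the processed CSV, header line included (quick validation).
head -n 5 previews the first five lines, showing the file's structure and formatting.
The leading @ stops Make from echoing the command itself, which gives cleaner output.
Each recipe line must begin with a tab character; leading spaces cause a "missing separator" error.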

Make allows you to bundle commonly repeated actions into simple targets to improve efficiency and consistency.
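
The effect of the leading @ is easiest to see side by side. The sketch below uses two hypothetical targets, loud and quiet, that are not part of the course Makefile.

```makefile
.PHONY: loud quiet

# Without @, Make prints the command line before running it.
loud:
	echo "hello"

# With @, only the command's output appears.
quiet:
	@echo "hello"
```

Running make loud prints the line echo "hello" followed by hello, while make quiet prints only hello. Because both names are listed under .PHONY, the recipes run on every invocation, even if a file called loud or quiet happens to exist in the directory.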

5.3 A Full Project Makefile

Below is an example Makefile that reflects a typical data-science workflow used in this course:

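# All targets are task names, not files. Recipe lines must start with a tab.
.PHONY: env data db features book test clean

# Install Python dependencies
env:
	pip install -r requirements.txt

# Generate synthetic raw data
data:
	python scripts/make_synth_data.py

# Build the SQLite database
db:
	python scripts/make_sqlite.py

# Construct engineered features
features:
	python scripts/build_features.py

# Render the Quarto book
book:
	quarto render book

# Run the test suite with quiet output
test:
	pytest -q

# Remove generated databases, processed data, and rendered book output
clean:
	rm -rf db/*.db data/processed/* book/_site book/_freeze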
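
The targets above are independent, so each one has to be invoked by name. One way to run them as a pipeline is to declare prerequisites, as in the sketch below. The all target and the ordering (data, then features, then db, then book) are assumptions made for illustration; the source does not state the dependencies between steps.

```makefile
.PHONY: all data features db book

# Assumed order: raw data, then features, then the database, then the book.
all: book

# Make builds a target's prerequisites before running its recipe.
book: db
	quarto render book

db: features
	python scripts/make_sqlite.py

features: data
	python scripts/build_features.py

data:
	python scripts/make_synth_data.py
```

With this layout, make all runs the four steps in the assumed order. Since every target is phony, each step reruns on every invocation.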

5.4 What Each Target Does

5.4.1 `env`

Installs all required Python dependencies based on requirements.txt.
Ensures anyone cloning your repo can rebuild your environment in one command.
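
A common variation installs the dependencies into a project-local virtual environment instead of the active interpreter. This is a sketch under assumptions, not the course's setup: the VENV variable and the .venv directory are hypothetical, and the bin/ path applies to Unix-like systems (Windows uses Scripts/ instead).

```makefile
# Hypothetical location for the virtual environment.
VENV := .venv

.PHONY: env
env:
	# Create the environment, then install into it using its own interpreter.
	python -m venv $(VENV)
	$(VENV)/bin/python -m pip install -r requirements.txt
```

Calling the environment's own interpreter avoids the need to activate the environment inside the recipe, where each line runs in a separate shell.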

5.4.2 `data`

Generates synthetic raw data used throughout the book.
Rerunning this target recreates the input files; they stay identical across runs as long as the script uses a fixed random seed.
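
As written, data is a phony target and regenerates the files on every run. If the script's output path is known, Make can skip the step when the output is already up to date. The sketch below assumes a hypothetical output file, data/raw/prices.csv; substitute whatever file the script actually writes.

```makefile
# Hypothetical output path; replace with the file the script really produces.
RAW := data/raw/prices.csv

# Run the script only when the file is missing or older than the script.
$(RAW): scripts/make_synth_data.py
	python scripts/make_synth_data.py

# Keep the short task name as an alias for the file target.
.PHONY: data
data: $(RAW)
```

This only works if the script creates or updates the file named in RAW. Otherwise Make never sees the target as up to date and reruns the recipe each time.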

5.4.3 `db`

Builds the SQLite database from processed CSVs.
Allows SQL queries in your pipeline to operate on a reproducible dataset.

5.4.4 `features`

Constructs engineered features such as log r
