6 SQL with SQLite

This chapter demonstrates how to create a small SQLite database and run SQL joins using Python.
SQLite provides a lightweight, file-based engine that is perfect for small, reproducible workflows.

6.1 6.1 Creating a SQLite Database (scripts/make_sqlite.py)

Instead of loading prices_with_vol.csv (which may not exist on a fresh clone), we generate a tiny synthetic dataset so the example always runs successfully.

import sqlite3, pandas as pd, pathlib

# Ensure db folder exists
pathlib.Path("db").mkdir(exist_ok=True)

# Create or overwrite SQLite database
con = sqlite3.connect("db/prices.db")

# ---- Synthetic price table ----
prices_df = pd.DataFrame({
    "ticker": ["AAPL", "AAPL", "MSFT", "MSFT"],
    "date": ["2023-01-01", "2023-01-02", "2023-01-01", "2023-01-02"],
    "adj_close": [125.0, 126.5, 240.0, 242.3],
    "volume": [100000, 120000, 150000, 160000]
})

# Save synthetic data into SQLite table
prices_df.to_sql("prices", con, if_exists="replace", index=False)

# ---- Metadata table ----
meta = pd.DataFrame({
    "ticker": ["AAPL", "MSFT"],
    "sector": ["InfoTech", "InfoTech"]
})

meta.to_sql("meta", con, if_exists="replace", index=False)

prices_df.head()

6.2 6.2 Running an SQL Join

q = """
SELECT p.ticker, p.date, p.adj_close, p.volume, m.sector
FROM prices p
JOIN meta m USING(ticker)
ORDER BY date
LIMIT 5;
"""

pd.read_sql(q, con)

6.3 Explanation

to_sql() writes pandas DataFrames directly into SQLite tables
The SQL join merges prices with meta on the key ticker
The example always works because it uses synthetic data instead of relying on missing CSV files