6  SQL with SQLite

This chapter demonstrates how to create a small SQLite database and run SQL joins using Python.
SQLite provides a lightweight, file-based engine that is perfect for small, reproducible workflows.

6.1 6.1 Creating a SQLite Database (scripts/make_sqlite.py)

Instead of loading prices_with_vol.csv (which may not exist on a fresh clone), we generate a tiny synthetic dataset so the example always runs successfully.

import sqlite3, pandas as pd, pathlib

# Ensure db folder exists
pathlib.Path("db").mkdir(exist_ok=True)

# Create or overwrite SQLite database
con = sqlite3.connect("db/prices.db")

# ---- Synthetic price table ----
prices_df = pd.DataFrame({
    "ticker": ["AAPL", "AAPL", "MSFT", "MSFT"],
    "date": ["2023-01-01", "2023-01-02", "2023-01-01", "2023-01-02"],
    "adj_close": [125.0, 126.5, 240.0, 242.3],
    "volume": [100000, 120000, 150000, 160000]
})

# Save synthetic data into SQLite table
prices_df.to_sql("prices", con, if_exists="replace", index=False)

# ---- Metadata table ----
meta = pd.DataFrame({
    "ticker": ["AAPL", "MSFT"],
    "sector": ["InfoTech", "InfoTech"]
})

meta.to_sql("meta", con, if_exists="replace", index=False)

prices_df.head()

6.2 6.2 Running an SQL Join

q = """
SELECT p.ticker, p.date, p.adj_close, p.volume, m.sector
FROM prices p
JOIN meta m USING(ticker)
ORDER BY date
LIMIT 5;
"""

pd.read_sql(q, con)

6.3 Explanation

  • to_sql() writes pandas DataFrames directly into SQLite tables
  • The SQL join merges prices with meta on the key ticker
  • The example always works because it uses synthetic data instead of relying on missing CSV files