6 SQL with SQLite
This chapter demonstrates how to create a small SQLite database and run SQL joins using Python.
SQLite provides a lightweight, file-based engine that is perfect for small, reproducible workflows.
6.1 6.1 Creating a SQLite Database (scripts/make_sqlite.py)
Instead of loading prices_with_vol.csv (which may not exist on a fresh clone), we generate a tiny synthetic dataset so the example always runs successfully.
import sqlite3, pandas as pd, pathlib
# Ensure db folder exists
pathlib.Path("db").mkdir(exist_ok=True)
# Create or overwrite SQLite database
con = sqlite3.connect("db/prices.db")
# ---- Synthetic price table ----
prices_df = pd.DataFrame({
"ticker": ["AAPL", "AAPL", "MSFT", "MSFT"],
"date": ["2023-01-01", "2023-01-02", "2023-01-01", "2023-01-02"],
"adj_close": [125.0, 126.5, 240.0, 242.3],
"volume": [100000, 120000, 150000, 160000]
})
# Save synthetic data into SQLite table
prices_df.to_sql("prices", con, if_exists="replace", index=False)
# ---- Metadata table ----
meta = pd.DataFrame({
"ticker": ["AAPL", "MSFT"],
"sector": ["InfoTech", "InfoTech"]
})
meta.to_sql("meta", con, if_exists="replace", index=False)
prices_df.head()6.2 6.2 Running an SQL Join
q = """
SELECT p.ticker, p.date, p.adj_close, p.volume, m.sector
FROM prices p
JOIN meta m USING(ticker)
ORDER BY date
LIMIT 5;
"""
pd.read_sql(q, con)6.3 Explanation
to_sql()writes pandas DataFrames directly into SQLite tables
- The SQL join merges
priceswithmetaon the keyticker
- The example always works because it uses synthetic data instead of relying on missing CSV files