How to Run an Event Study in Python

Alphanume Team · June 9, 2026

Windows, benchmarks, and significance testing, in code.

An event study pipeline in Python answers a simple question: did stock returns behave abnormally around a specific corporate event? You pick an event set, define a benchmark model, measure how returns deviated from that benchmark, and then test whether the deviation was statistically meaningful. Done naively, the method is riddled with traps: look-ahead bias, overlapping windows, survivorship bias in the event set. This post builds the full pipeline as composable functions, so you can see exactly where each trap lives and how to sidestep it. For a deeper discussion of the research design decisions, see how to design an event study before writing a line of code.

The setup

The same get_data helper used throughout these tutorials works here unchanged. Store your key in the environment, point at the base URL, and unwrap the data envelope. Alongside requests and pandas you will need numpy for the array math and scipy.stats for the market-model regression and the t-test.

import os
import requests
import pandas as pd
import numpy as np
from scipy import stats

BASE_URL = "https://api.alphanume.com/v1"
API_KEY = os.environ["ALPHANUME_API_KEY"]

def get_data(endpoint, **params):
    params["api_key"] = API_KEY
    resp = requests.get(f"{BASE_URL}/{endpoint}", params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["data"]

The raise_for_status() call is not optional. Without it, a 401 or rate-limit response does not raise at the HTTP layer; the error body flows on to the ["data"] lookup, where it surfaces as a confusing KeyError or, if the body happens to carry a data field, as a DataFrame of noise. Fail loudly, fix the problem, then move on.
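The same fail-loudly idea can be applied to the envelope itself. The sketch below is illustrative only: unwrap is a hypothetical helper, and the shape of the error body ({"error": ...}) is an assumption, not something taken from the API documentation.

```python
def unwrap(payload):
    """Return the list under "data", or raise if the envelope looks wrong."""
    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError(f"unexpected response body: {payload!r}")
    data = payload["data"]
    if not isinstance(data, list):
        raise ValueError(f"expected a list under 'data', got {type(data).__name__}")
    return data

# Hypothetical payloads, for illustration only
ok_rows = unwrap({"data": [{"date": "2024-01-02", "close": 10.0}]})

try:
    unwrap({"error": "rate limit exceeded"})  # assumed error shape
    failed_loudly = False
except ValueError:
    failed_loudly = True
```

A check like this would sit between resp.json() and the return in get_data, so a malformed body stops the run before it reaches pandas.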

Step 1 — build the event set from dilution data

An event study needs a list of (ticker, event_date) pairs. Using a published, timestamped data source rather than hand-curating dates keeps the study point-in-time: you can only use events that were knowable at the time. The Stock Dilution dataset exposes dated equity offering records — share sales, ATM programs, shelf registrations — which makes it a natural event set for studying how dilutive issuance affects price.

def get_events(ticker):
    """Return a DataFrame of offering events for one ticker."""
    rows = get_data("dilution", ticker=ticker)
    if not rows:
        return pd.DataFrame(columns=["ticker", "event_date"])
    df = pd.DataFrame(rows)
    df["event_date"] = pd.to_datetime(df["date"])
    df["ticker"] = ticker
    return df[["ticker", "event_date"]].drop_duplicates()

Call this across your universe, concatenate, and deduplicate. If the same ticker has two events close enough that their event windows overlap, the same abnormal returns get counted twice, so drop the overlap before alignment. The alignment function in Step 3 does not check for this on its own.
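One possible way to drop overlaps is sketched here. The helper name drop_overlapping_events and the 10-calendar-day threshold are assumptions chosen for illustration. Calendar days are used as a rough stand-in for trading days because no price series is available yet at this stage.

```python
import pandas as pd

def drop_overlapping_events(events, min_gap_days=10):
    """
    Keep an event only if it falls at least min_gap_days calendar days after
    the last kept event for the same ticker. min_gap_days is an assumed value.
    """
    kept = []
    ordered = events.sort_values(["ticker", "event_date"])
    for ticker, grp in ordered.groupby("ticker"):
        last = None
        for date in grp["event_date"]:
            if last is None or (date - last).days >= min_gap_days:
                kept.append({"ticker": ticker, "event_date": date})
                last = date
    return pd.DataFrame(kept, columns=["ticker", "event_date"])

# Made-up events: the second AAA event is 3 days after the first
events = pd.DataFrame({
    "ticker": ["AAA", "AAA", "AAA", "BBB"],
    "event_date": pd.to_datetime(
        ["2024-01-02", "2024-01-05", "2024-03-01", "2024-01-03"]
    ),
})
clean = drop_overlapping_events(events)
```

This version keeps the earlier event and discards the later one. A stricter design would discard both members of any overlapping pair; whichever rule you pick, record it alongside the window definitions.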

Step 2 — define estimation and event windows

The event window is the short period centered on day 0 where you expect the effect to show up, typically [-1, +1] or [-2, +2] trading days. The estimation window is a longer pre-event period used to fit the benchmark model, typically 120 to 250 trading days ending 30 or more trading days before the event. The gap between them is deliberate: if event-period returns leak into the estimation window, the benchmark absorbs part of the event effect and measured abnormal returns are biased toward zero.

EST_START = -252   # trading days before event
EST_END   = -31    # estimation window ends here
EVT_START = -2     # event window starts here
EVT_END   = +2     # event window ends here

These are offsets in event time — integers relative to day 0. Converting them to calendar dates requires a price series, which you fetch next.
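Because both ends of each window are inclusive, the lengths are easy to get wrong by one. A quick arithmetic check, using the same constants as above:

```python
EST_START, EST_END = -252, -31
EVT_START, EVT_END = -2, 2

est_len = EST_END - EST_START + 1   # trading days in the estimation window
evt_len = EVT_END - EVT_START + 1   # trading days in the event window
gap_len = EVT_START - EST_END - 1   # days strictly between the two windows

print(est_len, evt_len, gap_len)    # 222 5 28
```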

Step 3 — align returns into event time

For each event, fetch the stock's daily return series and a market index return series (the benchmark), then reindex both onto an integer event-time axis. Aligning on trading-day offsets rather than calendar dates handles weekends, holidays, and variable month lengths automatically.

def align_to_event_time(ticker, event_date, price_rows, market_rows):
    """
    Slice stock and market returns around an event date.
    Returns a DataFrame indexed by event-time integer offset.
    price_rows and market_rows are lists of {"date": ..., "close": ...}.
    """
    stock = (
        pd.DataFrame(price_rows)
        .assign(date=lambda d: pd.to_datetime(d["date"]))
        .set_index("date")
        .sort_index()["close"]
        .pct_change()
        .dropna()
    )
    mkt = (
        pd.DataFrame(market_rows)
        .assign(date=lambda d: pd.to_datetime(d["date"]))
        .set_index("date")
        .sort_index()["close"]
        .pct_change()
        .dropna()
    )
    # align to a shared trading-day index
    combined = pd.concat({"stock": stock, "mkt": mkt}, axis=1).dropna()
    if event_date not in combined.index:
        return None  # event fell on a non-trading day or missing data
    loc = combined.index.get_loc(event_date)
    lo = loc + EST_START
    hi = loc + EVT_END + 1
    if lo < 0 or hi > len(combined):
        return None  # not enough history or future data
    sliced = combined.iloc[lo:hi].copy()
    sliced["t"] = range(EST_START, EVT_END + 1)
    return sliced.set_index("t")

Returning None for short histories or missing event dates is safer than returning a partial window — the aggregation step will skip None results, preserving the validity of the cross-sectional average.
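Dropping events dated on a non-trading day is the conservative choice. An alternative you might consider, if announcements often land on weekends, is to move the event to the next trading day before alignment. The sketch below assumes a sorted DatetimeIndex of trading days; snap_to_trading_day is a hypothetical helper, not part of the pipeline above.

```python
import pandas as pd

def snap_to_trading_day(event_date, trading_index):
    """Return the first trading day on or after event_date, or None if none exists."""
    # searchsorted requires trading_index to be sorted ascending
    pos = trading_index.searchsorted(pd.Timestamp(event_date), side="left")
    if pos >= len(trading_index):
        return None
    return trading_index[pos]

# Business days stand in for a real trading calendar here (no holidays)
trading_days = pd.bdate_range("2024-01-01", "2024-01-31")
snapped = snap_to_trading_day("2024-01-06", trading_days)   # a Saturday
beyond = snap_to_trading_day("2024-02-15", trading_days)    # past the end of the index
```

Whether snapping is appropriate depends on when the market could first react to the news, so treat it as a design decision to document rather than a default.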

Step 4 — fit the market model and compute abnormal returns

The market model regresses the stock's estimation-window returns on the market's estimation-window returns. The intercept and slope from that regression define the expected return. Abnormal return is the residual: actual minus expected. Computing cumulative abnormal returns in pandas walks through the mechanics in more detail — here we implement the essential steps as a single function.
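In symbols, with illustrative notation where R_{i,t} is the stock's return and R_{m,t} the market's return on event day t, the quantities described above are:

```latex
% Market model fitted on the estimation window
R_{i,t} = \alpha_i + \beta_i R_{m,t} + \varepsilon_{i,t}

% Abnormal return on event day t: actual minus expected
AR_{i,t} = R_{i,t} - \left( \hat{\alpha}_i + \hat{\beta}_i R_{m,t} \right)

% Cumulative abnormal return over the event window [t_1, t_2]
CAR_i = \sum_{t = t_1}^{t_2} AR_{i,t}
```

Here t_1 and t_2 correspond to EVT_START and EVT_END, and the hats mark the intercept and slope estimated over EST_START through EST_END.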

def compute_car(aligned):
    """
    Fit market model on estimation window; return event-window CAR.
    aligned: DataFrame with columns [stock, mkt] indexed by event-time int.
    """
    est = aligned.loc[EST_START:EST_END]
    if len(est) < 60:
        return None  # too few observations to trust the regression
    slope, intercept, *_ = stats.linregress(est["mkt"], est["stock"])
    evt = aligned.loc[EVT_START:EVT_END].copy()
    evt["expected"] = intercept + slope * evt["mkt"]
    evt["ar"] = evt["stock"] - evt["expected"]
    return float(evt["ar"].sum())   # CAR for this event

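The 60-observation floor is a soft guard against thin histories: a regression on ten days of data will overfit to noise. With the window constants above, align_to_event_time already returns None unless all 222 estimation days are present, so the floor only takes effect if you relax that check or shorten the estimation window. Adjust the threshold to taste, but document whatever you choose so the study is reproducible.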
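A useful sanity check is to run the same steps on simulated data where the answer is known. The sketch below generates returns with an assumed alpha, beta, and noise level, injects a 5% drop on day 0, and checks that the market-model CAR lands near the injected shock. All parameter values are made up for the illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
t = np.arange(-252, 3)                        # event time -252 .. +2
mkt = rng.normal(0.0004, 0.01, size=t.size)   # simulated market returns
noise = rng.normal(0.0, 0.001, size=t.size)   # idiosyncratic noise
stock = 0.0002 + 1.2 * mkt + noise            # assumed alpha=0.0002, beta=1.2
stock[t == 0] += -0.05                        # injected event-day shock

df = pd.DataFrame({"stock": stock, "mkt": mkt}, index=t)

# Fit on the estimation window, then measure residuals in the event window
est = df.loc[-252:-31]
fit = stats.linregress(est["mkt"], est["stock"])
evt = df.loc[-2:2]
ar = evt["stock"] - (fit.intercept + fit.slope * evt["mkt"])
car = float(ar.sum())

print(f"beta={fit.slope:.3f}  CAR={car:.4f}")
```

If the estimated beta is far from 1.2 or the CAR is far from -0.05, the window slicing or the regression inputs are probably misaligned.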

Step 5 — aggregate and test significance

The cumulative average abnormal return (CAAR) is the cross-sectional mean of the per-event CARs. Statistical significance comes from a cross-sectional t-test: under the null hypothesis that the event has no effect, the CARs are draws from a distribution with mean zero, and the t-statistic is the sample mean divided by the standard error.

def caar_ttest(cars):
    """
    Cross-sectional t-test on a list of per-event CARs.
    Returns (caar, t_stat, p_value, n).
    """
    arr = np.array([c for c in cars if c is not None], dtype=float)
    n = len(arr)
    if n < 2:
        return float("nan"), float("nan"), float("nan"), n
    caar = arr.mean()
    se = arr.std(ddof=1) / np.sqrt(n)
    t_stat = caar / se
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
    return caar, t_stat, p_value, n

A two-tailed p-value lets you detect both positive and negative abnormal returns without pre-committing to a direction. Use a one-tailed test only when you have a strong theoretical prior about the sign — and state that choice explicitly in your write-up.
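The hand-rolled statistic can be cross-checked against scipy.stats.ttest_1samp, which runs the same one-sample test against a mean of zero and is two-sided by default. The CAR values below are made up for the comparison.

```python
import numpy as np
from scipy import stats

# Made-up per-event CARs, for illustration only
cars = np.array([-0.031, -0.012, 0.004, -0.027, -0.019, 0.008, -0.022])

n = len(cars)
t_manual = cars.mean() / (cars.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

res = stats.ttest_1samp(cars, popmean=0.0)   # two-sided by default

print(np.isclose(t_manual, res.statistic), np.isclose(p_manual, res.pvalue))
```

Agreement between the two confirms the ddof=1 standard deviation and the n - 1 degrees of freedom are wired correctly.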

Correctness rails and common mistakes

Several errors are easy to commit and hard to spot in output:

Overlapping estimation windows. If two events for the same ticker are close together, the estimation window of the later event may overlap with the event window of the earlier one. Check for this and drop the conflicting observation rather than silently biasing the model.
Look-ahead in the event set. Only include events that were publicly announced before or on event_date. Backdated or restated announcement dates will inflate your sample's abnormal returns artificially.
Survivorship bias. Do not filter your universe to companies still trading today before pulling events. Include delisted tickers — their events happened and their returns were real.
Return contamination. The gap between EST_END and EVT_START (here, the 28 trading days from t = -30 through t = -3) prevents event-period price pressure from leaking into the estimation window and shrinking measured abnormal returns.

End-to-end skeleton

The functions above compose into a straightforward pipeline. Fetch events, align each one, compute CARs, and aggregate. The skeleton below wires them together — replace the price-fetching stubs with your actual data calls.

def run_event_study(tickers):
    all_cars = []
    # the benchmark series is identical for every ticker, so fetch it once
    # (stub: replace with real call)
    market_rows = get_data("prices", ticker="SPY")
    for ticker in tickers:
        events = get_events(ticker)
        if events.empty:
            continue
        # fetch full price history once per ticker (stub: replace with real call)
        price_rows = get_data("prices", ticker=ticker)
        for _, row in events.iterrows():
            aligned = align_to_event_time(
                ticker, row["event_date"], price_rows, market_rows
            )
            if aligned is None:
                continue
            car = compute_car(aligned)
            all_cars.append(car)
    caar, t, p, n = caar_ttest(all_cars)
    print(f"N={n}  CAAR={caar:.4f}  t={t:.3f}  p={p:.4f}")
    return {"caar": caar, "t": t, "p": p, "n": n}

if __name__ == "__main__":
    universe = ["AAPL", "MSFT", "NVDA", "AMZN", "META"]
    run_event_study(universe)

Keep the fetch-and-align step separate from the statistical step. That separation makes it easy to cache aligned windows to disk, iterate on window lengths without re-fetching data, or swap in a different benchmark model (a Fama-French three-factor model, for instance) without touching the data layer. The pipeline handles thin histories, missing event dates, and short price series by returning None and excluding those observations from the aggregate, so the t-statistic is computed only on events with complete windows. It does not, by itself, remove overlapping events or survivorship bias in the universe; those still need the checks listed under "Correctness rails and common mistakes".
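As one illustration of the caching idea, the sketch below stores each aligned window as a pickle keyed by ticker and event date. The helper load_or_build, the key format, and the choice of pickle are all assumptions; fake_align stands in for a real call to align_to_event_time.

```python
import os
import tempfile
import pandas as pd

def load_or_build(cache_dir, key, build_fn):
    """Return the cached DataFrame for key, building and saving it on a miss."""
    path = os.path.join(cache_dir, f"{key}.pkl")
    if os.path.exists(path):
        return pd.read_pickle(path)
    df = build_fn()
    if df is not None:          # None results are not cached
        df.to_pickle(path)
    return df

calls = []

def fake_align():
    # stand-in for align_to_event_time(...); records how often it runs
    calls.append(1)
    return pd.DataFrame(
        {"stock": [0.01, -0.02], "mkt": [0.00, 0.01]},
        index=pd.Index([-1, 0], name="t"),
    )

with tempfile.TemporaryDirectory() as tmp:
    first = load_or_build(tmp, "AAA_2024-01-02", fake_align)
    second = load_or_build(tmp, "AAA_2024-01-02", fake_align)  # served from disk
```

Note that cached windows are tied to the window constants in force when they were built, so a change to EST_START or EVT_END should invalidate the cache or be included in the key.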