**Methodology**
I pulled 906,088 Form 4 filings from SEC EDGAR covering January 2023 through March 2026. I filtered to open market purchases only (transaction code P), excluding grants, awards, and tax-related transactions. The headline analysis further filters to C-suite insiders (CEO, CFO, COO, Chairman) with purchases of $100K or more, giving 3,236 backtestable signals across 1,169 unique tickers.
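A rough sketch of the headline filter described above. The record layout, field names, and keyword-based title matching are all hypothetical choices for illustration, not EDGAR's schema or the exact pipeline used here:

```python
from dataclasses import dataclass

# Hypothetical record layout; field names are illustrative only.
@dataclass
class Form4Txn:
    ticker: str
    insider_title: str
    transaction_code: str  # "P" marks a purchase on Form 4
    shares: float
    price: float

# Assumed title matching: real Form 4 titles are free text, so a
# production filter would need more careful normalization.
C_SUITE_KEYWORDS = ("CEO", "CFO", "COO", "CHAIRMAN")
MIN_VALUE_USD = 100_000

def is_headline_signal(txn: Form4Txn) -> bool:
    """True for a code-P purchase by a C-suite insider worth $100K or more."""
    if txn.transaction_code != "P":
        return False
    title = txn.insider_title.upper()
    if not any(keyword in title for keyword in C_SUITE_KEYWORDS):
        return False
    return txn.shares * txn.price >= MIN_VALUE_USD

sample = [
    Form4Txn("ABC", "CEO", "P", 10_000, 15.0),       # $150K purchase: kept
    Form4Txn("ABC", "CEO", "A", 10_000, 15.0),       # grant/award: dropped
    Form4Txn("XYZ", "Director", "P", 10_000, 15.0),  # not C-suite: dropped
    Form4Txn("XYZ", "CFO", "P", 100, 15.0),          # $1,500: below threshold
]
signals = [t for t in sample if is_headline_signal(t)]
```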
Entry: next trading day open after the filing date — not the transaction date, since the public doesn't know about the trade until the filing hits EDGAR. Exit: closing price at 5, 10, 30, 60, and 90 calendar days. Benchmark: SPY over the same window. Excess return = stock return minus SPY return, minus 10bps round-trip transaction cost. All prices are split-adjusted.
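The excess-return arithmetic can be written out as a small function. This is a minimal sketch: the function and parameter names are illustrative, and it assumes SPY is measured open-to-close over the same entry and exit dates as the stock.

```python
def excess_return(entry_open, exit_close, spy_entry_open, spy_exit_close,
                  cost_bps=10.0):
    """Stock return minus SPY return minus round-trip cost, as a fraction."""
    stock_ret = exit_close / entry_open - 1.0
    spy_ret = spy_exit_close / spy_entry_open - 1.0
    # 10 bps = 0.10% = 0.0010 as a fraction
    return stock_ret - spy_ret - cost_bps / 10_000.0

# Made-up prices: stock 50 -> 52 (+4%), SPY 400 -> 404 (+1%),
# so excess is roughly 4% - 1% - 0.1% = 2.9%.
example = excess_return(50.0, 52.0, 400.0, 404.0)
```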
Survivorship note: roughly 14% of signals were excluded because the ticker was delisted and price data was unavailable. This biases results slightly upward since delisted stocks skew negative.
**The core finding: it's a short-term signal**
|Window|Mean Excess Return|Win Rate|p-value|
|:-|:-|:-|:-|
|5 day|\+0.98%|51.2%|<0.0001|
|10 day|\+0.97%|51.3%|<0.0001|
|30 day|\+0.02%|43.8%|0.93|
|60 day|\-1.56%|40.6%|0.0003|
|90 day|\-1.59%|38.2%|0.003|
The signal is statistically significant at 5 and 10 days, then it's gone. By 60 and 90 days, insider buy signals actually underperform SPY, and that underperformance is also statistically significant. This isn't "insiders know the future" — it's a filing-reaction effect that decays quickly.
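One common way to get numbers like the ones in the table is a one-sample t-test of mean excess return against zero, plus a simple win-rate count. The sketch below is illustrative and not necessarily the exact test used for the table; the p-value uses a normal approximation, which is reasonable only when N is large (as it is with thousands of signals).

```python
import math
from statistics import mean, stdev

def summarize(excess_returns):
    """Mean, t-statistic, approximate two-sided p-value, and win rate."""
    n = len(excess_returns)
    m = mean(excess_returns)
    standard_error = stdev(excess_returns) / math.sqrt(n)
    t = m / standard_error
    # Two-sided p-value under a normal approximation (assumes large N).
    p = math.erfc(abs(t) / math.sqrt(2))
    win_rate = sum(r > 0 for r in excess_returns) / n
    return {"mean": m, "t": t, "p": p, "win_rate": win_rate}

# Made-up excess returns, purely to show the call.
stats = summarize([0.02, -0.01, 0.03, 0.01, -0.02, 0.04])
```

Mean and win rate can disagree: a positive mean with a win rate below 50%, as in the 30 day row, means the winners are larger on average than the losers.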
**Cluster buys are the real signal**
The strongest finding in the dataset. A "cluster" is 2+ distinct insiders making open market purchases of the same stock within 5 trading days of each other.
||5 day|10 day|30 day|
|:-|:-|:-|:-|
|Cluster buys (N=820)|\+2.02%|\+2.41%|\+2.29%|
|Single insider (N=1,997)|\+0.62%|\+0.50%|\-0.20%|
|Difference significant?|p=0.0001|p<0.0001|p=0.016|
One insider buying could mean anything: portfolio rebalancing, compensation-related, contractual. Two or more insiders independently buying within the same week is a different signal entirely. The cluster effect persists through 30 days, unlike single-insider buys, which shrink by day 10 and turn slightly negative by day 30.
1,472 clusters identified in the dataset.
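The cluster definition above (2+ distinct insiders, same stock, within 5 trading days) can be sketched as follows. Two assumptions to flag: trading days are approximated as weekdays with holidays ignored, and the input tuple layout is hypothetical. Whether the date should be the transaction date or the filing date is left to the caller.

```python
from collections import defaultdict
from datetime import date, timedelta

def weekdays_between(a: date, b: date) -> int:
    """Weekdays after a, up to and including b (holidays ignored)."""
    count = 0
    d = a
    while d < b:
        d += timedelta(days=1)
        if d.weekday() < 5:  # Monday=0 ... Friday=4
            count += 1
    return count

def flag_clusters(purchases, window=5):
    """purchases: list of (ticker, insider, date) tuples.

    Returns the set of list indices that belong to a cluster, i.e. that
    have a purchase by a different insider in the same ticker within
    `window` weekdays.
    """
    by_ticker = defaultdict(list)
    for i, (ticker, insider, d) in enumerate(purchases):
        by_ticker[ticker].append((d, insider, i))

    clustered = set()
    for rows in by_ticker.values():
        rows.sort()  # chronological order within each ticker
        for a in range(len(rows)):
            for b in range(a + 1, len(rows)):
                d_a, who_a, i_a = rows[a]
                d_b, who_b, i_b = rows[b]
                if weekdays_between(d_a, d_b) > window:
                    break  # later rows are even further away
                if who_a != who_b:
                    clustered.update((i_a, i_b))
    return clustered

# Made-up purchases: only the first two form a cluster.
purchases = [
    ("ABC", "alice", date(2024, 3, 4)),   # Monday
    ("ABC", "bob", date(2024, 3, 8)),     # Friday, 4 weekdays later
    ("ABC", "alice", date(2024, 3, 25)),  # too far from the others
    ("XYZ", "carol", date(2024, 3, 4)),
    ("XYZ", "carol", date(2024, 3, 5)),   # same insider, not a cluster
]
cluster_idx = flag_clusters(purchases)
```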
**Sector breakdown**
Healthcare stands out. At the sub-industry level, biotech specifically drives the result.
|Sector|5d Excess|10d Excess|N|
|:-|:-|:-|:-|
|Healthcare|\+3.03%\*\*\*|\+2.28%\*\*|443|
|Consumer Cyclical|\+1.27%\*|\+2.14%\*\*\*|325|
|Financial Services|\+0.49%\*|\+0.48%|640|
|Technology|\+0.81%|\+1.40%\*|380|
|Real Estate|\+0.62%|\-0.79%|291|
|Energy|\-0.41%|\+0.49%|135|
Within Healthcare, biotechnology insiders generated +4.8% excess at 5 days (N=152, p<0.001). This makes sense: biotech arguably has the widest information gap between insiders and the market.
**Filter combinations**
Every strong combination has cluster buying as the base:
|Filter|10d Excess|N|
|:-|:-|:-|
|Cluster + Healthcare|\+5.65%|120|
|Cluster + CEO/Chairman|\+5.19%|97|
|Cluster + Conviction >50%|\+4.90%|117|
|Cluster alone|\+2.41%|820|
|No filter (C-suite ≥$100K)|\+0.97%|3,236|
Sample sizes get small in the combinations, so treat the exact numbers with appropriate skepticism. The directional finding — that clusters multiply signal strength — is robust.
**Things that don't matter (as much as you'd think)**
*Transaction size:* No statistically significant difference between $100K-$500K and $5M+ purchases at any window. The t-tests are all non-significant. Bigger buy ≠ better signal.
*Position conviction:* Insiders doubling their position (+100% increase) show marginally better returns than insiders adding 10%, but the difference isn't dramatic. The short-term signal exists at all conviction levels.
*Filing speed:* Among insiders who file within 0-5 days of the transaction, short-term returns are similar regardless of how quickly they file. One exception: insiders who take 6+ days to file show -15% at 60 days, which is a red flag, not a signal to follow.
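For group comparisons like the transaction-size buckets above, a two-sample t-test is the usual tool. The sketch below uses Welch's unequal-variance version, which is an assumption on my part since the write-up only says "t-tests". It returns the t-statistic and degrees of freedom; turning those into a p-value needs a t-distribution CDF from a stats library.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up excess returns (in %) for two hypothetical size buckets.
small_buys = [1.0, 2.0, 3.0, 4.0, 5.0]
large_buys = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(small_buys, large_buys)
```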
**Market regime**
|Regime|5d Excess|10d Excess|N|
|:-|:-|:-|:-|
|Bull (SPY 3mo >+5%)|\+1.29%\*\*\*|\+1.55%\*\*\*|1,368|
|Flat (SPY 3mo ±5%)|\+0.77%\*\*\*|\+0.47%|1,452|
|Bear (SPY 3mo <-5%)|\+1.68%\*|\+2.15%\*|205|
The short-term signal works across all market regimes. Bear market sample is small (N=205) so I wouldn't overweight that result, but the signal isn't just a bull market artifact.
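The regime buckets in the table can be expressed as a simple classifier on SPY's 3-month return. A minimal sketch, assuming the 3-month return is a trailing return measured at the signal date (the table doesn't spell this out) and that values exactly at ±5% count as flat:

```python
def trailing_return(price_now, price_3mo_ago):
    """Simple trailing return as a fraction."""
    return price_now / price_3mo_ago - 1.0

def classify_regime(spy_3mo_return, threshold=0.05):
    """Bull above +5%, bear below -5%, flat otherwise."""
    if spy_3mo_return > threshold:
        return "bull"
    if spy_3mo_return < -threshold:
        return "bear"
    return "flat"

# Made-up SPY prices: 400 -> 432 is +8% over three months.
regime = classify_regime(trailing_return(432.0, 400.0))
```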
**Limitations**
These should be obvious but are worth stating:
* The analysis period (2023-2026) was broadly bullish. Three years isn't enough to generalize across full market cycles.
* Survivorship bias from excluded delisted tickers likely inflates returns by some amount.
* No size-factor or sector-factor risk adjustment — the SPY benchmark doesn't control for the fact that insider buy signals may cluster in small caps or specific sectors. The market cap analysis suggests the signal isn't micro-cap-only, but a Fama-French adjustment would be more rigorous.
* The 2026 partial year includes the tariff shock period with very small N and anomalous results.
* Transaction costs are estimated at 10bps round-trip. Actual costs vary, and market impact for less liquid names could be material.
* I have not applied multiple-comparison corrections across the sub-analyses. Some of the sector/combination results would likely lose significance under Bonferroni.
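On the last point, a Bonferroni correction is simple to sketch: multiply each p-value by the number of tests and compare against the usual alpha. The labels below are hypothetical and the p-values are a handful of the exact ones reported above. The real number of tests across all sub-analyses is much larger than five, so an actual correction would be harsher than this.

```python
def bonferroni(p_values, alpha=0.05):
    """Map each label to (adjusted p-value, still significant?)."""
    m = len(p_values)
    return {
        name: (min(p * m, 1.0), p * m < alpha)
        for name, p in p_values.items()
    }

# Labels are illustrative; p-values are taken from the tables above.
reported = {
    "cluster_vs_single_5d": 0.0001,
    "cluster_vs_single_30d": 0.016,
    "all_signals_30d": 0.93,
    "all_signals_60d": 0.0003,
    "all_signals_90d": 0.003,
}
adjusted = bonferroni(reported)
```

Even with only five tests, the 30 day cluster difference (p=0.016) adjusts to 0.08 and would no longer clear the 0.05 bar.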
**So what?**
The actionable takeaway: insider buying is a short-term filing-reaction trade. The signal is strongest when multiple insiders buy within the same week and in healthcare/biotech. Outside of clusters, it decays almost completely by day 30. If you're monitoring insider activity for long-term investment theses, this data suggests the filing event itself isn't giving you durable alpha.
The cluster finding is the most practically useful — it's a meaningfully different signal from single insider buys, and it persists longer. If I were building a systematic screen based on this data, the cluster filter would be the first thing I'd implement.
I wrote up the full methodology with interactive charts on my site if anyone wants the deeper dive — link is in my profile.
Happy to discuss methodology, share more granular results, or hear where this analysis might be wrong.