Pomegra Learn

What Is Survivorship Bias and Why Does It Kill Your Backtest?

When you download historical stock data to backtest a strategy, you are usually downloading data only on stocks that are still trading. Companies that went bankrupt, were delisted, or were merged out of existence are often missing from the dataset. This sounds like a minor detail until you realize that your strategy would probably have bought, shorted, or held many stocks that no longer exist. Survivorship bias is the systematic tendency to test strategies only on the stocks that survived, which are disproportionately the winners. It is one of the most dangerous and least visible biases in backtesting, and it can inflate annualized returns by roughly 0.5–4 percentage points, depending on the strategy.

Quick definition: Survivorship bias is the statistical error of testing a strategy only on securities that survived to the present day, excluding stocks that delisted, went bankrupt, or were merged during the historical period.

Key takeaways

  • Dead stocks are missing from most free data. Yahoo Finance, Google Finance, and many popular data sources don't include delisted companies, creating an automatic upward bias.
  • The bias is invisible but large. Excluding failed stocks can overstate returns by roughly 0.5–4 percentage points a year, depending on the historical period, market sector, and strategy.
  • Bankruptcies and delistings are not random. Distressed stocks often show patterns (falling price, cheap valuation ratios, rising volatility, negative news) that a value, contrarian, or mean-reversion strategy might flag as buy signals right before they collapse.
  • Survivorship bias is worst in small-cap and value strategies. Large-cap strategies see fewer delistings; small-cap and value-focused strategies see more.
  • High-quality backtest data includes the dead. Professional-grade datasets from Compustat, CRSP, and proprietary vendors include delisted securities with their actual delisting dates and prices.

How survivorship bias inflates your backtest returns

Imagine a simple strategy: "Buy stocks with the lowest price-to-book ratio and hold for one year." You backtest this on S&P 500 historical data from 2000 to 2020.

Your backtest shows a 12% annualized return. But here's the problem: the S&P 500 that existed in 2000 is not the same S&P 500 that exists today. A large share of the original 500 companies have since left the index because of bankruptcy, merger, acquisition, or a shrinking market value. If you ran the test on today's constituent list, it excluded all of them.

Many of those replaced companies had low price-to-book ratios because they were in distress. A stock trading at 0.3× book value is not necessarily a bargain; the market may be pricing in a bankruptcy. Lehman Brothers, General Motors, Bear Stearns, and Circuit City all traded at depressed price-to-book ratios shortly before going bankrupt or, in Bear Stearns's case, being sold at a small fraction of its earlier share price.

If your backtest includes these delisted stocks and models them correctly (buying them at the market price, holding them as they collapsed, and selling at the delisting price or zero), your 12% return might drop to 10% or even 8%. That difference of 2–4 percentage points is survivorship bias.
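To make the arithmetic concrete, here is a minimal sketch using made-up numbers (the 50-position portfolio and the individual returns are assumptions for illustration, not results from any dataset). It compares the equal-weight return a survivor-only dataset would report with the return once a single bankruptcy is included.

```python
# Hypothetical portfolio: 50 equally weighted positions held for one year.
# 49 survive and return 12% each; 1 goes bankrupt and returns -100%.
survivor_returns = [0.12] * 49
delisted_returns = [-1.00]


def equal_weight_return(returns):
    """Average return of an equally weighted portfolio."""
    return sum(returns) / len(returns)


# A survivor-only dataset never contains the bankrupt stock.
biased = equal_weight_return(survivor_returns)
# A complete dataset includes it at its real outcome.
actual = equal_weight_return(survivor_returns + delisted_returns)

print(f"survivor-only return: {biased:.2%}")
print(f"full-universe return: {actual:.2%}")
print(f"overstatement:        {biased - actual:.2%}")
```

With these inputs, one total loss among 50 positions is enough to pull a 12% return down to just under 10%.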

The worse news: you won't know your backtest has this problem unless you deliberately fix it. Free data sources won't tell you that Company X is missing from the dataset. You'll see a 12% return and assume your strategy is good.

Which asset classes suffer most from survivorship bias?

Stock strategies: Severe. Small-cap and value strategies see the most bias because distressed, cheap companies are more likely to delist. Large-cap strategies see less bias because large companies have lower bankruptcy risk.

Sector strategies: Severe for energy, materials, and regional banks, which have high delisting rates. Utilities and consumer staples see fewer delistings.

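Index strategies: Moderate. The bias appears if you rebuild the index from today's constituent list, because you are then testing the current 500 companies, not the original 500. An investor who actually held the index held each company until it was removed, so the published index return already includes the companies that were later kicked out.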

Dividend strategies: Moderate. Companies in financial distress sometimes cut or eliminate their dividends before failing, so a high-yield screen can select them. But most dividend payers survive.

Momentum and trend strategies: Low to moderate. These strategies often sell declining stocks, so they exit before bankruptcy. But they can be long a stock that delists during your hold period.

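Forex, commodities, and crypto: Minimal for major currency pairs and commodity futures, which don't delist the way stocks do (a futures contract expires and has to be rolled, but that's a different problem). Crypto is the exception: tokens are regularly delisted from exchanges or collapse to near zero, so a universe built from today's largest coins has the same bias as a stock universe built from today's survivors.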

How to fix survivorship bias in your backtest

Use professional-grade data. If your data is free, it likely has survivorship bias. Compustat, CRSP (available through WRDS at subscribing universities), and the Bloomberg Terminal include delisted securities with delisting dates and final prices. Other commercial vendors, such as FactSet and Refinitiv (formerly Thomson Reuters), also cover dead companies. This costs money, but it is the most reliable way to get accurate results.

Get the actual delisting dates and prices. If you use Compustat or CRSP, you get a field marking whether each company is delisted and when. Yahoo Finance's "Delisted Securities" list is free but incomplete and disorganized. For serious backtesting, you need machine-readable delisting dates.

Model delistings correctly. When a stock delists:

  • If it's a merger or acquisition, use the offer price or final trading price.
  • If it's a bankruptcy, use the final trading price (often $0.01–$0.50 per share), or research the bankruptcy payout.
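  • If it's a trading halt or suspension, use the last traded price. A reverse split is a corporate action, not a delisting; adjust the price series for it and keep the position open.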

Your backtest should exit the position automatically on the delisting date at the appropriate price.
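The rules above can be sketched as a small lookup. This is an illustration under stated assumptions, not a vendor API: the `Delisting` record, its field names, and the reason labels ("merger", "bankruptcy", "halt") are all hypothetical, and real datasets encode delisting reasons in their own way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delisting:
    """Hypothetical record describing how a position's stock left the exchange."""
    reason: str                                  # "merger", "bankruptcy", or "halt"
    last_trade_price: float                      # final price quoted on the exchange
    offer_price: Optional[float] = None          # per-share deal price, if known
    recovery_per_share: Optional[float] = None   # researched bankruptcy payout, if known


def exit_price(d: Delisting) -> float:
    """Pick the exit price for a delisted position, falling back to the last trade."""
    if d.reason == "merger" and d.offer_price is not None:
        return d.offer_price
    if d.reason == "bankruptcy" and d.recovery_per_share is not None:
        return d.recovery_per_share
    return d.last_trade_price


def trade_return(entry_price: float, d: Delisting) -> float:
    """Return on a position that is force-closed on the delisting date."""
    return exit_price(d) / entry_price - 1.0


# Usage with made-up numbers.
acquired = Delisting(reason="merger", last_trade_price=29.5, offer_price=30.0)
bankrupt = Delisting(reason="bankruptcy", last_trade_price=0.05)

print(f"{trade_return(25.0, acquired):.1%}")   # position bought at 25, taken out at 30
print(f"{trade_return(50.0, bankrupt):.1%}")   # position bought at 50, exits at 0.05
```

Keeping the fallback to the last traded price explicit makes it obvious which exits rest on researched values and which rest on a default.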

Adjust your sample period. If you're backtesting on survivorship-biased data, you can reduce the bias slightly by testing on shorter periods where fewer companies have delisted. A 5-year backtest has less survivorship bias than a 20-year backtest. But this doesn't eliminate the problem; it just reduces it.

Add stocks that delisted during your test window. If you're backtesting from 2010, your dataset is probably missing bankrupt or distressed companies that delisted after 2010. Manually add their data to your test set if they were trading at the time and your strategy would have selected them.

Test sensitivity to delisting assumptions. Run your backtest twice: once assuming delisted stocks exit at 50% of the final quoted price (a mild haircut) and once assuming they exit at 10% of the final price (a severe haircut, closer to what bankruptcies deliver). If your strategy's return changes materially between the two runs, you have a delisting problem.
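A minimal sketch of this two-run check, assuming a simple list of trades (the trade records, field names, and prices below are hypothetical). A trade whose `exit` is `None` is treated as delisted while held and is closed at a fraction of its last quoted price.

```python
def mean_trade_return(trades, delisting_recovery):
    """Average per-trade return, closing delisted positions at
    last_quote * delisting_recovery."""
    total = 0.0
    for t in trades:
        if t["exit"] is not None:
            exit_px = t["exit"]
        else:
            exit_px = t["last_quote"] * delisting_recovery
        total += exit_px / t["entry"] - 1.0
    return total / len(trades)


# Hypothetical trade log: nine ordinary winners and one delisting.
trades = [{"entry": 10.0, "exit": 11.2, "last_quote": None} for _ in range(9)]
trades.append({"entry": 10.0, "exit": None, "last_quote": 2.0})

mild = mean_trade_return(trades, 0.50)     # exit at 50% of the final quote
severe = mean_trade_return(trades, 0.10)   # exit at 10% of the final quote

print(f"50% recovery: {mild:.2%}")
print(f"10% recovery: {severe:.2%}")
print(f"spread:       {mild - severe:.2%}")
```

If the spread between the two runs is a large part of the strategy's total return, the result depends heavily on how delistings are modeled.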

Real-world example: Survivorship bias in a small-cap value strategy

Backtest A (Yahoo Finance data, 2000–2020):

  • Universe: 1,000 smallest stocks on the NASDAQ at the start of each year
  • Strategy: Buy the 50 with the lowest price-to-book ratio
  • Result: 14% annualized return, 0.8 Sharpe ratio

Backtest B (Compustat data, 2000–2020, including delisted stocks):

  • Universe: 1,000 smallest stocks on the NASDAQ at the start of each year (including companies that delisted during the period)
  • Strategy: Same (buy 50 lowest P/B)
  • Result: 10% annualized return, 0.5 Sharpe ratio

The difference is 4 percentage points—entirely due to survivorship bias. The strategy bought distressed stocks that looked cheap because they were about to go under. The Compustat data included their delisting at near-zero prices. Yahoo Finance data never included them at all, creating the illusion of outperformance.
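Four percentage points a year sounds modest until it compounds. The short sketch below shows the effect on ending wealth, assuming a 20-year horizon and a starting value of 1.0 (both are assumptions chosen for illustration).

```python
def terminal_wealth(annual_return, years, start=1.0):
    """Ending value of `start` compounded at `annual_return` for `years` years."""
    return start * (1.0 + annual_return) ** years


years = 20  # assumed horizon for the 2000-2020 example
wealth_a = terminal_wealth(0.14, years)   # Backtest A: survivor-only data
wealth_b = terminal_wealth(0.10, years)   # Backtest B: delisted stocks included

print(f"Backtest A ending wealth: {wealth_a:.2f}x")
print(f"Backtest B ending wealth: {wealth_b:.2f}x")
print(f"A overstates B by:        {wealth_a / wealth_b - 1.0:.0%}")
```

Over this horizon, the biased backtest implies roughly twice the ending wealth of the realistic one.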

Which backtest is real? The Compustat one. When you trade this strategy live, you'll be punished for buying distressed value names, just as the Compustat backtest showed.

Common mistakes

Assuming free data is delisting-complete. Yahoo Finance is convenient, but it's not designed for serious backtesting. It's designed for charting. Missing delistings are not a bug; they're by design.

Testing only on current index constituents. The S&P 500 today is not the S&P 500 of 2000. If you want to test on "S&P 500 from 2000," you need the actual 500 stocks from that year, including the ones that are now dead.
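One way to avoid this mistake is to build the universe from a point-in-time membership table. The sketch below assumes you have such a table with add and removal dates; the tickers and dates are made up, and real historical constituent lists have to come from a data vendor.

```python
from datetime import date

# Hypothetical membership records: (ticker, date_added, date_removed).
# date_removed is None for companies that are still in the index.
MEMBERSHIP = [
    ("AAA", date(1995, 1, 3), None),
    ("BBB", date(1998, 6, 1), date(2008, 9, 16)),   # removed during the test window
    ("CCC", date(2012, 3, 5), None),                # added after the window started
]


def constituents_on(as_of, membership):
    """Tickers that were index members on the given date."""
    return sorted(
        ticker
        for ticker, added, removed in membership
        if added <= as_of and (removed is None or as_of < removed)
    )


print(constituents_on(date(2000, 1, 3), MEMBERSHIP))   # universe as it stood in 2000
print(constituents_on(date(2020, 1, 2), MEMBERSHIP))   # universe as it stood in 2020
```

Calling `constituents_on` at every rebalance date means the strategy can only pick from stocks that were actually in the index at that time, including ones that later died.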

Ignoring bankruptcies in your strategy output. If your backtest shows you held a stock from $50 to $0 (bankruptcy), that's a -100% return on that trade. But some backtesting platforms might drop the stock from the analysis or treat it as missing data. Verify your platform handles bankruptcies explicitly.
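A simple way to verify this is to look for positions whose price history stops before the planned exit date. The sketch below assumes positions are stored as (ticker, entry date, planned exit date) tuples and prices as a dict of date-to-price maps; both layouts and all the data are hypothetical.

```python
from datetime import date


def positions_with_truncated_prices(positions, prices):
    """Return (ticker, last_price_date) for positions whose price data
    ends before the planned exit date, or is missing entirely."""
    flagged = []
    for ticker, _entry, planned_exit in positions:
        series = prices.get(ticker, {})
        last_seen = max(series) if series else None
        if last_seen is None or last_seen < planned_exit:
            flagged.append((ticker, last_seen))
    return flagged


# Made-up example: ZZZ stops trading mid-year while the position is open.
positions = [
    ("AAA", date(2020, 1, 2), date(2020, 12, 31)),
    ("ZZZ", date(2020, 1, 2), date(2020, 12, 31)),
]
prices = {
    "AAA": {date(2020, 1, 2): 10.0, date(2020, 12, 31): 11.0},
    "ZZZ": {date(2020, 1, 2): 5.0, date(2020, 6, 15): 0.4},
}

flagged = positions_with_truncated_prices(positions, prices)
print(flagged)
```

Every flagged position should appear in the backtest's trade log with an explicit exit and a realized loss or gain, not disappear from the results.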

Thinking survivorship bias only affects small-cap stocks. Large-cap stocks delist too (Lehman Brothers, GM, AT&T spin-offs). Large-cap strategies are less affected because large companies fail less often and because common large-cap selection criteria (quality, dividend, momentum) tend to exclude distressed names.

Forgetting sector survivorship bias. Many tech stocks listed in 2000 were delisted after the dot-com crash, and many energy stocks were delisted during the 2014–2016 oil price collapse. If you're testing sector-specific strategies over periods like these, the delisting bias can be very large.

FAQ

How much does survivorship bias hurt returns on average?

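Commonly cited estimates are roughly 0.5–2 percentage points per year for broad stock strategies and 2–4 percentage points per year for small-cap or value strategies. The exact figure depends on the period and the universe, but this is one of the largest sources of backtest overstatement.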

Can I detect survivorship bias in my backtest?

Yes. Compare your results using free data (Yahoo Finance) to results using professional data (Compustat, CRSP). If free data shows significantly higher returns, survivorship bias is present.
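If you don't have a second dataset to compare against, a rough sanity check is to measure how many tickers in your data are still trading on its final date. This is a heuristic added here as a supplement, not a substitute for the comparison above; the function name and the sample data are hypothetical.

```python
from datetime import date


def share_active_at_end(last_price_date, dataset_end):
    """Fraction of tickers whose price history runs through the dataset's end date."""
    active = sum(1 for d in last_price_date.values() if d >= dataset_end)
    return active / len(last_price_date)


# Hypothetical summary of a dataset: the last date with a price for each ticker.
last_price_date = {
    "AAA": date(2020, 12, 31),
    "BBB": date(2020, 12, 31),
    "CCC": date(2009, 3, 13),    # stopped trading partway through the period
    "DDD": date(2020, 12, 31),
}

share = share_active_at_end(last_price_date, date(2020, 12, 31))
print(f"{share:.0%} of tickers are still trading at the end of the data")
```

Over a multi-decade window, a share close to 100% suggests the dead companies were never in the dataset. What counts as suspiciously high depends on the period and the universe.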

What if I can't afford professional data?

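For academic research, WRDS gives students and faculty at subscribing universities access to CRSP and Compustat, and Ken French's data library offers free factor and portfolio return series built from CRSP data (though not the underlying stock-level data). For individual traders, vendors such as EOD Historical Data and Norgate Data offer data that includes delisted stocks at lower cost than Bloomberg or FactSet.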

Does survivorship bias affect crypto strategies?

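Yes, though it looks different. Major coins such as Bitcoin have not disappeared, but many smaller tokens have been delisted from exchanges or have collapsed to near zero, and exchanges themselves can shut down (like Mt. Gox). If your test universe is "coins that are large today," you are testing on survivors. Build the universe from what was tradable at each point in time.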

What if I'm testing a strategy that avoids distressed stocks?

Your strategy is less affected by survivorship bias because it buys fewer of the distressed companies that delist. A momentum or quality strategy that exits declining stocks sidesteps much of the delisting problem, although it can still hold a stock that delists during the holding period. A deep-value strategy that buys the cheapest stocks is hit hardest.

Should I adjust for delistings in real-time trading?

Yes. When a stock you hold is delisted or in danger of delisting, exit promptly at or near the market price. A backtest that includes delisting prices prepares you to expect this loss.

Summary

Survivorship bias is the invisible tax on stock backtests. It occurs because historical datasets are missing companies that delisted, went bankrupt, or were merged. A strategy that looks like a 12% annualized winner on free data might be a 10% winner once you include the dead companies it would have bought or held. The bias is worst for small-cap and value strategies, which naturally select distressed companies that are likely to delist. The only way to fix survivorship bias is to use professional-grade data that includes delisting information, or to carefully adjust your tests for known delistings. A backtest that ignores dead companies is a backtest on ghosts—a historically inaccurate phantom that bears no resemblance to your future trading results.
