Survivorship bias: invisible in the data
A research analyst backtests a value-investing strategy on S&P 500 companies from 1990 to 2020. The strategy buys cheap companies (low price-to-book, high earnings yield) and shorts expensive ones. The historical backtest shows a Sharpe ratio of 1.8 and annual alpha of 6%. The analyst publishes the result with confidence: "This strategy has generated superior risk-adjusted returns for thirty years."
The hidden problem: the backtest was run on companies that survived to 2020. Companies that went bankrupt, were delisted, or were acquired below book value never made it into the analysis. The cheap companies that were cheap because they were genuinely deteriorating, not undervalued, are missing from the dataset. The expensive companies that were expensive because they had genuine moats and earned excess returns are over-represented.
The backtest is biased because the dataset is biased. This is survivorship bias: the systematic exclusion of companies (or funds, or strategies) that failed, creating the illusion that past performance was better than it actually was.
Quick definition
Survivorship bias is a statistical error where only successful, extant entities are included in the analysis, while failed entities are excluded. In equity analysis, it means backtests are run on companies that survived the period studied, ignoring the delisted, bankrupt, or acquired companies. The result is an overestimate of historical returns and an underestimate of risk.
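One way to make the mechanics explicit is to decompose the true average return over every company that existed at the start of the period. The notation below is ours, for illustration:

$$
\mathbb{E}[R] = p\,\mathbb{E}[R \mid \text{survived}] + (1-p)\,\mathbb{E}[R \mid \text{failed}]
$$

where $p$ is the fraction of companies that survived. A survivor-only dataset reports $\mathbb{E}[R \mid \text{survived}]$, which overstates the true $\mathbb{E}[R]$ by $(1-p)\big(\mathbb{E}[R \mid \text{survived}] - \mathbb{E}[R \mid \text{failed}]\big)$. The gap grows with both the failure rate and the severity of the failures.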
Key takeaways
- The dead are invisible: Companies that went bankrupt are missing from your dataset, along with their catastrophic returns. The average return appears higher because the worst outcomes are unobserved (the simulation after this list makes the effect concrete).
- Delisting is deletion: When a stock is delisted (often due to bankruptcy or poor performance), it is removed from standard indices. The analyst who backtests on the index sees no record of the delisting; the company simply disappears.
- Acquisitions complicate the narrative: When a company is acquired below book value, its shareholders realize the loss at the deal price. A backtest captures that loss only if it records the acquisition price as the exit return; a backtest that simply drops acquired companies misses it.
- Biased datasets breed false patterns: If your dataset excludes 30% of the worst-performing companies, the correlations and patterns you discover are artifacts of the sample, not universal truths.
- The further back you go, the worse it is: A 50-year backtest has more survivorship bias than a 5-year test because more companies have exited via failure or acquisition.
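A minimal simulation shows how large the effect can be. All parameters here are invented for illustration: a 5% outright-failure rate and a -100% return on failure are assumptions, not estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Total returns for 10,000 companies over one holding period. Most have
# ordinary outcomes; an assumed 5% fail outright and return -100%.
n = 10_000
returns = rng.normal(loc=0.08, scale=0.30, size=n)
failed = rng.random(n) < 0.05
returns[failed] = -1.0

true_mean = returns.mean()               # every company that existed
survivor_mean = returns[~failed].mean()  # only companies still in the data

print(f"true mean return:   {true_mean:.2%}")
print(f"survivor-only mean: {survivor_mean:.2%}")
print(f"survivorship bias:  {survivor_mean - true_mean:.2%}")
```

With these invented parameters, the survivor-only average overstates the true average by roughly five percentage points, purely because the worst outcomes were dropped.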
Why survivorship bias is invisible
The bias is insidious because there is no obvious error in the methodology. An analyst uses standard data sources (historical price data, financial statements), applies a sensible strategy, and calculates returns. The process is mathematically sound. The data is the problem.
Many convenient data sources (like Yahoo Finance, or a downloaded list of current index constituents) have survivorship bias built in. When a company delists, the data often stops. The stock is no longer in the index; it is no longer easy to access. An analyst who downloads historical price data for "all S&P 500 companies" from Yahoo Finance gets the current constituents and their historical prices, but not the historical performance of companies that were once in the index and are now gone.
This is not deception; it is architecture. The database is designed to serve current investors tracking current holdings. It is not designed to serve researchers asking "what were the true returns for all companies that existed in 1990?"
The result: a systematic bias that makes the past look rosier than it was.
The magnitude of the bias
Academic research has quantified the survivorship bias in various datasets. Blume, Keim, and Patashnick (1991) found that if you include delisted companies in historical backtests, the average annual return of U.S. stocks from 1926 to 1960 was 4.4%, not the 9.7% that appears if you only include survivors. The bias is not small—it is more than half of the measured return.
The bias is smaller in recent decades because the U.S. market is now mature and has well-developed bankruptcy procedures. More of the "exits" are acquisitions at reasonable valuations, not catastrophic bankruptcies. But the bias still exists.
For developing markets, the bias is even larger. A backtest of emerging-market stocks from 1990 to 2020 that only includes companies still trading in 2020 will have severe survivorship bias, because many emerging-market companies have failed or delisted during that period.
Delisting as a return event
When a company is delisted, what return does the investor in that stock realize? The answer depends on why it was delisted.
If delisted due to bankruptcy, the return is often close to -100%. Shareholders are wiped out. If delisted due to acquisition, the return is the acquisition price relative to the stock price before the deal was announced. If delisted due to going private, it is similar to an acquisition.
Many backtests treat the delisting return as "unknown" and simply omit the stock from that point forward. This is a critical error. The return is not unknown; for performance-related delistings it is sharply negative. By omitting it, the backtest overstates the strategy's historical performance.
Some researchers add a delisting return (e.g., -30% for bankruptcy) to approximate the loss. But this is still an approximation. The true loss for a company that declared bankruptcy varies widely depending on creditors' seniority and asset recovery.
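A sketch of that adjustment in pandas is below. The column names are ours, not any vendor's schema, and the -30% bankruptcy fallback is the rough approximation described above, not a measured value.

```python
import pandas as pd

# Hypothetical return panel. 'ret' is the monthly return; the last row for
# each ticker is the month the stock left the sample.
panel = pd.DataFrame({
    "ticker":      ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "month":       ["2001-01", "2001-02", "2001-03", "2001-01", "2001-02"],
    "ret":         [0.02, -0.10, None, 0.01, None],
    "delist_ret":  [None, None, None, None, 0.15],  # recorded exit return, if any
    "delist_code": [None, None, "bankruptcy", None, "acquired"],
})

# Treating the missing final return as "unknown" silently drops the loss.
# Instead, use the recorded delisting return when one exists, and fall back
# to a conservative assumption for bankruptcies when it does not.
fallback = panel["delist_code"].map({"bankruptcy": -0.30})
panel["ret"] = panel["ret"].fillna(panel["delist_ret"]).fillna(fallback)

print(panel[["ticker", "month", "ret"]])
# AAA's final month is now -30% instead of a silent gap; BBB exits at +15%.
```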
The key point: if your backtest does not include delisting returns, it is biased high. You are measuring returns for a sample that survived, not returns for all entities that existed.
Selection bias in analyst research
Beyond backtesting, survivorship bias affects the analyst's view of what works and what does not.
A portfolio manager reviews the performance of value investors. She observes that value investors who focus on low price-to-book, high earnings yield stocks have generated strong returns for decades. She concludes that value investing works. But her sample is biased: she is looking at value investors whose strategies worked well enough that they survived and are still actively managing money. The value investors whose strategies failed are no longer in business; they have closed their funds. Her sample excludes the failures.
This is the mutual-fund graveyard problem. Some active-management studies appear to show that active managers outperform passive indices, but these studies often include only funds that survived to the end of the measurement period. Funds that underperformed and were shut down are missing. Include them, and the apparent outperformance often shrinks to zero or turns negative.
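The graveyard effect is easy to reproduce. In the sketch below, every fund has zero true alpha by construction; the closure rule and all parameters are invented for illustration. Selecting only the funds that never breached the closure threshold manufactures apparent skill.

```python
import numpy as np

rng = np.random.default_rng(1)

# Twenty years of annual excess returns for 1,000 funds with zero true alpha.
# Assumed closure rule: a fund shuts down once its cumulative excess return
# falls below -20% (we keep full paths and simply select, for simplicity).
n_funds, n_years = 1_000, 20
excess = rng.normal(loc=0.0, scale=0.06, size=(n_funds, n_years))
cumulative = excess.cumsum(axis=1)
alive = (cumulative > -0.20).all(axis=1)  # never hit the closure bar

print(f"funds surviving all 20 years: {alive.mean():.0%}")
print(f"mean excess return, all funds:      {excess.mean():+.2%}")
print(f"mean excess return, survivors only: {excess[alive].mean():+.2%}")
```

The survivor-only average is positive even though no fund has any skill; the entire gap is selection.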
Similarly, an analyst might study the characteristics of outperforming stocks (high ROE, low debt, strong momentum) and conclude that these are drivers of returns. But the sample of outperforming stocks is a survivorship-biased sample. It excludes the stocks that had high ROE, low debt, and strong momentum but still declined. Those stocks are missing from the comparison because they underperformed.
How to adjust for survivorship bias
The most rigorous approach is to include delisted companies and their returns in all backtests and analysis. This requires access to a comprehensive database that includes delisted stocks. Academic researchers use CRSP; investment professionals might use FactSet or Compustat with delisting data.
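If delisting records are available, folding them into the return history is mechanical. A minimal sketch, assuming two hypothetical CSV files whose names and columns are ours rather than any vendor's:

```python
import pandas as pd

# Hypothetical files; real vendors use their own field names and layouts.
prices = pd.read_csv("monthly_returns.csv")  # columns: ticker, month, ret
delists = pd.read_csv("delistings.csv")      # columns: ticker, month, delist_ret

# Append each delisting return as the final observation for that security,
# so the exit month is part of the return history rather than a silent gap.
# If a partial-month return and a delisting return share a month, compound
# them instead: (1 + ret) * (1 + delist_ret) - 1.
exits = delists.rename(columns={"delist_ret": "ret"})
full = pd.concat([prices, exits], ignore_index=True)
full = full.sort_values(["ticker", "month"]).reset_index(drop=True)
```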
An analyst who downloads data from a free source like Yahoo Finance cannot easily adjust for survivorship bias, because the feed simply does not carry the dead companies. But an analyst who is aware of the bias can be more cautious in interpreting historical results. A backtest showing 8% annual outperformance is less credible if it was run on S&P 500 survivors than if it includes delisted companies.
Another approach is to use a different sample for validation. If a strategy is backtested on survivors, validate it on a smaller forward-looking sample where you can explicitly track delisted returns. Or use a different market (where the bias is understood to be different) to validate whether the pattern holds.
Real-world examples
Value investing in the late 1990s: Value investors noted that low price-to-book stocks had outperformed for decades. The historical data suggested that value was a reliable factor. But the backtest was run on survivors. During the late 1990s dot-com bubble, value investing dramatically underperformed because many of the cheap companies were value traps that eventually delisted. The historical outperformance was real, but it was overstated because the worst outcomes (delisted companies with -100% returns) were not in the historical dataset.
Emerging markets 1990–2000: Backtests of emerging-market strategies from 1990 onward look attractive if restricted to companies that survived to 2000. Many emerging-market companies delisted during the 1990s due to currency crises, instability, and poor governance. A comprehensive backtest that included those delistings would show much lower returns than a backtest restricted to survivors.
Tech stock momentum 1995–2005: A backtest of a momentum strategy applied to tech stocks from 1995 to 2005 might show strong returns if it only includes companies that survived. Many high-momentum tech stocks crashed during the dot-com bust. If the backtest drops those stocks at the point they delisted, the measured returns of the momentum strategy look better than the true experience of an investor who actually held them.
Common mistakes
Mistake 1: Backtesting on index constituents without acknowledging the survivorship bias. An analyst backtests a value strategy on "all S&P 500 stocks from 1990 to 2020." This is actually "all S&P 500 stocks that survived to 2020." The bias is invisible in the methodology but real in the results.
Mistake 2: Comparing mutual fund performance without adjusting for fund closures. A study shows active managers outperform passive funds on average. But the comparison is only of funds that survived. Closed funds (which typically underperformed) are missing. Including closed funds would lower the apparent outperformance.
Mistake 3: Using the last traded price as the final return without accounting for what happens after delisting. A company is delisted after filing for bankruptcy. The analyst records the final trading price as the exit, but shareholders often recover even less than that price once the bankruptcy concludes, so the loss is understated.
Mistake 4: Analyzing what works without analyzing what fails. An analyst identifies characteristics of outperforming stocks (e.g., low P/E, high ROE, strong growth). A more honest analysis would also ask: "Of all stocks with low P/E, high ROE, and strong growth, how many actually outperformed?" Many had those characteristics and still declined; the code sketch at the end of this list shows the base-rate check.
Mistake 5: Extending historical patterns without adjusting for the changing population of public companies. The population of public companies has changed dramatically over decades. Comparing the outperformance of value strategies from 1960 to 1990 versus 1990 to 2020 requires acknowledging that the mix of companies, sectors, and geographies is different. The survivorship bias is different too.
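To make Mistake 4 concrete, here is the base-rate check as code. The file and column names are hypothetical; the point is to condition on the screen, not on the winners.

```python
import pandas as pd

# Hypothetical cross-section: one row per stock that existed at the start of
# the period, including those that later delisted. Column names are ours.
df = pd.read_csv("stock_characteristics.csv")
# columns: ticker, low_pe, high_roe, strong_growth (booleans), fwd_excess_ret

screened = df[df["low_pe"] & df["high_roe"] & df["strong_growth"]]

# The honest question: of everything that met the screen, how many won?
hit_rate = (screened["fwd_excess_ret"] > 0).mean()
avg_ret = screened["fwd_excess_ret"].mean()
print(f"{len(screened)} stocks met the screen")
print(f"share that outperformed: {hit_rate:.0%}; average excess return: {avg_ret:+.2%}")
```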
FAQ
Q: How much does survivorship bias typically overstate historical returns?
A: It varies widely. In mature U.S. markets over recent decades, the bias might add 0.5–1.5 percentage points to annual returns. In emerging markets or over longer periods, the bias can be 3–5 percentage points or more. The bias is larger for strategies that rely on picking the winners (because the losers are missing) and smaller for broad-market indices.
Q: If I can't access a delisting-adjusted database, should I ignore historical backtests?
A: Not entirely, but discount them. Be more skeptical of strategies that show large outperformance. Assume that the true outperformance is 20–30% lower than the backtest suggests. If the backtest shows 10% annual outperformance, assume 7–8% is more realistic.
Q: Does survivorship bias affect forward-looking projections or only historical backtests?
A: Primarily historical backtests. But it can affect forward projections if the projections are based on biased historical estimates. For example, if you estimate the value premium based on a biased backtest, you will overestimate the future value premium.
Q: How do I know if a historical backtest is biased?
A: Ask three questions. (1) Does it include delisted companies and their returns? (2) Is it restricted to companies in an index at a specific date (like S&P 500 constituents as of 2020)? (3) Does it avoid look-ahead bias, where companies are only analyzed if they meet criteria that could not have been known at the time? Unless the answer to (1) and (3) is yes and the answer to (2) is no, the backtest probably has survivorship bias.
Q: If I'm analyzing a single company (not a strategy), does survivorship bias matter?
A: Less directly, but it still affects your historical benchmarks. If you are comparing the company to "historical value-stock returns," that benchmark is biased high because it only includes value stocks that survived. Use adjusted benchmarks if possible.
Q: How do I adjust for survivorship bias in my own analysis?
A: Use delisting-adjusted data if available. Run your backtest on a sample that includes failed companies. Or, validate your historical findings on a smaller forward-looking sample where you explicitly track delisting returns. Or, reduce your confidence in historical patterns by 20–30% to account for the bias you can't directly measure.
Related concepts
- Selection bias: The broader problem of a non-random sample creating misleading statistical conclusions.
- Look-ahead bias: Using information that was not available at the time of the decision to build a backtest.
- Market efficiency: The debate over whether historical patterns (value premium, momentum) are genuine market inefficiencies or artifacts of survivorship bias.
- Mutual fund graveyard: The population of closed funds, which is invisible in published performance statistics but economically meaningful.
- Delisting returns: The return an investor experiences when a stock is delisted, often catastrophic for bankruptcy cases.
Summary
Survivorship bias is a systematic error in historical analysis where failed companies are invisible, making the past appear better than it was. The bias is largest in long-horizon backtests, emerging-market data, and strategy analyses that depend on identifying winners.
Analysts and investors who are aware of survivorship bias can account for it: by using delisting-adjusted data, by validating historical patterns on forward-looking data, or by reducing confidence in historical patterns to account for the bias that cannot be measured. The analyst who ignores survivorship bias will overestimate historical returns and overfit to patterns that are partly artifacts of the sample.
The companies that failed are not in your database. The investors who are not in business are not in your fund-performance statistics. The patterns you observe in historical data are cleaner and more dramatic than the patterns that existed when investors could still lose everything.
Next
Read the next article: The narrative fallacy in research.