Where the 4% Rule Came From: Historical Research

Pomegra Learn

The 4% rule did not emerge from theory or speculation. It emerged from painstaking analysis of decades of real market data. Understanding its origins shows why the rule works, what assumptions it rests on, and why modern researchers sometimes challenge it. The story involves one financial advisor's persistent historical testing, the Great Depression, stagflation, and the counterintuitive finding that the order in which returns arrive can matter as much as their average.

Quick definition: The 4% rule originated with William Bengen's 1994 research, which tested various withdrawal rates against roughly seven decades of historical U.S. stock and bond returns (1926–1995), including the Great Depression. It found that a 4% initial withdrawal rate, adjusted annually for inflation, survived essentially every historical 30-year retirement period; later studies commonly quote a success rate of about 95%.

Key takeaways

William Bengen's 1994 study tested the 4% rule against about 70 years of actual market history, including crashes and stagflation
The rule's origin was a practical need: Bengen's client wanted to know a safe percentage to withdraw annually
Surprisingly, a sharp crash followed by recovery proved easier on a retiree's portfolio than the slow grind of 1970s stagflation: the order of returns matters as much as their average
The Trinity Study (1998) validated and extended Bengen's work, adding analysis of alternative stock and bond allocations
The 4% rule's strong historical record reflects built-in safety margin combined with a measure of historical luck

William Bengen and the original question

In the early 1990s, William Bengen was a financial advisor in Southern California. One of his clients, a retiree, asked him a straightforward question: "What percentage of my portfolio can I safely withdraw each year without running out of money?"

It was a simple question, but it had no simple answer. Financial textbooks offered no guidance. Brokers gave vague assurances ("You'll be fine, the market always recovers"). Bengen realized that the only way to answer with confidence was to test various withdrawal rates against real historical market data.

He began with U.S. stock and bond returns from 1926 through 1995: roughly 70 years of price history, dividend yields, and inflation data. He tested withdrawal rates of 3%, 4%, 5%, and 6% across rolling 30-year retirement periods, asking: how often did each rate succeed? A "success" meant the portfolio didn't run out of money by the end of 30 years.
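The success test described above can be sketched in a few lines of Python. This is a minimal illustration, not Bengen's actual spreadsheet: the function names, the start-of-year withdrawal timing, and the flat 5% return and 3% inflation in the example are all assumptions made for this sketch.

```python
def simulate_retirement(returns, inflation, initial_rate, start_balance=1_000_000.0):
    """Return year-end balances for one retirement path.

    returns[i] and inflation[i] are the portfolio return and inflation rate
    for year i, as decimals. Assumes the withdrawal is taken at the start of
    each year (a simplifying assumption for this sketch).
    """
    balance = start_balance
    withdrawal = initial_rate * start_balance  # first-year withdrawal in dollars
    balances = []
    for r, infl in zip(returns, inflation):
        if withdrawal >= balance:
            # The portfolio cannot fund this year's withdrawal: the path fails.
            balances.append(0.0)
            break
        balance = (balance - withdrawal) * (1.0 + r)
        balances.append(balance)
        withdrawal *= 1.0 + infl  # keep purchasing power constant next year
    return balances


def succeeded(balances, years):
    """A path succeeds if it lasts the full horizon with money left."""
    return len(balances) == years and balances[-1] > 0.0


# Hypothetical flat inputs, NOT historical data: 5% return, 3% inflation.
path = simulate_retirement([0.05] * 30, [0.03] * 30, initial_rate=0.04)
print(succeeded(path, 30))
```

Feeding real historical return and inflation series into `simulate_retirement`, one starting year at a time, would give the kind of pass/fail result the study tallied.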

His findings were published in the Journal of Financial Planning in 1994, under the title "Determining Withdrawal Rates Using Historical Data." The conclusion: a 4% initial withdrawal rate, adjusted annually for inflation, succeeded in virtually all historical scenarios. The rule was born.

The historical testing methodology

Bengen's approach was elegant and thorough. He took the roughly 70 years of data (1926–1995) and created rolling 30-year windows. A retiree starting in 1926 would have experienced the Great Depression, World War II, and postwar inflation; that is one test case. A retiree starting in 1927 would have experienced a slightly different sequence of returns.

Data running from 1926 through 1995 supports a complete 30-year window for every starting year from 1926 through 1966, which gives 41 different retirement scenarios. For each starting year, Bengen modeled what would have happened to a retiree's portfolio at various withdrawal rates.
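Counting the rolling windows is plain arithmetic, sketched below. The helper name `rolling_windows` is hypothetical, and the sketch assumes annual data for 1926 through 1995 inclusive, with each retirement needing 30 full calendar years of data. A study that drops or adds a year at either end would get a slightly different count.

```python
def rolling_windows(first_year, last_year, horizon):
    """List (start, end) pairs for every complete window inside the data."""
    last_start = last_year - horizon + 1
    return [(start, start + horizon - 1)
            for start in range(first_year, last_start + 1)]


windows = rolling_windows(1926, 1995, 30)
print(len(windows))   # number of complete 30-year windows
print(windows[0])     # earliest retirement period
print(windows[-1])    # latest retirement period
```

Under these assumptions the list runs from (1926, 1955) to (1966, 1995).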

The worst case in Bengen's testing was retiring in 1965. That retiree faced the 1970s stagflation (double-digit inflation combined with stock market stagnation), one of the worst environments possible for a balanced portfolio. Yet even in that scenario, a 4% withdrawal rate allowed the portfolio to survive, though barely. It ended the 30-year period (1965–1994) with little more than the starting balance.

This was the key insight: the worst-case scenario in the historical record was not a complete failure. A 4% rate survived even the worst sequence.

Key findings and statistics

Bengen's research produced clear statistics:

4% withdrawal rate: Succeeded in every historical scenario tested (100% success in this data; the widely quoted 95% figure comes from later studies such as the Trinity Study)
5% withdrawal rate: Failed in two scenarios (1965–1994 and 1966–1995), the worst inflationary periods
3% withdrawal rate: Succeeded in all scenarios with excess money remaining
3.5% withdrawal rate: Succeeded in all scenarios with some safety margin

The difference between 4% and 5% appeared modest mathematically—only 1 percentage point. But that 1 percentage point difference determined whether a retiree ran out of money or not in the two worst historical scenarios. This demonstrated the importance of margin: 4% worked; 5% didn't.
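A success rate of this kind is just the share of rolling windows that survive. The sketch below shows the calculation on a made-up series of real (inflation-adjusted) returns; the repeating cycle is purely hypothetical, so the percentages it prints say nothing about actual history. Working in real returns is a simplification that keeps the withdrawal constant in purchasing-power terms.

```python
def survives(real_returns, rate):
    """True if a constant real withdrawal lasts through every year given."""
    balance = 1.0                  # portfolio as a multiple of its start value
    for r in real_returns:
        balance -= rate            # withdraw at the start of the year
        if balance <= 0.0:
            return False
        balance *= 1.0 + r
    return True


def success_rate(real_returns, rate, horizon=30):
    """Share of complete rolling windows in which the withdrawal rate survives."""
    starts = range(len(real_returns) - horizon + 1)
    outcomes = [survives(real_returns[s:s + horizon], rate) for s in starts]
    return sum(outcomes) / len(outcomes)


# Hypothetical 70-year series of real returns: a made-up repeating cycle,
# NOT historical data.
cycle = [0.15, -0.10, 0.08, 0.02, -0.20, 0.25, 0.05, 0.00, 0.12, -0.05]
synthetic = [cycle[i % len(cycle)] for i in range(70)]

for rate in (0.03, 0.04, 0.05, 0.06):
    print(f"{rate:.0%}: {success_rate(synthetic, rate):.0%} of windows survive")
```

Whatever series is supplied, raising the withdrawal rate can only lower the share of surviving windows or leave it unchanged, which is why a single percentage point can separate success from failure in the hardest periods.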

The role of sequence of returns

Bengen's research revealed something counterintuitive: the worst years for investing were sometimes the best scenarios for retirees following the 4% rule.

Consider retiring in 1973, right before the worst bear market of the 1970s. A retiree with a $1 million portfolio facing a 20% stock market decline might seem catastrophically injured. But if the retiree had already withdrawn 4% ($40,000) before the crash, only the remaining $960,000 was exposed to the decline. Over the following decades, as markets eventually recovered and inflation increased the withdrawals, the portfolio adapted.

Contrast this with retiring in 1965 (before the slow stagflation). The portfolio didn't experience a dramatic crash, but inflation eroded purchasing power and forced increasing withdrawals while real returns (returns above inflation) were meager. This was actually harder on the portfolio than a sharp crash followed by recovery.

This surprising insight—that the chronology of returns matters as much as their average—became known as "sequence-of-returns risk." Bengen's work didn't invent the concept, but it demonstrated why the 4% rule included built-in protection against it.
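A small numeric sketch makes the point that chronology matters once withdrawals are involved. The five returns below are hypothetical, chosen only for illustration, and the fixed $40,000 withdrawal ignores inflation to keep the arithmetic easy to follow.

```python
def final_balance(returns, withdrawal, start=1_000_000.0):
    """Balance after taking a fixed withdrawal at the start of each year."""
    balance = start
    for r in returns:
        balance = max(balance - withdrawal, 0.0) * (1.0 + r)
    return balance


# Hypothetical returns, not historical data. Same five values, opposite order.
bad_first = [-0.20, -0.10, 0.05, 0.15, 0.25]
good_first = list(reversed(bad_first))

# Without withdrawals, order is irrelevant because multiplication commutes.
print(round(final_balance(bad_first, 0)), round(final_balance(good_first, 0)))

# With withdrawals, early losses leave fewer dollars to enjoy the recovery.
print(round(final_balance(bad_first, 40_000)), round(final_balance(good_first, 40_000)))
```

With no withdrawals both orderings end at about $1,086,750. With a $40,000 annual withdrawal, the losses-first ordering ends near $821,000 and the gains-first ordering near $917,000, even though the average return is identical.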

The Trinity Study and validation

In 1998, three finance professors at Trinity University (Philip Cooley, Carl Hubbard, and Daniel Walz) extended Bengen's analysis in what became known as the Trinity Study. They tested a range of U.S. stock and bond allocations, from 100% stocks to 100% bonds, across several withdrawal rates and payout periods.

Their findings largely validated Bengen:

A 50% stock, 50% bond portfolio supported a 4% inflation-adjusted withdrawal rate in 95% of historical 30-year scenarios
A 100% stock portfolio actually performed slightly worse (94% success rate), not better, because volatility during early retirement years could trigger sequence risk
A 100% bond portfolio failed more often because returns were too low to keep pace with inflation
The balanced portfolio proved robust across different historical periods and inflation regimes

The Trinity Study gave academic credibility to Bengen's practical finding. It became required reading in financial planning programs and is still cited today as the foundational research for the 4% rule.

What the historical data actually told us

Bengen's research was testing something specific: a 60-year historical snapshot of U.S. returns. The data included:

Strong periods: The 1950s and 1980s–1990s delivered excellent returns to balanced portfolios. A 4% withdrawal rate looked conservative in these environments.

Weak periods: The 1930s and 1970s delivered poor returns. The 4% rule was tested hardest in these eras. That it survived meant it had margin for error.

Volatility periods: The 1987 crash, 1990s tech bubble, and other episodes provided tests of behavioral discipline and portfolio resilience.

Inflation periods: The 1970s double-digit inflation forced high withdrawal growth; the rule survived by relying on equity returns to keep pace.

The conclusion: a 4% rule proved robust across a wide range of plausible conditions because the historical period included both very good and very bad environments. If the research had been conducted using only 1950–1995 data (missing the Great Depression), the safe withdrawal rate might have been 5% or higher. If it had used only 1926–1950 data (missing the 1980s–1990s bull market), it might have been 3% or lower.

The 30-year assumption

One critical choice Bengen made was testing 30-year periods. Why 30 years rather than 35 or 25?

Thirty years was chosen because it comfortably exceeded the life expectancy of a typical 65-year-old retiree in the 1990s. A retirement that lasts until age 95 covers the median lifespan plus a reasonable margin for those who live longer.

This choice became embedded in the 4% rule's assumptions. Modern retirees living longer (and early retirees with 40–50 year horizons) are pushing beyond the original 30-year envelope. Some researchers argue the rule should be adjusted downward for longer horizons, while others argue that longer retirement periods are covered by the rule's built-in safety margin.
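To see why the horizon matters, consider a deliberately simplified model: a constant real return and a withdrawal sized to exhaust the portfolio exactly at the end of the horizon. This is not Bengen's historical method, and the 2% real return below is a hypothetical input, but the annuity arithmetic shows the direction of the effect.

```python
def sustainable_rate(real_return, years):
    """Initial withdrawal rate that exactly exhausts the portfolio after
    `years` start-of-year withdrawals, assuming a constant real return."""
    if real_return == 0.0:
        return 1.0 / years
    growth = 1.0 + real_return
    # Present value of `years` start-of-year payments of 1 (an annuity due).
    annuity_due = (1.0 - growth ** -years) / real_return * growth
    return 1.0 / annuity_due


# Hypothetical constant real return of 2% per year.
for years in (30, 40, 50):
    print(f"{years} years: {sustainable_rate(0.02, years):.2%}")
```

Under this assumption the sustainable rate falls as the horizon lengthens, which is the intuition behind calls to adjust the rule downward for 40- to 50-year retirements. Real markets are volatile, so historical or simulated testing remains the more meaningful check.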

Later research and criticism

Bengen's work was groundbreaking, but later researchers added nuance:

Vanguard's analysis (starting in the 2000s) extended the testing to more recent data and included international stocks. They found the 4% rule broadly held for U.S. investors but noted that projections of future returns (based on current market valuations) were lower than historical averages, suggesting a more conservative approach today.

The Shiller analysis (Robert Shiller's cyclically adjusted price-to-earnings ratio) highlighted that withdrawal rates should adjust based on market valuation. When stocks are expensive (high P/E ratios), the safe withdrawal rate might be 3.5%; when they're cheap, it might be 4.5%. This suggested the 4% rule was overly static.

Kitces and Pfau's research (2008) showed that adding taxes, fees, and the realities of market sequence risk reduced safe withdrawal rates to 3.5–3.8% for most real-world investors, compared to the theoretical 4%.

These later studies didn't overturn Bengen's findings—they refined them and highlighted that the 4% rule's real-world application requires adjustments for current conditions.

The role of luck and margin

An important subtlety in Bengen's research is that the 4% rule's success wasn't guaranteed by math alone—it benefited from historical luck combined with built-in margin.

The best outcome for a retiree using the 4% rule: retiring in 1973 into a bear market, taking 4% withdrawals, and then watching the market recover strongly through the 1980s and 1990s. The portfolio grew even while withdrawals increased.

The worst outcome: retiring in 1965 before stagflation, with low real returns and high inflation forcing increasing withdrawals from a stagnant portfolio. The rule survived, but with little margin.

If history had unfolded differently (say, if the inflation of the 1970s had been even more severe, or if the bull market that followed had been weaker), the rule might have required a 3.5% rate instead of 4%.

This mixture of luck and margin means that the 4% rule is robust, but not ironclad. It survived the test of history, but future history might be different.

Real-world examples from the historical data

Retiring in 1929 (Great Depression)

Someone retiring in 1929 had accumulated assets just before the worst crash in U.S. history. Under the 4% rule, a $100,000 portfolio meant a $4,000 first-year withdrawal. Stocks then lost more than 80% of their value by mid-1932, although the bond portion of a balanced portfolio cushioned the fall. The withdrawal amount, however, had already been set from the original balance and moved only with inflation. Over 30 years, as the market recovered, the portfolio survived. This was the second-worst scenario in Bengen's testing (after 1965).

Retiring in 1980 (After stagflation)

Someone retiring after the 1970s had endured low stock returns and high inflation—the worst possible environment for a pre-retiree accumulating wealth. But once retired in 1980, the retiree benefited from the powerful bull market of the 1980s and 1990s. The 4% rule allowed generous withdrawals that were more than covered by market returns. This was one of the best scenarios.

Retiring in 1987 (Before the crash)

Someone retiring in 1987, just before the October 1987 market crash, faced a one-day stock market decline of more than 20%. But the crash was sharp and the recovery quick. Over a full 30-year retirement, the temporary setback was absorbed by the portfolio's long-term recovery.

Common misconceptions

Misconception 1: The 4% rule is based on theory
It isn't. It's based on roughly 70 years of empirical market history. The rule works because the historical data showed it worked, not because of mathematical theory alone.

Misconception 2: The rule predicts future returns
It doesn't. The rule tested historical returns. Future returns might be lower (due to higher valuations) or higher. The rule is a historical observation, not a forecast.

Misconception 3: Bengen "proved" the rule is safe
He showed it worked historically. Surviving past scenarios guarantees nothing about the future, and the commonly quoted 95% success rate still implies failure in about 1 out of 20 scenarios. The rule is very safe, but not foolproof.

Misconception 4: The rule applies equally to all retirement lengths
It doesn't. The rule was tested for 30-year retirements. Longer retirements require adjustments downward.

Misconception 5: The rule hasn't been challenged since 1994
It has been. Later researchers (Vanguard, Morningstar, Pfau) have suggested 3.5–3.8% is more conservative for today's valuations and longer lifespans.

FAQ

Who was William Bengen?

William Bengen is a financial advisor and researcher from Southern California who published his withdrawal rate research in 1994. He was motivated by a client's practical question about safe withdrawal amounts. His paper, "Determining Withdrawal Rates Using Historical Data," became one of the most influential research papers in retirement planning.

Why was 1926 chosen as the start date for the data?

Reliable financial data (prices, dividends, inflation) is limited before 1926. Widely used sources such as the Ibbotson "Stocks, Bonds, Bills, and Inflation" series and the CRSP stock database begin in 1926. This gave Bengen the longest reliable historical period available for his analysis.

What would have happened if Bengen had used only 1950–1995 data?

If he'd tested only the strong postwar period (missing the Great Depression), the safe withdrawal rate might have been 5% or even 5.5%. The inclusion of the Depression era forced a more conservative number (4%) because the rule had to survive that harsh environment.

Did Bengen test international stocks or other asset classes?

No. Bengen's original 1994 study tested U.S. stocks and U.S. bonds only. The Trinity Study added alternative stock and bond allocations, and later research expanded the analysis to international stocks, with generally similar results.

Has the 4% rule been updated based on more recent data?

Partially. Researchers in the 2000s and 2010s noted that current valuations (expensive stocks relative to earnings) suggest future returns will be lower than historical averages. Some recommend 3.5–3.8% for today's environment, while others maintain 4% is still reasonable. The rule hasn't been formally updated, but its application has become more nuanced.

Why is the 1965–1994 period the worst case for the 4% rule?

Because it combined two challenges: first, modest real returns (stocks and bonds didn't deliver strong gains above inflation), and second, high inflation (forcing withdrawals to grow faster than the portfolio). A retiree in this period faced the double bind of needing more money (due to inflation) while the portfolio generated weak returns.

Can I use Bengen's research to justify a higher withdrawal rate?

Not without adjustments. Bengen found that a 4% rate survived the historical record, while a 5% rate failed in the two worst cases. If you want to use a higher rate, you need stronger assumptions about future returns or a shorter retirement horizon to justify it.

Summary

The 4% rule originated from William Bengen's 1994 empirical testing of various withdrawal rates against roughly 70 years of U.S. stock and bond returns (1926–1995). His research showed that a 4% initial withdrawal rate, adjusted annually for inflation, survived the historical 30-year retirement scenarios, including the Great Depression and stagflation. The Trinity Study validated his findings across different asset allocations and popularized the roughly 95% success figure. Later research has added nuance, suggesting adjustments for longer retirements, higher valuations, and real-world costs such as taxes and fees, but the core finding remains: historical data supports a 4% withdrawal rate as a robust starting point for retirement planning.
