Avoiding Survivorship Bias: De-Listed Data is a Must

I have written before about survivorship bias. Frank over at Engineering Returns has developed a survivor-free S&P500 database and further demonstrates the impact of survivorship on a simple RSI2 trading system.

Plainly speaking, anyone who backtests without de-listed data is going to get results that are not accurate.

While I have not tested this idea enough for it to be more than a theory, I suspect that a short-term system which holds stocks for a few days to a week or so may indeed show improvement when using a survivor-free database. (However, Frank’s recent testing shows my theory may be incorrect.) Conversely, systems of the trend-following variety (which usually hold stocks for longer periods of time) seem to suffer more when de-listed data is used. For a good example of this, see my started-but-not-yet-finished series on building a momentum rotational system. I was shocked at how much performance degraded once de-listed data was added, to the extent that it stopped the series in its tracks.

While it should be obvious how survivorship bias will affect trading systems that trade a portfolio of stocks, what may not be as obvious is that it will also affect indicators. Specifically, a breadth indicator, which uses the data from hundreds or thousands of stocks, is going to be affected by survivorship bias. If such an indicator is applied to a trading system which was itself developed without de-listed data, the impact of survivorship bias is compounded.
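To make the breadth point concrete, here is a minimal sketch of a "percent advancing" reading computed twice for the same historical day: once over every stock that traded that day, and once over only the stocks that survive in a current database. The tickers and prices are made up for illustration, and `percent_advancing` is a hypothetical helper, not a function from any data vendor.

```python
def percent_advancing(closes_today, closes_prev):
    """Share of tickers (in percent) whose close rose versus the previous bar."""
    common = [t for t in closes_today if t in closes_prev]
    if not common:
        raise ValueError("no overlapping tickers")
    advancers = sum(1 for t in common if closes_today[t] > closes_prev[t])
    return 100.0 * advancers / len(common)


# Hypothetical closes for one historical day. "DEAD1" and "DEAD2" stand in for
# stocks that were later de-listed, so a survivors-only database lacks them.
prev = {"AAA": 10.0, "BBB": 20.0, "DEAD1": 5.0, "DEAD2": 8.0}
today = {"AAA": 10.5, "BBB": 20.4, "DEAD1": 4.0, "DEAD2": 7.0}
delisted = {"DEAD1", "DEAD2"}

survivors_prev = {t: p for t, p in prev.items() if t not in delisted}
survivors_today = {t: p for t, p in today.items() if t not in delisted}

full_breadth = percent_advancing(today, prev)
survivor_breadth = percent_advancing(survivors_today, survivors_prev)
print(full_breadth, survivor_breadth)
```

In this toy case the survivors-only reading looks far healthier than the full one, simply because the stocks that went on to fail are missing from the count.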

Inevitably, this type of post leads people to ask where I get my data. I use Norgate’s Premium Data, which offers a database of de-listed stocks for a very low one-time fee.

Testing End-of-Month Markup with a Survivor-Free NDX Database

Kick off post: Is the End-of-Month Markup a Myth?

Next post: End-of-Month Markup and the Nasdaq 100

And now we will examine the first caveat of the last post, where I noted that survivorship bias may affect performance.

Indeed, the previous test suffered from survivorship bias.

Frank Hassler from Engineering Returns contacted me and generously offered a survivor-free Nasdaq 100 database with which to test this idea. If you haven’t been checking out Frank’s blog, you are missing out on some fresh ideas as well as the code behind them. I mean really, the ideas are money. The fact that the code is also provided is just icing on the cake. Truly, he is one of the good guys in the whole system-trading/backtesting blogosphere.

Understand that Frank’s database replicates the additions and deletions of stocks on the Nasdaq 100 as they would have actually happened in real time. This is very different from testing over the current 100 stocks of the NDX, as many of these 100 stocks would not have belonged to the index earlier in the decade.
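The idea of point-in-time membership can be sketched in a few lines. This is only an illustration under assumptions: the record layout, the tickers, and the dates below are invented, and I am not describing how Frank's database is actually stored.

```python
from datetime import date

# Hypothetical membership records: (ticker, date added, date removed or None).
MEMBERSHIP = [
    ("AAA", date(1999, 1, 4), None),
    ("BBB", date(2000, 6, 1), date(2003, 12, 22)),
    ("CCC", date(2008, 12, 22), None),
]


def members_on(day, records=MEMBERSHIP):
    """Return the tickers that belonged to the index on the given day."""
    return sorted(
        ticker
        for ticker, added, removed in records
        if added <= day and (removed is None or day < removed)
    )


early = members_on(date(2002, 1, 2))   # BBB is still in, CCC not yet added
late = members_on(date(2010, 1, 4))    # BBB is gone, CCC has joined
print(early, late)
```

A test over "the current 100 stocks" is equivalent to calling `members_on` with today's date for every bar in the backtest, which is exactly where the bias creeps in.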

Do not underestimate the effect of survivorship bias. Many traders read about it but have no way of actually seeing it or understanding the implications.

What is being fleshed out here is a real-life lesson on the impact of survivorship bias on a trading system.

The Rules:

(Same as before but re-posted for clarity)

  • Using the Nasdaq 100, select the 5 stocks with the highest rate-of-change over the last 252 bars (roughly one calendar year) and buy them at the close 5 days before the month’s end. Sell on the close of the 1st day of the next month.
  • I’m testing from 1/1/2001 to 10/8/2010. I’ve not included any commissions or slippage.
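For readers who want to try the rules themselves, here is a rough sketch in plain Python. It is not the code used for this test. I am assuming that "5 days before the month's end" means five trading bars before the last bar of the month, that `closes` maps each ticker to a dict of date-to-close, and that `members_on` is a hypothetical function returning the index members on a given date. A stock that stops trading before the exit day is closed out at its last available close rather than dropped.

```python
from datetime import date


def month_end_trades(dates, closes, members_on, lookback=252, top_n=5, days_before=5):
    """dates: sorted trading dates; closes: {ticker: {date: close}}.

    Returns a list of (ticker, entry date, exit date, return) tuples.
    """
    trades = []
    for i in range(len(dates) - 1):
        this_bar, next_bar = dates[i], dates[i + 1]
        if (this_bar.year, this_bar.month) == (next_bar.year, next_bar.month):
            continue  # not the last trading day of a month
        entry_i, exit_i = i - days_before, i + 1
        if entry_i - lookback < 0:
            continue  # not enough history for the rate-of-change
        entry, past = dates[entry_i], dates[entry_i - lookback]
        roc = {}
        for ticker in members_on(entry):
            px = closes.get(ticker, {})
            if entry in px and past in px:
                roc[ticker] = px[entry] / px[past] - 1.0
        for ticker in sorted(roc, key=roc.get, reverse=True)[:top_n]:
            px = closes[ticker]
            # Exit at the scheduled close, or at the last close on record if
            # the stock stopped trading first (so the loser is not dropped).
            exit_day = next(d for d in reversed(dates[entry_i:exit_i + 1]) if d in px)
            trades.append((ticker, entry, exit_day, px[exit_day] / px[entry] - 1.0))
    return trades


# Tiny made-up example with shortened parameters so it fits in five bars.
dates = [date(2001, 1, 29), date(2001, 1, 30), date(2001, 1, 31),
         date(2001, 2, 1), date(2001, 2, 2)]
closes = {
    "AAA": dict(zip(dates, [10.0, 11.0, 11.0, 12.1, 12.0])),
    "BBB": dict(zip(dates, [10.0, 10.5, 10.4, 10.5, 10.6])),
}
trades = month_end_trades(dates, closes, lambda day: ["AAA", "BBB"],
                          lookback=1, top_n=1, days_before=1)
print(trades)
```

The important design choice is that the candidate list comes from `members_on(entry)` and not from today's index. Swap in a function that always returns the current constituents and you have reproduced the biased version of the test.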

The Results:

Summary:

Compare the original results, obtained without a survivor-free database, to the results above.

There is quite a difference, and it is due to survivorship bias.

If we were to include commissions, the system would be only marginally profitable, while the original results looked very promising with an annual return of 15.80%.

The good news is that now we can accurately answer Steve Place’s original question: Is the month-end markup on the Nasdaq 100 a myth?

Stay tuned, there is more to come.

Read the next post in this series: Which is the Best Day of the Month to Buy the QQQQ?
