More Thinking About Factors

Woodshedder Sat Jan 30, 2010 5:46pm EST 13 Comments

I’m going to ask for some input from the blogosphere on this issue.

I am a firm believer that a simple system is more robust than a more complex system. This means that a simple system is less likely to be curve-fit and more likely to survive changing market regimes without a major breakdown.

What I’m struggling with is touched on briefly in the previous post about a 4 Factor ETF System.

As I think more about factors, I think it may be important to make a distinction about the number of inputs required to derive the factor.

For example, if we use Rate of Change to rank ETFs, then we have one factor (Rank) with one input (n periods):

Rank=ROC(n periods).
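
As a rough illustration of the one-factor, one-input case, here is a minimal Python sketch. The tickers and prices are made up, and the helper names (roc, rank_by_roc) are hypothetical; ROC is assumed to be the simple percentage change of the latest close versus the close n periods earlier.

```python
def roc(closes, n):
    """Rate of change: latest close relative to the close n periods earlier."""
    if len(closes) <= n:
        raise ValueError("need more than n closes")
    return closes[-1] / closes[-1 - n] - 1.0


def rank_by_roc(prices, n):
    """Return tickers ordered from strongest to weakest ROC."""
    scores = {ticker: roc(closes, n) for ticker, closes in prices.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical closing prices, oldest first (illustrative numbers only).
prices = {
    "AAA": [100.0, 102.0, 105.0, 110.0],
    "BBB": [50.0, 49.0, 51.0, 50.5],
    "CCC": [20.0, 21.0, 19.0, 18.0],
}

ranking = rank_by_roc(prices, n=3)
print(ranking)
```

The only thing to choose here is n, which is the sense in which this factor has a single input.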

What if we instead use two differing ROCs, and weight them? That would look like this:

Rank=weight1*ROC(n1 periods)+weight2*ROC(n2 periods)

While “Rank” is still the single factor, we now have multiple inputs required to make this factor.
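
To make the multiple-input case concrete, here is a minimal sketch that assumes the weights multiply each ROC value (one reading of the formula above). The lookbacks, weights, tickers, and prices are all made up, and the function names are hypothetical.

```python
def roc(closes, n):
    """Rate of change: latest close relative to the close n periods earlier."""
    return closes[-1] / closes[-1 - n] - 1.0


def composite_score(closes, n_long, n_short, w_long, w_short):
    """Weighted sum of a longer and a shorter ROC (four inputs, one score)."""
    return w_long * roc(closes, n_long) + w_short * roc(closes, n_short)


def rank_by_composite(prices, n_long, n_short, w_long, w_short):
    """Return tickers ordered from highest to lowest composite score."""
    scores = {
        ticker: composite_score(closes, n_long, n_short, w_long, w_short)
        for ticker, closes in prices.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical closing prices, oldest first (illustrative numbers only).
prices = {
    "AAA": [100.0, 101.0, 102.0, 103.0, 104.0],
    "BBB": [100.0, 104.0, 108.0, 110.0, 106.0],
    "CCC": [100.0, 98.0, 96.0, 95.0, 99.0],
}

blended = rank_by_composite(prices, n_long=4, n_short=1, w_long=0.5, w_short=0.5)
long_only = rank_by_composite(prices, n_long=4, n_short=1, w_long=1.0, w_short=0.0)
print(blended)
print(long_only)
```

The output is still a single ranking, but it now depends on two lookbacks and two weights, and changing the weights alone reorders the list in this toy data.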

What I’m pondering is this: Are inputs equal to factors?


13 comments

Cuervo's Laugh
January 30, 2010 at 10:03 pm

Best way to figure that out is to run some simulations.
I suggest Genetic Algorithms to sort out the possible search space because your use of “n periods” denotes a vagueness as to which one is going to have the most impact.
- Woodshedder
  January 31, 2010 at 3:55 pm
  
  That’s just it though Cuervo. I don’t want to optimize it. I want to be able to choose inputs that make sense and equate to market time frames like quarters, half-years, months, etc. For example, maybe the longer term measure is 90 trading days (1/4 year) and the shorter term measure is 22 days (1 month).
  
  When I do optimize these types of things, I just take the average of the top ten runs, and plug that in.
  
  The key for me is to keep this simple and robust.
david
January 31, 2010 at 2:58 am

well this is a sort of non-linear equation. an issue with many separate factors is that you tend to winnow the observation set down in the process of filtering. an example would be requiring interest rates to be 50 day moving average but also today is not a 20 day high. this is likely a rare occurrence, and may or may not generalize out of sample.

whereas with a weighted roc you are not losing observations since nothing is mutually exclusive. however there is also another issue–if you are using a roc number vs a roc rank they are very different. a roc number is highly specific, whereas a rank is not. optimizing based on a roc number will lead to very specific and likely curve-fit results. doing the same process with a rank (especially say quantiles or quartiles), is very useful and works well out of sample. however the same thing can be accomplished with the ROC return as long as the final results are ranked. thus you are not requiring the equation to be greater or less than a specific number, but rather reflective of a distribution of numbers.
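
A minimal sketch of the difference between a fixed ROC threshold and a rank, using made-up ROC values for two hypothetical market regimes. The function names, the 5% threshold, and the top-half cutoff are assumptions chosen only for illustration.

```python
def select_by_threshold(rocs, threshold):
    """Keep tickers whose raw ROC exceeds a fixed number."""
    return sorted(t for t, r in rocs.items() if r > threshold)


def select_top_fraction(rocs, fraction):
    """Keep the top fraction of tickers by ROC rank, whatever the ROC levels are."""
    k = max(1, int(len(rocs) * fraction))
    ordered = sorted(rocs, key=rocs.get, reverse=True)
    return sorted(ordered[:k])


# Hypothetical ROC readings for the same four tickers in two regimes.
bull = {"AAA": 0.12, "BBB": 0.08, "CCC": 0.06, "DDD": 0.02}
bear = {"AAA": 0.03, "BBB": -0.01, "CCC": -0.04, "DDD": -0.09}

print(select_by_threshold(bull, 0.05), select_by_threshold(bear, 0.05))
print(select_top_fraction(bull, 0.5), select_top_fraction(bear, 0.5))
```

In this toy data the fixed threshold picks three tickers in one regime and none in the other, while the rank-based rule always picks the top half.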

best
dv
Woodshedder
January 31, 2010 at 4:02 pm

DV, perfect. Thank you.

I will be ranking everything. There will not be a specific threshold.

So then in this case, inputs would NOT be equal to factors? I think…
Aristotle
February 1, 2010 at 10:02 am

Here’s my take – inputs do not equal factors. Suppose I built a model with three “factors” but each was a slightly different version of ROC or some other measurement. I wouldn’t have the robustness I was looking for.
Slightly different versions of one factor would be best combined into a single factor and not considered as separate factors.
MikeyTrades
February 2, 2010 at 1:37 pm

Personally, I don’t consider inputs equivalent to factors in the sense that you are using them here.

Let’s say I decide to use the 20 EMA as a trend indicator. I wouldn’t consider that indicator as having 20 factors because it has 20 inputs (or even 40 since each closing value is multiplied by a value based on its recency). Or what about the MACD – would you consider the MACD as one factor (the indicator) or two (the indicators that make up the compound indicator)? Or along the same lines, what if you create your own proprietary indicator that has 200 inputs, is that one factor or 200? Or some other combination?
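
A minimal sketch of the EMA example, assuming the common smoothing constant alpha = 2 / (n + 1) and seeding with the first close (both are conventions, and other seedings exist). The function name and the prices are hypothetical.

```python
def ema(closes, n):
    """Exponential moving average with alpha = 2 / (n + 1), seeded with the first close."""
    alpha = 2.0 / (n + 1)
    value = closes[0]
    for close in closes[1:]:
        # Each new close gets weight alpha; older closes decay by (1 - alpha) per step.
        value = alpha * close + (1.0 - alpha) * value
    return value


# Hypothetical closes, oldest first (illustrative numbers only).
closes = [100.0 + 0.5 * i for i in range(40)]
trend_up = closes[-1] > ema(closes, n=20)
print(trend_up)
```

Many closes and many weights go into the calculation, but the weights all follow from the single choice of n.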

In the end, I think you need to look at why you are trying to limit the factors and use that as your guide. If it is for robustness across all mkt types then your decision may be different than if you are trying to limit the complexity from an implementation standpoint. It comes down to what your objectives are for the system.
- Woodshedder
  February 3, 2010 at 12:39 am
  
  Mikey, really good points. Your writing about inputs helped solidify the difference for me. Also, enjoyed your thoughts re: objectives for the system.
  
  Complexity in terms of implementation is not a concern. I want the system to be robust across markets.
Graves
February 2, 2010 at 3:35 pm

Hey wood!
Using your definition for factors i don’t think inputs or weights should be equal to factors. In each factor domain you should do your best to find the most robust method to reach your goal. If you need to use more inputs and more weights, so be it (i would agree you shouldn’t overdo it).

BTW – I thought about your 4 factor post, and i think you need a 5th factor -> What to trade.
If you are building a system you want to survive a long time you can’t arbitrarily choose what to trade. You need some kind of rule – even if it’s as simple as “most liquid” or if you’ll change it in the future.
- Woodshedder
  February 3, 2010 at 12:36 am
  
  Graves, agreed on the 5th factor. Thanks for your thoughts!
Jeff
February 2, 2010 at 4:38 pm

In the extreme, if we remove all factors to a singularity, we will only look at the current bar on its own and without context to predict the next. 😉
Mathew
February 18, 2012 at 9:34 pm

Very good post, I truly look ahead to updates from you.
