### System Trading with Woodshedder

Joined Nov 11, 2007
1,458 Blog Posts

# More Thinking About Factors

I’m going to ask for some input from the blogosphere with this issue.

I am a firm believer that a simple system is more robust than a more complex system. This means that a simple system is less likely to be curve-fit and more likely to survive changing market regimes without a major breakdown.

What I am struggling with is touched on briefly in the previous post about a 4 Factor ETF System.

As I think more about factors, I think it may be important to make a distinction about the number of inputs required to derive the factor.

For example, if we use Rate of Change to rank ETFs, then we have one factor (Rank) with one input (n periods):

Rank=ROC(n periods).

What if we instead use two differing ROCs, and weight them? That would look like this:

Rank=weight1*ROC(n1 periods)+weight2*ROC(n2 periods)

While “Rank” is still the single factor, we now have multiple inputs required to make this factor.
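To make the single-factor, multiple-input idea concrete, here is a minimal sketch of a blended ROC rank. The tickers, prices, lookbacks, and equal weights are all made up for illustration, and the function names are hypothetical rather than taken from any trading library.

```python
def roc(prices, n):
    """Rate of change over the last n periods, as a fraction (0.10 = 10%)."""
    return prices[-1] / prices[-1 - n] - 1.0


def weighted_roc(prices, lookbacks_and_weights):
    """Blend several ROC lookbacks into one score."""
    return sum(weight * roc(prices, n) for n, weight in lookbacks_and_weights)


def rank_symbols(price_history, lookbacks_and_weights):
    """Return {symbol: rank}, where rank 1 is the highest blended ROC."""
    scores = {
        symbol: weighted_roc(prices, lookbacks_and_weights)
        for symbol, prices in price_history.items()
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {symbol: position + 1 for position, symbol in enumerate(ordered)}


# Hypothetical closing prices, oldest first.
price_history = {
    "AAA": [100, 101, 102, 103, 110],
    "BBB": [100, 100, 100, 100, 100],
    "CCC": [100, 98, 96, 95, 90],
}

# Toy lookbacks of 4 and 2 bars, equally weighted (assumed values).
settings = [(4, 0.5), (2, 0.5)]
ranks = rank_symbols(price_history, settings)
print(ranks)  # {'AAA': 1, 'BBB': 2, 'CCC': 3}
```

However many (lookback, weight) pairs go into `settings`, the output is still one rank per symbol, which is the distinction between inputs and factors being asked about here.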

What I’m pondering is this: Are inputs equal to factors?


1. Best way to figure that out is to run some simulations.
I suggest Genetic Algorithms to sort out the possible search space because your use of “n periods” denotes a vagueness as to which one is going to have the most impact.

• That’s just it though Cuervo. I don’t want to optimize it. I want to be able to choose inputs that make sense and equate to market time frames like quarters, half-years, months, etc. For example, maybe the longer term measure is 90 trading days (1/4 year) and the shorter term measure is 22 days (1 month).

When I do optimize these types of things, I just take the average of the top ten runs, and plug that in.

The key for me is to keep this simple and robust.
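One possible reading of "take the average of the top ten runs" is to sort the optimization runs by their score and average the parameter value across the best ones. The sketch below assumes that reading; the run records, the `score` field, and the `lookback` field are hypothetical.

```python
from statistics import mean


def average_top_runs(runs, top_n=10):
    """Average the lookback parameter over the top_n highest-scoring runs."""
    best = sorted(runs, key=lambda run: run["score"], reverse=True)[:top_n]
    return mean(run["lookback"] for run in best)


# Hypothetical optimization output: one record per tested lookback.
# The score here is an invented curve that peaks at a lookback of 50.
runs = [{"lookback": n, "score": -((n - 50) ** 2)} for n in range(10, 100, 10)]

chosen_lookback = average_top_runs(runs, top_n=3)
print(chosen_lookback)  # 50
```

Averaging over a cluster of good runs, rather than taking the single best one, leans toward a parameter region instead of a lone peak.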

2. well this is a sort of non-linear equation, an issue with many separate factors is that you tend to winnow the observation set down in the process of filtering. an example would be requiring interest rates to be 50 day moving average but also today is not a 20 day high. this is likely a rare occurrence and may or may not generalize out of sample.

whereas with a weighted roc you are not losing observations since nothing is mutually exclusive. however there is also another issue–if you are using a roc number vs a roc rank they are very different. a roc number is highly specific, whereas a rank is not. optimizing based on a roc number will lead to very specific and likely curve-fit results. doing the same process with a rank (especially say quantiles or quartiles), is very useful and works well out of sample. however the same thing can be accomplished with the ROC return as long as the final results are ranked. thus you are not requiring the equation to be greater or less than a specific number, but rather reflective of a distribution of numbers.

best
dv
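The contrast between a specific ROC number and a rank can be sketched as follows. The symbols and ROC values are invented, the 5% threshold is an arbitrary assumption, and the "bear" regime is simply the same scores shifted down by 20 points.

```python
def quantile_bucket(scores, buckets=4):
    """Map each symbol to a bucket 1..buckets, where 1 holds the highest scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    count = len(ordered)
    return {sym: (i * buckets) // count + 1 for i, sym in enumerate(ordered)}


def top_bucket(scores, buckets=4):
    """Symbols that land in the top quantile bucket, sorted by name."""
    assigned = quantile_bucket(scores, buckets)
    return sorted(sym for sym, bucket in assigned.items() if bucket == 1)


def above_threshold(scores, threshold):
    """Symbols whose raw ROC exceeds a fixed number, sorted by name."""
    return sorted(sym for sym, value in scores.items() if value > threshold)


# Hypothetical ROC readings in a rising market.
bull = {"A": 0.12, "B": 0.09, "C": 0.07, "D": 0.04,
        "E": 0.02, "F": 0.00, "G": -0.03, "H": -0.06}
# Same cross-section in a falling market: every reading shifted down.
bear = {sym: value - 0.20 for sym, value in bull.items()}

print(above_threshold(bull, 0.05), above_threshold(bear, 0.05))
print(top_bucket(bull), top_bucket(bear))
```

The fixed threshold picks three symbols in the first regime and none in the second, while the top quartile holds the same two symbols in both, because the rank depends only on the distribution of readings and not on their level.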

3. DV, perfect. Thank you.

I will be ranking everything. There will not be a specific threshold.

So then in this case, inputs would NOT be equal to factors? I think…

4. Here’s my take – inputs do not equal factors. Suppose I built a model with three “factors” but each was a slightly different version of ROC or some other measurement. I wouldn’t have the robustness I was looking for.
Slightly different versions of one factor would be best combined into a single factor and not considered as separate factors.

5. Personally, I don’t consider inputs equivalent to factors in the sense that you are using them here.

Let’s say I decide to use the 20 EMA as a trend indicator. I wouldn’t consider that indicator as having 20 factors because it has 20 inputs (or even 40 since each closing value is multiplied by a value based on its recency). Or what about the MACD – would you consider the MACD as one factor (the indicator) or two (the indicators that make up the compound indicator)? Or along the same lines, what if you create your own proprietary indicator that has 200 inputs, is that one factor or 200? Or some other combination?

In the end, I think you need to look at why you are trying to limit the factors and use that as your guide. If it is for robustness across all mkt types then your decision may be different than if you are trying to limit the complexity from an implementation standpoint. It comes down to what your objectives are for the system.
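The EMA example above can be illustrated with a short sketch: the indicator consumes many closing prices but exposes only one tunable setting. The smoothing constant 2 / (span + 1) is the common convention; seeding the average with the first price is an assumption, and the closes are made up.

```python
def ema(prices, span):
    """Exponential moving average, seeded with the first price (an assumption)."""
    alpha = 2.0 / (span + 1)  # weight given to the newest price
    value = prices[0]
    for price in prices[1:]:
        value = alpha * price + (1.0 - alpha) * value
    return value


closes = [100 + i for i in range(60)]  # hypothetical closes, oldest first
trend_20 = ema(closes, span=20)
print(len(closes), "prices in, 1 tunable parameter, 1 value out:", trend_20)
```

Every close contributes to the result, yet the only choice that could be curve-fit is `span`, which is one way to think about why the number of data inputs and the number of factors are not the same thing.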

• Mikey, really good points. Your writing about inputs helped solidify the difference for me. Also, enjoyed your thoughts re: objectives for the system.

Complexity in terms of implementation is not a concern. I want the system to be robust across markets.

6. Hey wood!
Using your definition for factors i don’t think inputs or weights should be equal to factors. In each factor domain you should do your best to find the most robust method to reach your goal. If you need to use more inputs and more weights, so be it (i would agree you shouldn’t overdo it).

BTW – I thought about your 4 factor post, and i think you need a 5th factor -> What to trade.
If you are building a system you want to survive a long time you can’t arbitrarily choose what to trade. You need some kind of rule – even if it’s as simple as “most liquid” or if you’ll change it in the future.

• Graves, agreed on the 5th factor. Thanks for your thoughts!

7. In the extreme, if we remove all factors to a singularity, we will only look at the current bar on its own and without context to predict the next. 😉

8. Very good post, I truly look ahead to updates from you.
