The Siren of Statistics

The sirens were mythological beings who lured sailors with enchanting music to shipwreck on the rocky coast of their island. Their songs were almost impossible to resist. More generally, a “siren” is something harmful that we are attracted to, either physically or psychologically.

For investors, an example of a siren’s song is simplicity. Many investors look for just a few metrics to evaluate a fund or a strategy. This is a problem. It is a problem for me because my answers to reasonable-sounding questions will be incomplete. “What is your Sharpe ratio?” sounds reasonable enough. But the answer on its own is close to meaningless without much more detailed elaboration. What is the sampling error around the point estimate? How constant has it been across sub-periods? How much is it distorted by the shape of the full return distribution?
But, more importantly, it is a problem for the investor because the simple answer gives a false sense of certainty.
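
To make the sampling-error question concrete, here is a minimal sketch of a rough confidence interval around a Sharpe ratio. It assumes returns are independent and normally distributed, which real returns rarely are, and it uses the common large-sample approximation SE ≈ sqrt((1 + SR²/2) / n). The function name and the example track record are hypothetical.

```python
import math

def sharpe_standard_error(sharpe, n_periods):
    """Approximate standard error of a per-period Sharpe ratio.

    Assumes independent, normally distributed returns. Fat tails or
    autocorrelation can change the true uncertainty materially.
    """
    return math.sqrt((1.0 + 0.5 * sharpe ** 2) / n_periods)

# Hypothetical track record: a monthly Sharpe ratio of 0.3 over 36 months.
monthly_sharpe = 0.3
n_months = 36

se = sharpe_standard_error(monthly_sharpe, n_months)
lower = monthly_sharpe - 1.96 * se
upper = monthly_sharpe + 1.96 * se
print(f"Sharpe {monthly_sharpe:.2f}, rough 95% interval [{lower:.2f}, {upper:.2f}]")
```

With these made-up inputs the interval includes zero, so even under friendly assumptions three years of monthly data cannot rule out a strategy with no edge at all.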

Is the Sharpe ratio meaningless? No. Far from it. It measures exactly what it says it does: the ratio of excess returns to volatility. But the two parts of the ratio are incomplete measures of profitability and risk respectively, both have estimation errors, and a ratio of uncertain variables has statistical issues of its own. Clearly the two return streams drawn below are far from the same, but the Sharpe ratio cannot tell them apart.
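
The same effect is easy to reproduce with made-up numbers. The sketch below is not the data behind the chart; it builds two hypothetical streams of monthly excess returns, where the second is the first mirrored around its 0.5% mean. Mirroring leaves the mean and the standard deviation unchanged, so the Sharpe ratios match, but the tail flips from one large loss to one large gain.

```python
import statistics

def sharpe_ratio(excess_returns):
    """Per-period Sharpe ratio: mean excess return over sample standard deviation."""
    return statistics.mean(excess_returns) / statistics.stdev(excess_returns)

# Hypothetical stream A: eleven steady 1% months, then one 5% loss.
stream_a = [0.01] * 11 + [-0.05]

# Hypothetical stream B: A mirrored around its 0.5% mean,
# giving eleven flat months, then one 6% gain.
stream_b = [0.0] * 11 + [0.06]

for name, stream in (("A", stream_a), ("B", stream_b)):
    print(f"{name}: Sharpe {sharpe_ratio(stream):.3f}, "
          f"worst month {min(stream):+.1%}, best month {max(stream):+.1%}")
```

Both streams report the same ratio, yet an investor in A has to sit through a 5% loss in a single month while an investor in B never has a down month.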

To be clear, the problem is not due to statistical and mathematical issues. The problem is our desire for a simple answer to a complex question.

Here is another financial example with no mathematical complexity at all: net worth. Net worth is defined as the value of a person’s assets minus her liabilities. Let’s consider two people.
Tom is 65 and recently retired. He is single with no dependents. He owns the house he lives in, and it is fully paid off. He has no car, no credit card debt and no other loans. The house is worth $400k and he also has $800k in retirement savings. His net worth is $1.2 million.

Rachel is 36, married with two children, and works as a programmer. Her husband is an engineer, but he is currently at home looking after the kids. She and her husband own a house worth $1.2 million, on which they have a $600k mortgage. She has a car worth $50k and still owes $30k on it. Between the two of them, they have retirement savings of $600k, and an early investment in a friend’s startup has given them a $200k windfall profit. They have $50k in a bank account, $250k in student loans and another $20k of credit card debt. Their net worth is $1.2 million.
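
For anyone who wants to check the arithmetic, the two balance sheets can be tallied in a few lines. This is only a sketch: the dollar figures come from the descriptions above, while the dictionary labels and the debt-to-assets comparison are added here for illustration.

```python
def net_worth(assets, liabilities):
    """Total assets minus total liabilities."""
    return sum(assets.values()) - sum(liabilities.values())

tom_assets = {"house": 400_000, "retirement savings": 800_000}
tom_liabilities = {}

rachel_assets = {
    "house": 1_200_000,
    "car": 50_000,
    "retirement savings": 600_000,
    "startup windfall": 200_000,
    "bank account": 50_000,
}
rachel_liabilities = {
    "mortgage": 600_000,
    "car loan": 30_000,
    "student loans": 250_000,
    "credit card debt": 20_000,
}

households = (
    ("Tom", tom_assets, tom_liabilities),
    ("Rachel", rachel_assets, rachel_liabilities),
)
for name, assets, liabilities in households:
    total_assets = sum(assets.values())
    total_debt = sum(liabilities.values())
    print(f"{name}: net worth ${net_worth(assets, liabilities):,}, "
          f"debt is {total_debt / total_assets:.0%} of assets")
```

The bottom line is identical, but Rachel’s household reaches it with $900k of debt against $2.1 million of assets, while Tom owes nothing.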

Here are two households with equal net worths. Are they in the same financial situation? No. Not even close. And there is still a lot more we don’t know: incomes, expenditures, medical issues or future employment prospects. The number is almost meaningless. Practically all single measures of anything complex have troubles like this. To expect otherwise would be like an aeroplane pilot wanting only one gauge.

I use a lot of different measures to quantify risk. A non-exhaustive list is: volatility, downside volatility, drawdown, Sharpe, Sortino, margin utilization, PL persistence, average PL, PL/theoretical PL, delta, gamma, vega, theta and duration. All have weaknesses. Each metric on its own tells you only one specific thing, because it measures only one particular thing, but together they tell me a lot.
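
As a rough illustration of reading several gauges at once, the sketch below computes five of the listed measures from a single series of returns. The return series is invented, and the definitions are common textbook versions (per-period, not annualized, with a downside target of zero), not necessarily the exact ones I use.

```python
import math
import statistics

def volatility(returns):
    """Sample standard deviation of returns."""
    return statistics.stdev(returns)

def downside_volatility(returns, target=0.0):
    """Root mean square of shortfalls below the target; gains count as zero."""
    shortfalls = [min(r - target, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))

def max_drawdown(returns):
    """Largest peak-to-trough fall of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def sharpe(returns):
    """Mean excess return over volatility."""
    return statistics.mean(returns) / volatility(returns)

def sortino(returns, target=0.0):
    """Mean return above target over downside volatility.

    Undefined (division by zero) if no return falls below the target.
    """
    return (statistics.mean(returns) - target) / downside_volatility(returns, target)

# Hypothetical monthly excess returns.
monthly_excess_returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.02, -0.02, 0.03]

gauges = {
    "volatility": volatility(monthly_excess_returns),
    "downside volatility": downside_volatility(monthly_excess_returns),
    "max drawdown": max_drawdown(monthly_excess_returns),
    "Sharpe": sharpe(monthly_excess_returns),
    "Sortino": sortino(monthly_excess_returns),
}
for name, value in gauges.items():
    print(f"{name:>20}: {value:.4f}")
```

None of the five numbers is the answer on its own. The point of printing them side by side is that each one covers a blind spot in the others.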

When you try to look across all metrics at once and come up with a single, composite ranking… well, let’s just say it’s tricky. A composite that is not designed well can lead you astray. One attempt to do this was VaR (value at risk), but twenty years after its development people still misunderstand it, misuse it and over-rely on it.

If you are a full-time investor, you can take my approach and spend all day looking at a myriad of numbers, but there is also a cheat. It is certainly sub-optimal, but it will tell you a lot about any investment.

Look at a graph of cumulative PL. It should start low at the left and be high at the right. And do the wiggles look tolerable? Statisticians hate this approach, deriding it as “chi by eye”. But it is very effective. In fact, I would bet that for 95% of prospective investments, running through every known metric will lead you to the same conclusion as the eye test.
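
If you have a PL series but no charting tool to hand, a crude text chart is enough for the eye test. This is a minimal sketch with invented daily PL figures; any plotting library would do the job better.

```python
from itertools import accumulate

# Hypothetical daily PL in dollars.
daily_pl = [120, -40, 80, 60, -150, 90, 110, -30, 70, 100]

# Running total: the cumulative PL after each day.
cumulative_pl = list(accumulate(daily_pl))

low, high = min(cumulative_pl), max(cumulative_pl)
span = (high - low) or 1  # avoid dividing by zero on a flat series
width = 40                # longest bar, in characters

for day, value in enumerate(cumulative_pl, start=1):
    bar = "#" * (1 + round((value - low) / span * (width - 1)))
    print(f"day {day:2d} {value:6d} {bar}")
```

Read from top to bottom, the bars should get longer, and the dips along the way are the wiggles you have to decide whether you can live with.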

Run the numbers, but start by just looking. The pilot has a lot of gauges, but he still looks out the window.