Aug 8, 2007: On the complexity of the Mean Variance Optimization (MVO) problem & its practical solutions
Contents:
Complexity of Mean Variance Optimization (MVO) Using a fast greedy algorithm to quickly solve MVO Using Simulated Annealing to better solve MVO More algorithm options On the complexity of MVO
Consider the task of picking, say, 14 ETFs out of the top 40 ETFs. This task which sounds so innocent turns out to be a hard computational problem since as we know from our combinatorics classes, the number of different ways to pick K items out of N items can be very large:
Where K=14 out of N=40 gives over 23 billion possibilities:
C(40,14) = (40!/(14!*(40-14)!)) = 23,206,929,840Our portfolio composition problem is even harder because in order to be able to tell which of any two portfolios is better, we need to compute the overall portfolio mean correlation for each and every portfolio we compare. More significantly: the optimal solution may not even be in the "14 out of 40" space, it may be a result of picking 10 out of the top 30 candidates, or whatever.
The solution lies in a compromise: we want to find a solution that is good enough and likely close enough to the optimal solution without having to go over all possible solutions.
Using a fast greedy algorithm to quickly solve MVO
One of the algorithms I implemented to find a near optimal solution in a large space really fast is a Greedy algorithm (wikipedia link) implemented this way:
- Pick the best (top ranking) candidate for the portfolio
- As long as we have less than N components:
- Pick the component that is the least correlated with all the portfolio (assembled so far) combined.
- Add it to the portfolio
This algorithm is very fast for portfolio sizes of 6-20 out of 20-50 or so candidates. It takes less than 2 seconds to find a near optimal portfolio of 14 components out of 45 on my dual-core Opteron(tm) desktop using an interpreted language (perl). It is simple and straight forward, and it is easy to run it serially over NxM combinations of N and M (where N < M) and pick the best solution among all runs.
Using Simulated Annealing to better solve MVO
An even better algorithm, which takes longer, but gets closer to an optimal solution in most cases is Simulated Annealing (wikipedia link). Using the simulated annealing algorithm with some little clever adjustments, I'm able to converge on a pretty good solution (usually better than the greedy algorithm version) with a low mean correlation in about 5-10 seconds.The idea of simulated annealing is to look for a global maximum in a terrain, by starting with some arbitrary solution, then wandering around trying to find better neighboring solutions. To pick a "neighboring" solution we apply one small change to the portfolio: e.g. add, remove, or replace one ETF. To decide if it is better we evaluate the new solution in terms of fitness (goodness). In our case better means a lower mean-correlation between components.
We always keep a record of the best solution found so far.
If a new solution is better, we switch to it and continue the loop looking for the next neighbor.
If the neighbor portfolio is not better, we randomly, with some (low) probability switch to the neighbor anyway.
Finally, we gradually lower the probability of switching to a random neighbor the closer we get to the end.
The occasional random jump is designed to avoid getting stuck in a local maximum, i.e. to keep searching for the global maximum.
The ending condition can be either of
- We've found a solution which we estimate to be "good enough" or
- A predetermined amount of time has passed or
- A predetermined number of iterations have been completed
There's more to it. For example:
and a few more small details each of which has actually made a difference in the speed of converging on a good solution fast, in my implementation.
- How to cleverly pick a neighbor which increases the chances of finding a better solution, e.g. if we just pick new candidate ETF at random, the chances for improvement aren't high but if we bias the new pick towards less correlated ETFs in general, we can speed the progress dramatically.
- How fast to lower the chance of a random transition (the annealing schedule)
- How to randomize the transitions (what probability to assign to the random jumps)
- Whether to do periodic restarts. i.e. revert back to the best solution found so far, whenever we wander around with no progress for a while
Acknowledgement: I would like to thank Dror Sneh and Gideon Intrater for suggesting better algorithms and Simulated Annealing to me during the last 4th of July weekend, when I described the problem to them. (It sounds so obvious in hindsight.)
Note: there are several machine-learning optimization techniques that should converge much faster (e.g. see: Gradient Descent, Quasi-Newton methods, SR1 method, BFGS ) but they require a differentiable function (being able to know the gradient at each solution) so they aren't applicable in this case where there is a very large number of local maxima, and the target function is not continuous/differentiable.
More algorithm options
To make my implementation more flexible I added two options to both algorithms:
- An option (-r C) to forbid any pair of ETFs to have a pair-wise correlation greater than C. This produces smaller portfolios since after some additions often it is impossible to find any one candidate that doesn't correlate with any of the already selected members. It also greatly speeds-up the solution because of the correlation pruning, and figures out which ETFs are very similar to each other so it doesn't make much sense to include more than one of them.
- An option (-s) to optimize for Sharpe-ratio of the whole portfolio instead of minimum mean correlation. This essentially selects the best portfolio on a risk-adjusted basis on the Markowitz Efficient frontier (wikipedia link). It is a slightly riskier alternative to minimum correlation because past returns may be less forward looking (have a tendency to repeat) than just correlations.
Here's an example of a Simulated Annealing run trying to pick the best Sharpe-ratio sub-portfolio out of 45 candidates, in 20011 iterations. A precondition (-r 0.85) requires than no two ETFs in the portfolio will have a higher correlation than 0.85.
In this example run, the algorithm progressively finds better and better solutions (17 times). The best solution in a row is found in iteration 4003 of 20011.
The result is a strong 7-ETF portfolio with nice weekly Sharpe-ratio of 0.22 and a low mean internal correlation of 0.72. In back-testing it shows a return of 8.06% in the past 18 weeks (a bit over 4 months), including the recent (July-Aug 2007) correction. The assumption is that it is likely to continue the strong performance with low risk, in the coming months:
$ etf-cor -alps -R 5 -r 0.85 14 18w 0w 1w DIA PRFN JKF JKI TTH JKD DFE DKA VOX JKG DLS EWK RPV ISI EWG DEB PRFE DEW JKJ VDE FEZ EFV DBN UTH VGK ADRD RFV EZU DTH DWM IXP VTV ADRU JKH IEV PID PXE EFA IXC DOO MXI JKL EWY VO EWQ Annealing: 1/20011 [best #1 found @ 1] 0.1655 Annealing: 3/20011 [best #2 found @ 3] 0.1850 Annealing: 4/20011 [best #3 found @ 4] 0.1856 Annealing: 5/20011 [best #4 found @ 5] 0.2009 Annealing: 7/20011 [best #5 found @ 7] 0.2535 Annealing: 8/20011 [best #6 found @ 8] 0.2874 Annealing: 10/20011 [best #7 found @ 10] 0.2967 Annealing: 11/20011 [best #8 found @ 11] 0.3046 Annealing: 12/20011 [best #9 found @ 12] 0.3287 Annealing: 18/20011 [best #10 found @ 18] 0.3403 Annealing: 20/20011 [best #11 found @ 20] 0.3467 Annealing: 21/20011 [best #12 found @ 21] 0.3526 Annealing: 28/20011 [best #13 found @ 28] 0.3723 Annealing: 48/20011 [best #14 found @ 48] 0.3838 Annealing: 20011/20011 [best #17 found @ 4003] 0.4217 rrrr [TTH UTH EWY VOX PRFN VDE DIA] Portfolio[7]: TTH EWY VDE UTH VOX PRFN DIA %Change[18]: 1.45 0.25 1.65 1.56 0.56 1.69 1.04 0.86 1.23 -2.57 2.84 %0.03 -1.07 3.01 2.21 -0.53 -4.94 -1.16 Correlation matrix [18w..0w/1w]: TTH EWY VDE UTH VOX PRFN DIA TTH - .41 .40 .67 .80 .59 .68 EWY .41 - .83 .57 .67 .73 .68 VDE .40 .83 - .70 .76 .85 .81 UTH .67 .57 .70 - .78 .82 .92 VOX .80 .67 .76 .78 - .85 .85 PRFN .59 .73 .85 .82 .85 - .94 DIA .68 .68 .81 .92 .85 .94 - Mean correlation: 0.729536 [18w-0w/1w 7] Portfolio SPY Portfolio-vs-SPY (> 1.0 is better) ------------- --------- ------ ---------------- %Total_Return: 8.06 1.12 7.17 %Annualized: 24.24 3.18 7.62 %Return_Mean: 0.43 0.06 6.94 %Return_StdDev: 1.95 1.87 0.96 %Max_DrawDown: -4.94 -5.47 1.11 %Alpha(annual): 20.44 0.00 - Beta: 1.02 1.00 0.99 Alpha/Beta: 20.13 0.00 - R(correlation): 0.97 1.00 1.03 %R2: 94.49 100.00 1.06 Sharpe_ratio: 0.2217 0.0332 6.68 Sortino_ratio: 0.4248 0.0612 6.94 Current (2007-08-03) MMVR rankings: 1 2.8472 EWY iShares MSCI South Korea Index 2 2.3886 TTH Telecom HOLDRs 15 1.5022 VOX Vanguard Telecom Services VIPERs 43 1.0927 VDE Vanguard Energy VIPERs 50 1.0524 DIA DIAMONDS Trust, Series 1 102 0.4591 UTH Utilities HOLDRs 120 0.3420 PRFN PowerShares FTSE RAFI Industrials-- ariel