On the complexity of the Mean Variance Optimization (MVO) problem and 2 practical solutions

Our portfolio composition problem is even harder because in order to be able to tell which of any two portfolios is better, we need to compute the overall portfolio mean correlation for each and every portfolio we compare. More significantly: the optimal solution may not even be in the "14 out of 40" space, it may be a result of picking 10 out of the top 30 candidates, or whatever.

The solution lies in a compromise: we want to find a solution that is good enough and likely close enough to the optimal solution without having to go over all possible solutions.

Using a fast greedy algorithm to quickly solve MVO

One of the algorithms I implemented to find a near optimal solution in a large space really fast is a Greedy algorithm (wikipedia link) implemented this way:

This algorithm is very fast for portfolio sizes of 6-20 out of 20-50 or so candidates. It takes less than 2 seconds to find a near optimal portfolio of 14 components out of 45 on my dual-core Opteron(tm) desktop using an interpreted language (perl). It is simple and straight forward, and it is easy to run it serially over NxM combinations of N and M (where N < M) and pick the best solution among all runs.

Using Simulated Annealing to better solve MVO

The idea of simulated annealing is to look for a global maximum in a terrain, by starting with some arbitrary solution, then wandering around trying to find better neighboring solutions. To pick a "neighboring" solution we apply one small change to the portfolio: e.g. add, remove, or replace one ETF. To decide if it is better we evaluate the new solution in terms of fitness (goodness). In our case better means a lower mean-correlation between components.

If a new solution is better, we switch to it and continue the loop looking for the next neighbor.

If the neighbor portfolio is not better, we randomly, with some (low) probability switch to the neighbor anyway.

Finally, we gradually lower the probability of switching to a random neighbor the closer we get to the end.

The occasional random jump is designed to avoid getting stuck in a local maximum, i.e. to keep searching for the global maximum.

Acknowledgement: I would like to thank Dror Sneh and Gideon Intrater for suggesting better algorithms and Simulated Annealing to me during the last 4th of July weekend, when I described the problem to them. (It sounds so obvious in hindsight.)

Note: there are several machine-learning optimization techniques that should converge much faster (e.g. see: Gradient Descent, Quasi-Newton methods, SR1 method, BFGS ) but they require a differentiable function (being able to know the gradient at each solution) so they aren't applicable in this case where there is a very large number of local maxima, and the target function is not continuous/differentiable.

More algorithm options

Here's an example of a Simulated Annealing run trying to pick the best Sharpe-ratio sub-portfolio out of 45 candidates, in 20011 iterations. A precondition (-r 0.85) requires than no two ETFs in the portfolio will have a higher correlation than 0.85.

In this example run, the algorithm progressively finds better and better solutions (17 times). The best solution in a row is found in iteration 4003 of 20011.

The result is a strong 7-ETF portfolio with nice weekly Sharpe-ratio of 0.22 and a low mean internal correlation of 0.72. In back-testing it shows a return of 8.06% in the past 18 weeks (a bit over 4 months), including the recent (July-Aug 2007) correction. The assumption is that it is likely to continue the strong performance with low risk, in the coming months:

$ etf-cor -alps -R 5 -r 0.85 14 18w 0w 1w DIA PRFN JKF JKI TTH JKD DFE DKA VOX JKG DLS EWK RPV ISI EWG DEB PRFE DEW JKJ VDE FEZ EFV DBN UTH VGK ADRD RFV EZU DTH DWM IXP VTV ADRU JKH IEV PID PXE EFA IXC DOO MXI JKL EWY VO EWQ Annealing: 1/20011 [best #1 found @ 1] 0.1655 Annealing: 3/20011 [best #2 found @ 3] 0.1850 Annealing: 4/20011 [best #3 found @ 4] 0.1856 Annealing: 5/20011 [best #4 found @ 5] 0.2009 Annealing: 7/20011 [best #5 found @ 7] 0.2535 Annealing: 8/20011 [best #6 found @ 8] 0.2874 Annealing: 10/20011 [best #7 found @ 10] 0.2967 Annealing: 11/20011 [best #8 found @ 11] 0.3046 Annealing: 12/20011 [best #9 found @ 12] 0.3287 Annealing: 18/20011 [best #10 found @ 18] 0.3403 Annealing: 20/20011 [best #11 found @ 20] 0.3467 Annealing: 21/20011 [best #12 found @ 21] 0.3526 Annealing: 28/20011 [best #13 found @ 28] 0.3723 Annealing: 48/20011 [best #14 found @ 48] 0.3838 Annealing: 20011/20011 [best #17 found @ 4003] 0.4217 rrrr [TTH UTH EWY VOX PRFN VDE DIA] Portfolio[7]: TTH EWY VDE UTH VOX PRFN DIA %Change[18]: 1.45 0.25 1.65 1.56 0.56 1.69 1.04 0.86 1.23 -2.57 2.84 %0.03 -1.07 3.01 2.21 -0.53 -4.94 -1.16 Correlation matrix [18w..0w/1w]: TTH EWY VDE UTH VOX PRFN DIA TTH - .41 .40 .67 .80 .59 .68 EWY .41 - .83 .57 .67 .73 .68 VDE .40 .83 - .70 .76 .85 .81 UTH .67 .57 .70 - .78 .82 .92 VOX .80 .67 .76 .78 - .85 .85 PRFN .59 .73 .85 .82 .85 - .94 DIA .68 .68 .81 .92 .85 .94 - Mean correlation: 0.729536 [18w-0w/1w 7] Portfolio SPY Portfolio-vs-SPY (> 1.0 is better) ------------- --------- ------ ---------------- %Total_Return: 8.06 1.12 7.17 %Annualized: 24.24 3.18 7.62 %Return_Mean: 0.43 0.06 6.94 %Return_StdDev: 1.95 1.87 0.96 %Max_DrawDown: -4.94 -5.47 1.11 %Alpha(annual): 20.44 0.00 - Beta: 1.02 1.00 0.99 Alpha/Beta: 20.13 0.00 - R(correlation): 0.97 1.00 1.03 %R2: 94.49 100.00 1.06 Sharpe_ratio: 0.2217 0.0332 6.68 Sortino_ratio: 0.4248 0.0612 6.94 Current (2007-08-03) MMVR rankings: 1 2.8472 EWY iShares MSCI South Korea Index 2 2.3886 TTH Telecom HOLDRs 15 1.5022 VOX Vanguard Telecom Services VIPERs 43 1.0927 VDE Vanguard Energy VIPERs 50 1.0524 DIA DIAMONDS Trust, Series 1 102 0.4591 UTH Utilities HOLDRs 120 0.3420 PRFN PowerShares FTSE RAFI Industrials

Aug 8, 2007: On the complexity of the Mean Variance Optimization (MVO) problem & its practical solutions

On the complexity of MVO

Using a fast greedy algorithm to quickly solve MVO

Using Simulated Annealing to better solve MVO

More algorithm options