How to add a benchmark to a variance matrix

There is a good way and a bad way to add a benchmark to a variance matrix that will be used for optimization and similar operations.  Our examination sheds a little light on the process of variance matrix estimation in this realm.

Role of benchmarks


Benchmarks are common in investment management.  It’s my opinion that they should not be.  They probably shouldn’t be extinct, but certainly an endangered species.

Their primary use is as a gauge for performance measurement.  That is a process that yields close to zero information.  Much better is to use random portfolios to measure performance.

However, benchmarks are with us, so we need to deal with them.


Benchmarks are treated as if you were short the benchmark by an amount equal to the value of your portfolio.  In terms of weights, the benchmark gets a weight of -1 and the weights of your long-only portfolio sum to 1.  (If you have a long-short portfolio, you will need to do some thinking about the benchmark and probably should not rely on what software does by default.)

When doing portfolio optimization and similar tasks, the benchmark needs to be a part of the variance matrix of the assets.  The variance matrix will be created from a matrix of the asset returns.  We look at a good way and a bad way of including a benchmark.

good way

The good way is to build the variance matrix of the assets without the benchmark.  The next step is to add the benchmark as a new asset using the weights of the assets in the benchmark.

This requires that all of the constituents of the benchmark are included in the variance matrix.  Only some trivial arithmetic with the weights is needed to add the benchmark.

The result is a matrix that is no longer positive definite, because the benchmark is an exact linear combination of other assets.  It will be positive semi-definite, assuming the original is positive definite.  If you are using an optimizer that demands positive definiteness, then it will adjust the matrix slightly.
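The arithmetic with the weights can be sketched in base R.  This is only an illustration with made-up numbers, not the code of any package: `assetVar`, `benchWeight` and the name `BENCH` are all hypothetical.  The covariance of each asset with the benchmark is the variance matrix times the weights, and the benchmark variance is the usual quadratic form.

```r
# Toy variance matrix for three assets (illustrative numbers only).
assetNames <- c("A", "B", "C")
assetVar <- matrix(c(0.040, 0.012, 0.010,
                     0.012, 0.090, 0.018,
                     0.010, 0.018, 0.0625), nrow = 3,
                   dimnames = list(assetNames, assetNames))
benchWeight <- c(A = 0.5, B = 0.3, C = 0.2)  # hypothetical constituent weights

# Covariance of each asset with the benchmark, and the benchmark variance.
covWithBench <- drop(assetVar %*% benchWeight)
benchVar <- drop(crossprod(benchWeight, covWithBench))

# Expanded matrix with the benchmark as a fourth "asset".
fullVar <- rbind(cbind(assetVar, BENCH = covWithBench),
                 BENCH = c(covWithBench, benchVar))

# Holding the benchmark weights while short the benchmark has zero variance,
# which is exactly why the expanded matrix is singular.
active <- c(benchWeight, BENCH = -1)
trackVar <- drop(active %*% fullVar %*% active)
minEigen <- min(eigen(fullVar, symmetric = TRUE)$values)
```

Both `trackVar` and `minEigen` should be zero up to floating point error.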

bad way

The not-good way — but the one that might be the first one thought of — is to include a history of returns of the benchmark in the asset return matrix and then build the variance matrix.

There are three reasons this is bad:

  1. as prices change through history, the benchmark weights change
  2. constituents of benchmarks change over time
  3. the variance estimation process distorts the asset relationships with the benchmark

The first issue is extremely trivial, and wouldn’t preclude this method.  The second can be quite material.  The third will be important as long as you are using something more sophisticated than (essentially) the sample variance as your variance estimate.

Optimization examples

We can see the effect of doing it the bad way by doing an optimization to minimize tracking error where there are no constraints stopping us from recovering the benchmark exactly.


We use the variance estimation functions from the BurStFin R package.  The package has two such functions:

  • Ledoit-Wolf shrinkage towards equal correlation
  • statistical factor model

By default both of these ensure that the estimate stays a margin away from being merely positive semi-definite (that is, singular).


The variances were estimated for 474 large cap US stocks with the daily returns from 2010.

The benchmark used was ever so close to equal weights on all the assets.

Minimum tracking error

The predicted minimum tracking error in basis points using different variance estimates was:

  • Ledoit-Wolf (default): 1438
  • statistical factor model (default): 1078
  • Ledoit-Wolf (zero margin): 700
  • statistical factor model (zero margin): 83
  • sample variance: 1

Clearly the attempts to keep the variances away from the border with positive semi-definiteness have a serious effect.  But there are big effects even without that.

The Ledoit-Wolf estimator starts with the sample variance and then does Bayesian shrinkage towards equal correlations.  That shrinkage imposes a tracking error of 7% (the 700 basis points of the zero-margin case) here.  The factor model does its adjustment by setting the variability in certain directions to zero, and that change is much less drastic in this example.
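The effect of shrinkage on a benchmark that entered through its return history can be seen in a small simulation.  This is a simplified stand-in, not the BurStFin estimator: `shrinkEqCor` is a hypothetical helper that shrinks correlations towards their average with a fixed, arbitrarily chosen intensity, whereas Ledoit-Wolf estimates the intensity from the data.  The simulation also assumes simple returns and constant benchmark weights, so that the benchmark is an exact linear combination in the sample variance.

```r
set.seed(42)
nObs <- 250
nAsset <- 5

# Simulated daily returns with a common market component.
market <- rnorm(nObs, sd = 0.01)
loadings <- seq(0.5, 1.5, length.out = nAsset)
assetRet <- outer(market, loadings) +
   matrix(rnorm(nObs * nAsset, sd = 0.01), nObs, nAsset)
colnames(assetRet) <- paste0("A", 1:nAsset)

# Benchmark history with constant equal weights (an idealization).
benchWeight <- rep(1 / nAsset, nAsset)
histRet <- cbind(assetRet, BENCH = drop(assetRet %*% benchWeight))

sampleVar <- var(histRet)

# Hypothetical helper: fixed-intensity shrinkage towards equal correlation.
shrinkEqCor <- function(v, intensity = 0.3) {
   sdev <- sqrt(diag(v))
   cors <- cov2cor(v)
   target <- matrix(mean(cors[lower.tri(cors)]), nrow(v), ncol(v))
   diag(target) <- 1
   shrunk <- (1 - intensity) * cors + intensity * target
   shrunk * outer(sdev, sdev)
}
shrunkVar <- shrinkEqCor(sampleVar)

# Predicted daily tracking error in basis points when the portfolio
# holds exactly the benchmark weights.
trackErrBp <- function(v) {
   active <- c(benchWeight, -1)
   1e4 * sqrt(max(0, drop(active %*% v %*% active)))
}
teSample <- trackErrBp(sampleVar)
teShrunk <- trackErrBp(shrunkVar)
```

With the sample variance the predicted tracking error is zero up to rounding.  After shrinkage the same portfolio is predicted to have a clearly positive tracking error, because the benchmark column has been treated like any other asset.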

From this viewpoint the distortion seems bad.  But the distortion is a good thing — it is reducing the noise in the variance estimate so that when it is used as a prediction, it will perform better.


Summary

Use the constituent weights of a benchmark to add it to a variance matrix.

Appendix R

The computations were (of course) done in R.

create benchmark portfolio

The computations involving optimization use the Portfolio Probe software:

require(PortfolioProbe)

We create the benchmark as a portfolio worth 10 million dollars at the end of 2010 that has almost equal weight for all of the assets:

eqwtPort10 <- trade.optimizer(sp5.price10, gross=1e7, 
   long.only=TRUE, positions=cbind(rep(1e7/475,474), 
   1e7/473), variance=sp5.var10)

The variance in this case is of no importance since the portfolio is essentially defined by the constraints.  However, the optimizer demands something to use as a utility — a variance is an easy choice.

Since the prices are not all equal and the optimizer trades integer amounts, it is not possible to have exactly equal weights.  The positions argument gives a tight interval for the amount of money allowed in each asset.
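The interval can be checked with a little arithmetic in base R.  This sketch only reproduces the bounds passed to the optimizer above; the variable names are made up for illustration.

```r
gross <- 1e7
nAsset <- 474

lower <- gross / 475          # about 21,053 dollars
upper <- gross / 473          # about 21,142 dollars
equalAmount <- gross / nAsset # exact equal weighting lies between the two

# cbind recycles the scalar, giving one (lower, upper) row per asset.
positions <- cbind(rep(lower, nAsset), upper)

# The bounds leave room for the full gross value to be invested.
feasible <- nAsset * lower < gross && gross < nAsset * upper

# Width of the interval relative to the equal-weight amount.
relativeWidth <- (upper - lower) / equalAmount
```

The interval is less than half a percent wide relative to the equal-weight amount, which is why the resulting portfolio is so close to equal weights.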

good variance estimate

The variance estimators come from the BurStFin package:

require(BurStFin)

The first step is to estimate the variance with just the asset returns:

ewVar10 <- var.shrink.eqcor(diff(log(sp5.close10)))

Next add the benchmark to the variance using its constituent weights:

ewVar10 <- var.add.benchmark(ewVar10, 
   valuation(eqwtPort10)$weight, "EWweight")

This is taking three arguments: a variance matrix, a vector of weights, and a character string to use as the name of the benchmark in the expanded variance matrix that is returned.

variance estimates with benchmark history

First we add the benchmark valuation over the year to the asset price matrix:

ewPrice10 <- cbind(sp5.close10, 
      EWhistory=valuation(eqwtPort10, sp5.close10))

The variance estimates that use the history of benchmark returns are:

# default Ledoit-Wolf:
ewLWVar10 <- var.shrink.eqcor(diff(log(ewPrice10)))

# default statistical factor model:
ewFacVar10 <- factor.model.stat(diff(log(ewPrice10)))

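# Ledoit-Wolf with no degeneracy safety
# (the final argument is missing from the source; tol=0 is an assumption)
ewLWzVar10 <- var.shrink.eqcor(diff(log(ewPrice10)), 
   tol=0)

# statistical factor model with no degeneracy safety
# (the final argument is missing from the source; specific.floor=0 is an assumption)
ewFaczVar10 <- factor.model.stat(diff(log(ewPrice10)),
   specific.floor=0)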

# sample variance
ewSampVar10 <- var(diff(log(ewPrice10)))
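Every estimate above is fed `diff(log(...))` of a price matrix.  A tiny example with invented prices shows what that idiom produces: one row fewer than the price matrix, holding the log return of each column.

```r
# Invented closing prices: rows are days, columns are assets.
prices <- cbind(X = c(100, 102, 101), Y = c(50, 50.5, 52))

# diff works down the rows of a matrix, so this is the
# column-by-column log return between consecutive days.
logRet <- diff(log(prices))

# The first return of X is log(102 / 100).
firstX <- logRet[1, "X"]
```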

optimization for minimum tracking error

The form of the optimizations is:

ewopLWz.hist.full <- trade.optimizer(sp5.price10, 
   variance=ewLWzVar10, gross=1e6, long.only=TRUE, 
   benchmark="EWhistory", utility="minimum variance")

This is creating a portfolio worth one million dollars that is long-only but otherwise unconstrained.
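For intuition about what such an optimization is solving, here is a base R sketch of minimum tracking error with only the budget constraint.  It is not how Portfolio Probe works: it ignores the long-only constraint and integer trading, and `minTrackErr` is a hypothetical helper.  The toy numbers and the diagonal margin of 1e-4 are also assumptions, the latter being a crude stand-in for a safety margin away from singularity.

```r
# Minimize (w, -1)' Sigma (w, -1) subject to sum(w) = 1.
# With Sigma split into asset block V, benchmark covariances c and
# benchmark variance s, the solution is w = V^{-1}(c + lambda * 1).
minTrackErr <- function(fullVar, benchName) {
   isBench <- colnames(fullVar) == benchName
   assetVar <- fullVar[!isBench, !isBench, drop = FALSE]
   covBench <- fullVar[!isBench, isBench]
   benchVar <- fullVar[isBench, isBench]
   invCov <- solve(assetVar, covBench)
   invOnes <- solve(assetVar, rep(1, nrow(assetVar)))
   lambda <- (1 - sum(invCov)) / sum(invOnes)
   weight <- invCov + lambda * invOnes
   trackVar <- drop(weight %*% assetVar %*% weight) -
      2 * sum(weight * covBench) + benchVar
   list(weight = weight, trackErr = sqrt(max(0, trackVar)))
}

# Toy expanded matrix built from constituent weights (illustrative numbers).
assetVar <- matrix(c(0.040, 0.012, 0.010,
                     0.012, 0.090, 0.018,
                     0.010, 0.018, 0.0625), nrow = 3)
benchWeight <- c(0.5, 0.3, 0.2)
covBench <- drop(assetVar %*% benchWeight)
exactVar <- rbind(cbind(assetVar, covBench),
                  c(covBench, sum(benchWeight * covBench)))
allNames <- c("A", "B", "C", "BENCH")
dimnames(exactVar) <- list(allNames, allNames)

# The exact matrix lets the optimizer recover the benchmark.
exactFit <- minTrackErr(exactVar, "BENCH")

# Adding a margin to the diagonal makes that impossible.
marginVar <- exactVar + diag(1e-4, 4)
marginFit <- minTrackErr(marginVar, "BENCH")
```

With the exact matrix the weights equal the benchmark weights and the tracking error is zero up to rounding.  With the margin added, the minimum tracking error is strictly positive even though nothing about the assets has changed.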
