Fastest way of computing standard errors / inverting X'X

Asked by mickey on 3 Jan 2012
Latest activity Commented on by mickey on 4 Jan 2012

Hello all,

I'm running a Monte Carlo study and need to evaluate many linear regressions y = X b + u very efficiently. Specifically, I need to estimate a regression and compute the standard errors of the estimates many times, so speed is very important while accuracy is less critical. MATLAB's built-in functions such as lscov and regress take a fair amount of time to run, so I wrote a small function to do these tasks myself.

However, as I need to calculate the standard errors, the function is still quite slow. I use inv(X' * X) to get the inverse, since it appears in the calculation of the standard errors (and it is evidently faster than (X' * X) \ eye(size(X, 2))). Is there a faster way of doing this, e.g. via some smart factorization? I saw a suggestion to use a QR decomposition and then use inv(R) to calculate inv(X'*X), but this is even slower.

Any help is greatly appreciated!

Products

No products are associated with this question.

Answer by Teja Muppirala on 4 Jan 2012

You can use symbolic math to find a closed-form expression for the diagonal of the inverse of a symmetric 3x3 matrix:

```
syms a b c d e f real
M = diag(inv([a b c; b d e; c e f]))
```

And then use that expression directly. I find that this is several times faster than calling INV.

```
X = rand(100,3);
XX = X'*X;
denom = XX(9)*XX(2)^2 - 2*XX(2)*XX(3)*XX(6) + XX(5)*XX(3)^2 + XX(1)*XX(6)^2 - XX(1)*XX(5)*XX(9);
invXX = [(XX(6)^2 - XX(5)*XX(9))/denom;
         (XX(3)^2 - XX(1)*XX(9))/denom;
         (XX(2)^2 - XX(1)*XX(5))/denom]
```

You can confirm that this is the same as diag(inv(X'*X)).
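Continuing the snippet above, the diagonal feeds straight into the standard errors the question asks about. This is only a sketch assuming homoskedastic errors; the response y and the true coefficients are made up for illustration:

```matlab
y = X*[1; 2; 3] + randn(100, 1);    % hypothetical simulated response
b = X \ y;                          % estimates via backslash
resid = y - X*b;
s2 = (resid'*resid) / (100 - 3);    % error-variance estimate, n - k dof
se = sqrt(s2 * invXX);              % standard errors from the diagonal above
```

Since only the diagonal of inv(X'*X) enters the standard errors, the closed-form expression avoids computing the off-diagonal entries at all.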

1 Comment

mickey on 4 Jan 2012

Hey! Thanks so much for the answer. This is indeed faster, although not by several times, at least on my machine. It may be interesting to note that for the estimates themselves it still seems better to compute X \ y rather than using the full inverse (X'X)^{-1} calculated as above, if I haven't made any mistakes.

Answer by Matt Tearle on 3 Jan 2012

inv is evil -- avoid! The quickest way to do a regression is

```
c = X\y;
```

then you can calculate the errors yourself

```
resid = X*c - y;
% etc
```
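For the standard errors specifically, the "% etc" might look something like the following sketch of the classical homoskedastic formulas (inv is tolerable here because, as the rest of the thread discusses, it only touches the small k-by-k matrix X'*X):

```matlab
[n, k] = size(X);
s2 = (resid'*resid) / (n - k);      % unbiased estimate of the error variance
se = sqrt(s2 * diag(inv(X'*X)));    % standard errors of the coefficients
```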

But are you doing this numerous times for different y but the same X? If so, maybe an LU decomposition would help.
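If X really is fixed across replications, one precomputed factorization can serve every draw. Here is a sketch using an economy-size QR (the LU idea would be analogous); the sizes, true coefficients, and simulated responses are illustrative assumptions:

```matlab
n = 100;  k = 3;
X = rand(n, k);               % fixed design across replications
beta0 = (1:k)';               % hypothetical true coefficients
[Q, R] = qr(X, 0);            % economy-size QR, computed once
Rinv = R \ eye(k);            % small triangular inverse
dXX  = sum(Rinv.^2, 2);       % diag(inv(X'*X)) = row sums of inv(R).^2
for rep = 1:1000
    y = X*beta0 + randn(n, 1);    % this replication's simulated response
    c = R \ (Q'*y);               % estimates, no refactorization needed
end
```

The point of the design is that the O(n k^2) factorization happens once, and each replication only pays for the O(n k) multiply and a small triangular solve.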

mickey on 3 Jan 2012

Forgot to add that iK = size(mX, 2). :)

Matt Tearle on 3 Jan 2012

When I test it, I get the same result for rinv*rinv' as inv(x'*x), to within roundoff. (Which is what should happen, from theory.) I thought extracting the non-zero rows of r might help dispose of a few redundant calculations. But if inv() is the bottleneck... Maybe a brute-force inverse calculation would help? This seems to be a tiny bit faster than inv:

```
[~,r] = qr(x);                                % R factor of x
rinv = diag(1./diag(r));                      % diagonal of inv(R)
rinv(2,3) = -r(2,3)*rinv(3,3)/r(2,2);         % back-substitute the
rinv(1,2) = -r(1,2)*rinv(2,2)/r(1,1);         % off-diagonal entries
rinv(1,3) = (-r(1,2)*rinv(2,3)-r(1,3)*rinv(3,3))/r(1,1);
vcov = rinv*rinv';                            % inv(x'*x) = inv(R)*inv(R)'
```

But, after all that, I must have been doing something wrong before, because I'm now getting that inv(x'*x) is faster after all.

mickey on 4 Jan 2012

Hey! Thanks again for this answer. In the end I opted for Teja's method, as it seems faster for the small problem I have. For bigger problems, though, I think something like this should be preferred. :) Thanks!!