Rolling OLS
by Filipe Rodrigues da Costa
Hi All,
I have been trying to solve a few problems with rolling regressions. To
isolate them, I am using a simple example here with two financial
assets: Amazon (AMZN) and the market (SP500). I'm trying to estimate
"beta", which can be roughly obtained by regressing the returns of AMZN
on the returns of SP500.
I found a very simple script in a previous post on this list, which is
quite straightforward:
> T = $nobs
>
> scalar window_size = 20
> scalar k = $nobs - window_size + 1
> series b = NA
>
> smpl 1 window_size
> loop i = window_size .. T
> ols AMZN const SP500
> if i < T
> smpl +1 +1
> endif
> endloop
>
> smpl full
So far so good, this works quite well. But let's say the data cover 100
periods for SP500 but only 60 for AMZN (no data for the last 40).
Because I'm using a rolling window of 20 data points, at some point the
routine will use only 19 data points, then 18, 17, and so on down to 2
(which is the technical minimum). This happens because the routine still
finds data points in the full sample, even though there are fewer for
AMZN.
My question is as follows: is there a simple way to make the routine
estimate OLS only when the full 20 data points are available for both
AMZN and SP500? In a very large dataset with 400 or even 500 assets,
there are many cases where an asset simply left the market, and I
should not be estimating betas after that point. I believe the program
checks for $t1 and, if it exists, computes OLS. I would like it to check
everything between $t1 and $t2 and, if any observations are missing, in
particular at the end, skip the estimation.
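To make the idea concrete, here is a rough sketch of the kind of check I
have in mind. It is an untested assumption on my part, not a confirmed
solution: it counts the complete observations in each window and runs
the regression only when that count equals the window size. The series
name "valid" is made up for illustration, and the line that stores the
slope in b is an addition that is not in the script above. It assumes
AMZN and SP500 are already loaded as series.

```hansl
scalar T = $nobs
scalar window_size = 20
series b = NA

# 1 where both series are observed, 0 otherwise ("valid" is a made-up name)
series valid = ok(AMZN) && ok(SP500)

smpl 1 window_size
loop i = window_size .. T
    # estimate only when every observation in the current window is usable
    if sum(valid) == window_size
        ols AMZN const SP500 --quiet
        # store the slope at the last observation of the window
        b[i] = $coeff(SP500)
    endif
    if i < T
        smpl +1 +1
    endif
endloop

smpl full
```

With a check like this, b would stay NA for every window that contains
a missing value in either series, including the windows that run past
the end of the AMZN data.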
I hope I have made myself clear. Thanks, all!
--
Filipe Rodrigues da Costa
Send me an email to: filipe(a)pobox.io
Reach me through Telegram at: https://t.me/rodriguesdacosta