VECM restrictions and scaling

Ang. Re: [Gretl-devel] Cumbersome...

Test calculator bug [Was: Re:...

Allin Cottrell

Wednesday, 12 September 2007 Wed, 12 Sep '07

3:25 p.m.

I've now implemented an algorithm for removing then replacing scale factors in a set of VECM restrictions. (Thanks, Sven, for the pointers.) Testing so far shows a big improvement in performance, and the problems of initialization mentioned in my document from August seem to be solved. I'll set out the algorithm in detail so people can spot any problems and suggest improvements. First, we scan the matrices R and q (as in R*vec(beta) = q), looking for rows that satisfy this criterion: * There's exactly one non-zero entry, r_{ij}, in the given row of R, and the corresponding q_i is non-zero. For each such row we record the coordinates i and j and the implied value of the associated beta element, v_j = q_i/r_{ij}. Based on j we determine the associated column of the beta matrix: k = j / p_1, where p_1 is the number of rows in beta (and '/' indicates truncated integer division). We then ask: Is row i the first such row pertaining to column k of beta? If so, we flag this row (which I'll call a scaling row) for removal. If not, we flag it for replacement with a homogeneous restriction. For example, suppose we find two rows of R that correspond to b[2,1] = 1 b[2,3] = -1 We'll remove the first row, and replace the second with b[2,1] + b[2,3] = 0 In general, the replacement R row is all zeros apart from a 1 in column j0 and the value x in column j1, where * j0 is the non-zero column in the scaling row pertaining to column k of beta; * j1 is the non-zero column in the row of R to be replaced; and * x = -v_{j0} / v_{j1} We then do the maximization using the switching algorithm. Once that's done, but before computing the variance of the estimates, we re-scale beta and alpha using the saved information from the scaling rows. For each column, k, of beta, if there's an associated scale factor v_j (as defined above; strictly speaking, v_{j0}): * Construct the value s_k = b_j / v_j, where b_j is the jth element of the estimated vec(beta). * Divide each non-zero element of the beta column by s_k; multiply each non-zero element of column k of alpha by s_k. We then recompute the relevant blocks of the observed information matrix using the rescaled beta and alpha, and I think the variance comes out right. This is in CVS and the current Windows snapshot. Note that the scaling business is _not_ done in case (a) the user supplies a set of initial values ("set initvals") or (b) the user specifies using LBFGS rather than switching. Allin.

Show replies by date

Sven Schreiber

Thursday, 13 September Thu, 13 Sep

10:15 a.m.

Allin Cottrell schrieb:

...

That's great news! I'm really happy about the speed at which this new feature takes shape, thanks very much.

...

I'll set out the algorithm in detail so people can spot any problems and suggest improvements.

[...] Well I'd say if it works in a number of test cases, it can't be all that wrong. I will do some testing with the latest snapshot in the next couple of days as well. I assume that it must be done via a script and the syntax is like in the previous examples, right? Also still w/o restricted exogenous variables? (just checking, not pushing)

...

We then recompute the relevant blocks of the observed information matrix using the rescaled beta and alpha, and I think the variance comes out right.

I will test that as well (i.e. compare results with PcGive). Is the variance denominator T-c now or do I have to take into account that difference? cheers, sven

Allin Cottrell

4:40 p.m.

On Thu, 13 Sep 2007, Sven Schreiber wrote:

...

Allin Cottrell schrieb: > I've now implemented an algorithm for removing then replacing > scale factors in a set of VECM restrictions. (Thanks, Sven, for > the pointers.) Testing so far shows a big improvement in > performance, and the problems of initialization mentioned in my > document from August seem to be solved. > That's great news! I'm really happy about the speed at which this new feature takes shape, thanks very much.

And thanks for the encouragement!

...

> I'll set out the algorithm in detail so people can spot any > problems and suggest improvements. [...] Well I'd say if it works in a number of test cases, it can't be all that wrong. I will do some testing with the latest snapshot in the next couple of days as well. I assume that it must be done via a script and the syntax is like in the previous examples, right? Also still w/o restricted exogenous variables? (just checking, not pushing)

Yes, so far.

...

> We then recompute the relevant blocks of the observed information > matrix using the rescaled beta and alpha, and I think the variance > comes out right. I will test that as well (i.e. compare results with PcGive). Is the variance denominator T-c now or do I have to take into account that difference?

It's still straight T but I hope to work on that soon. Allin.

Allin Cottrell

9:08 p.m.

On Thu, 13 Sep 2007, Allin Cottrell wrote:

...

> I will test that as well (i.e. compare results with PcGive). Is the > variance denominator T-c now or do I have to take into account that > difference? It's still straight T but I hope to work on that soon.

OK, now in CVS and the Windows snapshot it's T-c. Since counting "c" can be a bit fiddly, testing would be nice! Allin

Allin Cottrell

Friday, 14 September Fri, 14 Sep

9:39 p.m.

On Thu, 13 Sep 2007, Sven Schreiber wrote:

...

Also still w/o restricted exogenous variables?

I've now made a start on this. For now it's script-only, and the syntax is vecm order rank yvars ; exog. vars ; restricted exog. vars (As before, one should use option flags to get the basic restricted const or trend.) I've run a simple example OK but I don't have any benchmark to test the results against. The IRF bootstrap will certainly not work right at this point, and it's possible some other things are broken too. I'll have a further go at this before long. CVS and Windows snapshot. Allin.

Allin Cottrell

Saturday, 15 September Sat, 15 Sep

5:07 p.m.

...

On Thu, 13 Sep 2007, Sven Schreiber wrote: > Also still w/o restricted exogenous variables?

I've now committed and snapshotted a somewhat less ragged version of the code for handling restricted exogenous variables. I've also tested against the Boswijck and Doornik Neerlandica example. They scale the restricted trend variable to get the coefficients on a nicer scale. Previously I was just using gretl's "--crt" flag, but now I've tried removing --crt and appending trend = t/100 as a restricted exogenous variable. This example works nicely. Allin.

Sven Schreiber

Sunday, 16 September Sun, 16 Sep

7:59 p.m.

Hi, thanks very much for the new snapshots with T-c and restricted exogenous variables. This makes it possible to compare directly to PcGive's results and thus testing is quite easy. I've done some testing with a real-world case; it's a 6-dim Vecm with a restricted step dummy in the equilibrium (cointegration) relations. The unrestricted estimates (cointegration rank r=4) are exactly identical, including the standard errors. The only difference is that gretl prints "Determinante der Kovarianzmatrix = 1.614282e-015" (determinant of the covariance matrix), whereas pcgive reports "-T/2log|Omega| 1873.29374". Now I guess these numbers are not supposed to mean the same thing, but nevertheless a result of 1e-15 strikes me as strange. Maybe it's actually the last improvement of the convergence algorithm or something like that? --------------------------- Next, with some overidentifying restrictions on beta (both exclusion and more general restrictions, partly also applied to the restricted exogenous variable, and apart from normalization), gretl's results are not so good (I also append pcgive's results for comparison): <gretl> Rank of Jacobian = 34, number of free parameters = 34 Model is fully identified Based on Jacobian, df = 2 Switching algorithm: 29552 iterations -(T/2)log|Omega| = 1873.9556, lldiff = 3.99749e-011 Unrestricted loglikelihood (lu) = 932.72411 Restricted loglikelihood (lr) = 928.94253 2 * (lu - lr) = 7.56317 P(Chi-Square(2) > 7.56317 = 0.0227865 </gretl> <pcgive> log-likelihood 932.00515 -T/2log|Omega| 1877.01821 no. of observations 111 no. of parameters 100 rank of long-run matrix 4 no. long-run restrictions 2 beta is identified LR test of restrictions: Chi^2(2) = 1.4379 [0.4873] Switching (scaled linear) using analytical derivatives (eps1=0.0001; eps2=0.005): Weak convergence </pcgive> Adding some restrictions on alpha, I sometimes manage to hit gretl's bound of 50000 iterations :-) -------------------- So I tried out something more modest: removing the restricted exogenous variable and imposing some just-identifying but somewhat unusual restrictions. Then PcGive reports: <pcgive> log-likelihood 910.764932 -T/2log|Omega| 1855.778 no. of observations 111 no. of parameters 98 rank of long-run matrix 4 no. long-run restrictions 0 beta is identified No restrictions imposed Switching (scaled linear) using analytical derivatives (eps1=0.0001; eps2=0.005): Strong convergence </pcgive> But gretl's output is: <gretl> Rank of Jacobian = 32, number of free parameters = 32 Model is fully identified Based on Jacobian, df = 0 Switching algorithm: 2319 iterations -(T/2)log|Omega| = 1855.6699, lldiff = 3.99838e-011 Unrestricted loglikelihood (lu) = 910.76493 Restricted loglikelihood (lr) = 910.65685 </gretl> So gretl seems to have some problems with "rotations" of the cointegration space, which in theory (and confirmed by pcgive) leave the likelihood unaffected. Let me know if some other details or setups would help you. I'll also try to think about what other approaches could help. Cheers, sven

Allin Cottrell

9:38 p.m.

On Sun, 16 Sep 2007, Sven Schreiber wrote:

...

thanks very much for the new snapshots with T-c and restricted exogenous variables. This makes it possible to compare directly to PcGive's results and thus testing is quite easy.

Good so far!

...

I've done some testing with a real-world case; it's a 6-dim Vecm with a restricted step dummy in the equilibrium (cointegration) relations. The unrestricted estimates (cointegration rank r=4) are exactly identical, including the standard errors. The only difference is that gretl prints "Determinante der Kovarianzmatrix = 1.614282e-015" (determinant of the covariance matrix), whereas pcgive reports "-T/2log|Omega| 1873.29374". Now I guess these numbers are not supposed to mean the same thing, but nevertheless a result of 1e-15 strikes me as strange...

Gretl is reporting |Omega|, and the reported values are consistent: -(T/2)*log(1.614282e-15) is close to what pcgive gives, the difference presumably being attributable to the fact that |Omega| is given by gretl to only 7 digits.

...

Thanks very much for doing the testing. To make progress it would be very helpful to have a copy of the data file and script you're using with gretl. Could you maybe send me that offlist? (Maybe Jack would be interested too?) Allin.

Allin Cottrell

9:58 p.m.

A few more thoughts. On Sun, 16 Sep 2007, Sven Schreiber wrote:

...

Next, with some overidentifying restrictions on beta (both exclusion and more general restrictions, partly also applied to the restricted exogenous variable, and apart from normalization), gretl's results are not so good (I also append pcgive's results for comparison): <gretl> Rank of Jacobian = 34, number of free parameters = 34 Model is fully identified Based on Jacobian, df = 2 Switching algorithm: 29552 iterations ...

When the switching algorithm takes thousands of iterations, it's a pretty safe bet that the results will not be good. I haven't seen that happen lately (since adding the scale-removal code), but clearly you've got a case here that is not handled correctly in gretl.

...

So I tried out something more modest: removing the restricted exogenous variable and imposing some just-identifying but somewhat unusual restrictions...

...

<gretl> Rank of Jacobian = 32, number of free parameters = 32 Model is fully identified Based on Jacobian, df = 0 Switching algorithm: 2319 iterations -(T/2)log|Omega| = 1855.6699, lldiff = 3.99838e-011 Unrestricted loglikelihood (lu) = 910.76493 Restricted loglikelihood (lr) = 910.65685 </gretl>

I suppose we need a check: if df = 0, and yet the "restricted" likelihood is less than the unrestricted, then clearly something has gone awry. Again, I haven't seen a result of this sort, but then I haven't yet tried a wide range of test cases. Allin.

Sven Schreiber

Tuesday, 18 September Tue, 18 Sep

11:48 a.m.

Now looking at the algorithm in more detail... Allin Cottrell schrieb:

...

First, we scan the matrices R and q (as in R*vec(beta) = q), looking for rows that satisfy this criterion: * There's exactly one non-zero entry, r_{ij}, in the given row of R, and the corresponding q_i is non-zero.

seems ok; do you also check whether there are restrictions on alpha (and which kind)?

...

For each such row we record the coordinates i and j and the implied value of the associated beta element, v_j = q_i/r_{ij}. Based on j we determine the associated column of the beta matrix: k = j / p_1, where p_1 is the number of rows in beta (and '/' indicates truncated integer division).

Is this really what you mean? If we're talking about the first element in the first cointegration vector, then with "human" (=one-based) matrix indexing we have i=j=1 and, for example, p_1 = 3. Then k = 1 '/' 3 = 0, but should be 1?

...

We then ask: Is row i the first such row pertaining to column k of beta? If so, we flag this row (which I'll call a scaling row) for removal. If not, we flag it for replacement with a homogeneous restriction. For example, suppose we find two rows of R that correspond to b[2,1] = 1 b[2,3] = -1 We'll remove the first row, and replace the second with b[2,1] + b[2,3] = 0

ok; what do you do if there are more than two relevant rows?

...

In general, the replacement R row is all zeros apart from a 1 in column j0 and the value x in column j1, where * j0 is the non-zero column in the scaling row pertaining to column k of beta; * j1 is the non-zero column in the row of R to be replaced; and * x = -v_{j0} / v_{j1}

looks ok, too

...

We then do the maximization using the switching algorithm. Once that's done, but before computing the variance of the estimates, we re-scale beta and alpha using the saved information from the scaling rows.

Could you store the maximized likelihood value before the rescaling? Just as a debug check if the rescaling leaves it unchanged.

...

For each column, k, of beta, if there's an associated scale factor v_j (as defined above; strictly speaking, v_{j0}): * Construct the value s_k = b_j / v_j, where b_j is the jth element of the estimated vec(beta). * Divide each non-zero element of the beta column by s_k; multiply each non-zero element of column k of alpha by s_k.

...

We then recompute the relevant blocks of the observed information matrix using the rescaled beta and alpha, and I think the variance comes out right.

yes the problem seems to lie with the point estimates themselves well sorry, couldn't find any obvious mistakes :-( -sven

Allin Cottrell

3:39 p.m.

On Tue, 18 Sep 2007, Sven Schreiber wrote:

...

Now looking at the algorithm in more detail...

Thanks for doing this!

...

Allin Cottrell schrieb: > > First, we scan the matrices R and q (as in R*vec(beta) = q), > looking for rows that satisfy this criterion: > > * There's exactly one non-zero entry, r_{ij}, in the given row of > R, and the corresponding q_i is non-zero. seems ok; do you also check whether there are restrictions on alpha (and which kind)?

Yes. I have added some checks since I first described the algorithm, including: we don't try scale removal if we find any alpha restrictions that cross between columns of alpha. (We don't support non-homogeneous restrictions on alpha, so that's not an issue at present.) In addition, we consider scale removal infeasible for a given beta column if (a) we find a non-homogeneous restriction involving two or more beta elements (e.g. b[1,1] - b[1,3] = 2), or (b) we find any cross-column beta restrictions.

...

> For each such row we record the coordinates i and j and the > implied value of the associated beta element, v_j = q_i/r_{ij}. > Based on j we determine the associated column of the beta matrix: > k = j / p_1, where p_1 is the number of rows in beta (and '/' > indicates truncated integer division). Is this really what you mean? If we're talking about the first element in the first cointegration vector, then with "human" (=one-based) matrix indexing we have i=j=1 and, for example, p_1 = 3. Then k = 1 '/' 3 = 0, but should be 1?

Internally, we use 0-based indexing throughout. So in the example you give, k = 0 / 3 = 0, which is correct.

...

> We then ask: Is row i the first such row pertaining to column k of > beta? If so, we flag this row (which I'll call a scaling row) for > removal. If not, we flag it for replacement with a homogeneous > restriction. For example, suppose we find two rows of R that > correspond to > > b[2,1] = 1 > b[2,3] = -1 > > We'll remove the first row, and replace the second with > > b[2,1] + b[2,3] = 0 ok; what do you do if there are more than two relevant rows?

Hmm. A third such row would be treated in the same way as the second -- that is, converted into a homogeneous restriction that references the coefficient in the first scaling row.

...

> We then do the maximization using the switching algorithm. Once > that's done, but before computing the variance of the estimates, > we re-scale beta and alpha using the saved information from the > scaling rows. Could you store the maximized likelihood value before the rescaling? Just as a debug check if the rescaling leaves it unchanged.

Good idea.

...

> We then recompute the relevant blocks of the observed information > matrix using the rescaled beta and alpha, and I think the variance > comes out right. yes the problem seems to lie with the point estimates themselves well sorry, couldn't find any obvious mistakes :-(

I'll try increasing the verbosity of the switching and see if I can spot any strange nasties. Allin.

6538

days inactive

6544

days old

gretl-devel@gretlml.univpm.it

Manage subscription

10 comments

2 participants

tags (0)

participants (2)

Allin Cottrell
Sven Schreiber

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

VECM restrictions and scaling