I've now implemented an algorithm for removing then replacing
scale factors in a set of VECM restrictions. (Thanks, Sven, for
the pointers.) Testing so far shows a big improvement in
performance, and the problems of initialization mentioned in my
document from August seem to be solved.
I'll set out the algorithm in detail so people can spot any
problems and suggest improvements.
First, we scan the matrices R and q (as in R*vec(beta) = q),
looking for rows that satisfy this criterion:
* There's exactly one non-zero entry, r_{ij}, in the given row of
R, and the corresponding q_i is non-zero.
For each such row we record the coordinates i and j and the
implied value of the associated beta element, v_j = q_i/r_{ij}.
Based on j we determine the associated column of the beta matrix:
k = j / p_1, where p_1 is the number of rows in beta, '/'
indicates truncated integer division, and j is counted from zero.
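To make the scan concrete, here's a small sketch in Python/NumPy
(not the actual gretl code; the function name and return layout are
just illustrative, and indices are taken as 0-based so that
k = j // p_1 works as stated):

```python
import numpy as np

def find_scaling_rows(R, q, p1):
    """Scan R and q (with R @ vec(beta) = q) for candidate scaling
    rows: rows with exactly one non-zero entry r_ij and q_i != 0.
    Returns (i, j, k, v) tuples: row i, vec(beta) index j, beta
    column k = j // p1, and implied value v = q_i / r_ij.
    """
    hits = []
    for i in range(R.shape[0]):
        nz = np.flatnonzero(R[i])
        if len(nz) == 1 and q[i] != 0:
            j = int(nz[0])
            v = q[i] / R[i, j]   # implied value of beta element j
            k = j // p1          # beta column (truncated division)
            hits.append((i, j, k, v))
    return hits
```

For instance, with R = [[2, 0, 0, 0], [0, 0, 1, 0]], q = [4, 0] and
p1 = 2, only the first row qualifies (the second has q_i = 0), giving
j = 0, k = 0 and implied value v = 2.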
We then ask: Is row i the first such row pertaining to column k of
beta? If so, we flag this row (which I'll call a scaling row) for
removal. If not, we flag it for replacement with a homogeneous
restriction. For example, suppose we find two rows of R that
correspond to
b[2,1] = 1
b[2,3] = -1
We'll remove the first row, and replace the second with
b[2,1] + b[2,3] = 0
In general, the replacement R row is all zeros apart from a 1 in
column j0 and the value x in column j1, where
* j0 is the non-zero column in the scaling row pertaining to
column k of beta;
* j1 is the non-zero column in the row of R to be replaced; and
* x = -v_{j0} / v_{j1}
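A sketch of constructing that replacement row (again illustrative
Python, not gretl's code; ncols is the length of vec(beta)):

```python
import numpy as np

def replacement_row(ncols, j0, v0, j1, v1):
    """Build the homogeneous replacement row: all zeros apart from
    a 1 in column j0 and x = -v0/v1 in column j1, which preserves
    the ratio of the two restricted beta elements.
    """
    row = np.zeros(ncols)
    row[j0] = 1.0
    row[j1] = -v0 / v1
    return row
```

In the b[2,1] = 1, b[2,3] = -1 example, x = -(1)/(-1) = 1, so (with
p_1 = 4 rows in beta, hence 0-based indices j0 = 4, j1 = 6 within a
length-8 vec(beta)) the row encodes b[2,1] + b[2,3] = 0 as stated.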
We then do the maximization using the switching algorithm. Once
that's done, but before computing the variance of the estimates,
we re-scale beta and alpha using the saved information from the
scaling rows.
For each column, k, of beta, if there's an associated scale factor
v_j (as defined above; strictly speaking, v_{j0}):
* Construct the value s_k = b_j / v_j, where b_j is the jth
element of the estimated vec(beta).
* Divide each non-zero element of the beta column by s_k; multiply
each non-zero element of column k of alpha by s_k.
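The rescaling step above can be sketched as follows (a toy version
under my reading of the procedure; scale_info is a hypothetical
mapping from beta column k to the saved pair (j, v_j), with j a
0-based index into vec(beta)):

```python
import numpy as np

def rescale(beta, alpha, scale_info):
    """Rescale estimated beta and alpha, column by column, using
    the information saved from the scaling rows. Dividing the zero
    elements too is harmless, so we scale whole columns.
    """
    p1 = beta.shape[0]
    for k, (j, v) in scale_info.items():
        s = beta[j % p1, k] / v   # s_k = b_j / v_j
        beta[:, k] /= s
        alpha[:, k] *= s
    return beta, alpha
```

Note that dividing a beta column by s_k while multiplying the
corresponding alpha column by s_k leaves the product alpha*beta'
unchanged, which is what makes the rescaling legitimate.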
We then recompute the relevant blocks of the observed information
matrix using the rescaled beta and alpha, and I think the variance
comes out right.
This is in CVS and the current Windows snapshot.
Note that the scaling business is _not_ done in case (a) the user
supplies a set of initial values ("set initvals") or (b) the user
specifies using LBFGS rather than switching.
Allin.