On Sat, 6 Aug 2016, Sven Schreiber wrote:
hi,
in relation to the feature request about cross-section dependence testing in
panels I've encountered some performance bottleneck.
The "smpl" command is rather CPU-expensive. Here's a different solution
(which can be further optimised a little, but I'm sure you get the idea):
<hansl>
function scalar SvensCD(series u)
series unit = $obsmajor
N = max(unit)
#### more elegant attempt (but slow for some reason...)
rhos = zeros(N,N)
Tcomms = zeros(N,N)
loop i=1..N -q
loop j=(i+1)..N -q
smpl (unit == i || unit == j) --restrict --replace
smpl pxnobs(u) == 2 --restrict # both units have values there
matrix uij = { u } # hope it's stacked, so that upper half is unit i
Tcomms[i,j] = rows(uij) / 2
rhos[i,j] = corr( uij[1: Tcomms[i,j]], uij[Tcomms[i,j] + 1 : ])
endloop
endloop
# Pesaran's CD (unbalanced)
scalar CD = sqrt( 2 / (N*(N-1)) ) * sum( sqrt(Tcomms) .* rhos )
return CD
end function
function scalar corr_na(matrix x, matrix y)
e = ok(x) && ok(y)
H = selifr(x ~ y, e)
return corr(H[,1], H[,2])
end function
function scalar JacksCD(series u)
series unit = $obsmajor
N = max($obsmajor)
T = max(time)
set skip_missing off
set warnings off
mu = mshape({u}, T, N)
OK = ok(mu)
matrix Tcomms = OK'OK
matrix rhos = zeros(N,N)
loop i = 1..N --quiet
x = mu[,i]
loop j = i+1..N --quiet
rhos[i,j] = corr_na(x, mu[,j])
endloop
endloop
scalar CD = sqrt( 2 / (N*(N-1)) ) * sum( sqrt(Tcomms) .* rhos )
return CD
end function
# ------------------------------------------------------------------
open abdata.gdt --quiet
set stopwatch
eval SvensCD(n)
t0 = $stopwatch
eval JacksCD(n)
t1 = $stopwatch
print t0 t1
</hansl>
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------