On Thu, 29 Apr 2021, Riccardo (Jack) Lucchetti wrote:
Folks,
from time to time I need to check whether a certain object belongs to a set
of integers. For example, suppose you have a panel and you want to select
only certain units, whose id you list in a vector.
This can of course be generalised, so I wrote a little hansl function (I
called it any() because of my lack of imagination: alternative proposals are
very welcome):
<hansl>
set verbose off
set seed 999

function numeric any(numeric X, const matrix A)
    # checks if X belongs to the set A
    if typeof(X) == 1
        # scalar
        scalar ret = max(A .= X)
    elif typeof(X) == 2
        # series
        series valid = ok(X)
        series ret = NA
        smpl valid --dummy
        matrix tmp = { X } .= vec(A)'
        ret = sumr(tmp) .> 0
        smpl full
    elif typeof(X) == 3
        # matrix
        scalar r = rows(X)
        scalar c = cols(X)
        matrix ret = maxr(vec(X) .= vec(A)')
        ret = mshape(ret, r, c)
    endif
    return ret
end function

###
### usage example
###

nulldata 20

# scalar
foo = {1, 3, 11}
loop i = 1 .. 6
    eval any(i, foo)
endloop

# matrix
matrix Z = mrandgen(i, 1, 6, 6, 3)
print Z
eval any(Z, foo)

# series
series x = randgen(i, 1, 6)
x[5] = NA
series y = any(x, foo)
print x y --byobs
</hansl>
If this is useful, we could either (a) add this to the extra package or (b)
make this a native libgretl function. Comments?
Good idea; I think this could be useful in a number of contexts.
I'm reluctant to mess with git master when we're about to release,
but in a local branch I've tried adding a libgretl function to this
effect (diff against master attached).
Two small comments:
(1) I've named my built-in function inset() rather than any().
(2) I've switched the order of the arguments: it seems to me a
little more intuitive to call
inset(S, whatever)
where S is the "set" and "whatever" is the object to be assessed in
terms of membership of the set (scalar, series or matrix).
One further point: hansl has no concept of "set" as such; the doc
for such a function would have to make clear that the "S" argument
is a matrix standing in for a set (with numeric elements only).
Ideally, therefore, S would be a vector with no repeated elements.
But my inset(), like your any(), is permissive on that point. An
arbitrary matrix will be accepted: the row/column structure is
ignored and repeated elements are tolerated but will just slow
things down.
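
To make the permissiveness point concrete, here is a minimal hansl sketch.
It is not the libgretl code from the attached diff: inset_sketch() is a
hypothetical user-level stand-in, restricted to matrix arguments, that
simply uses the (S, whatever) argument order discussed above. Calling
values() on the set up front is my own assumption about how one might
strip the repeated elements; it is not necessarily what the built-in does.

```hansl
set verbose off

# Hypothetical stand-in for illustration only (matrix case only);
# the name inset_sketch is not part of gretl.
function matrix inset_sketch (const matrix S, const matrix X)
    # flatten S and keep its distinct elements: the row/column
    # structure of S is ignored and duplicates are dropped
    matrix s = values(vec(S))
    # compare each element of X with each element of the set
    matrix hit = maxr(vec(X) .= s')
    # restore the shape of X
    return mshape(hit, rows(X), cols(X))
end function

matrix S1 = {1, 3, 11}            # a "clean" set: vector, no repeats
matrix S2 = {3, 1; 11, 3; 1, 1}   # same elements, arbitrary shape, repeats
matrix X = {1, 2, 3; 4, 11, 6}

eval inset_sketch(S1, X)
eval inset_sketch(S2, X)
```

Both calls should give the same 0/1 matrix, since S1 and S2 contain the
same distinct values; the only cost of the messier S2 is extra work
(here, the deduplication step).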
Allin