On 17.09.19 at 09:47, Riccardo (Jack) Lucchetti wrote:
On Tue, 17 Sep 2019, Ioannis A. Venetis wrote:
> I agree, Artur. I recently needed a resample-without-replacement
> function.
>
> I have not checked your code, but I had a look at what I did, and it
> could be a refinement (also following Sven's answer).
>
> Say a matrix A is n x m.
>
> Set k <= m.
>
> rvec = ranking( mrandgen(u,0,1,m,1) )'
>
> A = A[,rvec[1:k]]
>
> (keeps k randomly selected columns of A)
>
> I would like to see such a function in extra.gfn
>
I like it. We could have something like this in extra.gfn:
<hansl>
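function matrix sample_wo_rep(matrix X, scalar k)
    scalar n = rows(X)
    if k > n
        funcerr "k must not exceed the number of rows of X"
    endif
    # the ranks of n uniform draws form a random permutation of 1..n
    matrix sel = ranking(muniform(n,1))
    return X[sel[1:k],]
end function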
### example
X = mnormal(10, 3)
print X
eval sample_wo_rep(X, 6)
</hansl>
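The same ranking trick should also cover the column case Ioannis describes. A minimal sketch, assuming a hypothetical helper name sample_cols_wo_rep (not an existing extra.gfn function):
<hansl>
# Hypothetical helper, not part of extra.gfn:
# keep k randomly chosen columns of A, drawn without replacement
function matrix sample_cols_wo_rep (const matrix A, int k)
    scalar m = cols(A)
    if k < 1 || k > m
        funcerr "k must lie between 1 and the number of columns of A"
    endif
    # the ranks of m uniform draws form a random permutation of 1..m
    matrix sel = ranking(muniform(m, 1))
    return A[, sel[1:k]]
end function

### example
matrix A = mnormal(4, 8)
matrix B = sample_cols_wo_rep(A, 3)
printf "B is %d x %d\n", rows(B), cols(B)
</hansl>
Only the indexing side differs from the row version, so a single function with a row/column switch would presumably work as well.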
Interesting to see that this hansl-based version is almost as fast as
the native msample() function:
<hansl>
clear
set verbose off
function matrix sample_wo_rep (const matrix X,
                               const scalar k)
    scalar n = rows(X)
    if k > n
        funcerr "k must not exceed the number of rows of X"
    endif
    matrix sel = ranking(muniform(n,1))
    return X[sel[1:k],]
end function
# Parameters
scalar R = 100
scalar C = 100
scalar SIZE = 5
if SIZE > R
    stop
endif
scalar RUNS = 10000
matrix X = mnormal(R, C)
matrix runtime = zeros(RUNS, 2)
loop algo = 1..2 -q
    loop i=1..RUNS -q
        set stopwatch
        # algo 1 = hansl version, algo 2 = native msample()
        if algo == 1
            matrix res = sample_wo_rep(X, SIZE)
        else
            matrix res = msample(X, SIZE)
        endif
        runtime[i, algo] = $stopwatch
    endloop
endloop
printf "Mean runtime\n%12.9f\n", meanc(runtime)
printf "Ratio of Jack-to-msample() = %.3f\n",
meanc(runtime[,1])/meanc(runtime[,2])
</hansl>
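Since a single call takes very little time, per-call $stopwatch readings may be dominated by measurement overhead. One alternative, sketched here on the assumption that total CPU time per algorithm is what matters, is to time each loop as a whole (t_hansl and t_native are illustrative names):
<hansl>
set verbose off

function matrix sample_wo_rep (const matrix X, int k)
    scalar n = rows(X)
    if k > n
        funcerr "k must not exceed the number of rows of X"
    endif
    matrix sel = ranking(muniform(n,1))
    return X[sel[1:k],]
end function

matrix X = mnormal(100, 100)
scalar SIZE = 5
scalar RUNS = 10000

# time all calls to the hansl version in one go
set stopwatch
loop i=1..RUNS -q
    matrix res = sample_wo_rep(X, SIZE)
endloop
scalar t_hansl = $stopwatch

# restart the clock and time the native function the same way
set stopwatch
loop i=1..RUNS -q
    matrix res = msample(X, SIZE)
endloop
scalar t_native = $stopwatch

printf "Total runtime (s): hansl = %g, msample() = %g\n", t_hansl, t_native
printf "Ratio of Jack-to-msample() = %.3f\n", t_hansl / t_native
</hansl>
Both totals include the loop overhead, so the ratio is likely to understate the difference between the two functions themselves.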
Artur