On Tue, 15 Apr 2014, GOO Creations wrote:
Hi,
I'm benchmarking the Mahalanobis distance to see how the accuracy and
execution time change with an increasing sample size. As far as I
understand the algorithm, the execution time should grow linearly with
the sample size. The weird thing is that the time grows linearly up
to (and including) 199 samples, but then suddenly drops at 200
samples. I've attached a graph to illustrate this.
I'm using it to do outlier detection. The time drops at 200 samples, but
the accuracy increases without a sudden drop.
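For context, each observation's distance is d_i = sqrt((x_i - mu)' S^{-1} (x_i - mu)): one covariance estimate (whose cost is linear in the number of observations n) plus one fixed-size quadratic form per observation, which is why roughly linear growth in n is the natural expectation. A minimal NumPy sketch of that computation (an illustration only, not gretl's implementation):

```python
import numpy as np

def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean.

    Cost: one (n x k) covariance estimate, one k x k inversion,
    then a quadratic form per row -- linear in n for fixed k.
    """
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False)        # k x k sample covariance
    Sinv = np.linalg.inv(S)
    diffs = X - mu
    # per-row quadratic form diffs[i] @ Sinv @ diffs[i]
    return np.sqrt(np.einsum('ij,jk,ik->i', diffs, Sinv, diffs))

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))     # n = 200 observations, k = 20 variables
d = mahalanobis_distances(X)
```

Large values of d flag potential outliers, which is the use case described above.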
Weird. I'm not seeing this here, and I have no clue as to why this
should happen.
Are you using a C program + libgretl or one of the gretl clients?
Could you try to run this on your system (via the CLI or the GUI) and see
what happens?
<hansl>
set echo off
set messages off
nulldata 300
setobs 1 1 --special-time-series
set seed 15042014
k = 20
# build a list of k standard-normal regressors
list X = null
loop i = 1..k --quiet
    series x_$i = normal()
    list X += x_$i
endloop
series cputime = NA
# time 100 repetitions of mahal at each sample size from 100 up
loop t = 100..$nobs --quiet
    smpl 1 t
    set stopwatch
    loop 100
        mahal X --quiet
    endloop
    cputime[t] = $stopwatch
endloop
gnuplot cputime time --with-lines --output=display
</hansl>
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------