On Tue, 15 Apr 2014, Allin Cottrell wrote:
> On Tue, 15 Apr 2014, GOO Creations wrote:
>> I'm benchmarking the Mahalanobis distance to see how the accuracy and
>> execution time change with an increasing sample size. As far as I
>> understand the algorithm, the execution time should grow linearly as the
>> sample size increases. The weird thing is that the time grows linearly up
>> to (and including) 199 samples, but then suddenly drops at 200
>> samples. I've attached a graph to illustrate this.
>
> What implementation of LAPACK/BLAS are you using?
>
> The most demanding task in computing Mahalanobis distance is the inversion
> of the covariance matrix of the selected series, which is performed via
> the LAPACK Cholesky functions dpotrf and dpotri. Depending on the
> implementation, these functions may switch algorithm based on the size of
> the input data (e.g. invoking parallelization when a certain threshold
> size is exceeded).
That's what I had thought too, initially. However, the size of the
covariance matrix doesn't depend on the number of observations, which is
the variable our friend is tracking (unless I misunderstood his message).
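
To make the point concrete, here is a rough pure-Python sketch of the
computation. It is not gretl's actual code: the function names (covariance,
cholesky, mahalanobis) are my own, and instead of forming the inverse
explicitly, as dpotri would, it solves against the Cholesky factor, which
gives the same distances. With n observations on k series, the matrix that
gets factorized is always k x k; only the accumulation loops run over n.

```python
import math
import random


def covariance(X):
    """Column means and sample covariance of X (n rows, k columns).

    The covariance matrix is k x k, whatever the number of rows n.
    """
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    S = [[0.0] * k for _ in range(k)]
    for row in X:  # this loop is the part whose cost grows with n
        d = [row[j] - means[j] for j in range(k)]
        for a in range(k):
            for b in range(k):
                S[a][b] += d[a] * d[b]
    S = [[S[a][b] / (n - 1) for b in range(k)] for a in range(k)]
    return means, S


def cholesky(S):
    """Lower-triangular L such that L L' = S (the job dpotrf does)."""
    k = len(S)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def mahalanobis(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    means, S = covariance(X)
    L = cholesky(S)  # cost depends on k only, not on n
    k = len(S)
    distances = []
    for row in X:
        d = [row[j] - means[j] for j in range(k)]
        # Forward substitution: solve L z = d, so that d' S^{-1} d = z'z.
        z = []
        for i in range(k):
            acc = sum(L[i][m] * z[m] for m in range(i))
            z.append((d[i] - acc) / L[i][i])
        distances.append(math.sqrt(sum(v * v for v in z)))
    return distances


if __name__ == "__main__":
    random.seed(1)
    for n in (199, 200):
        X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
        _, S = covariance(X)
        print(n, "observations -> covariance is", len(S), "x", len(S[0]))
```

So going from 199 to 200 observations leaves the Cholesky step working on a
matrix of exactly the same size; any size-dependent switch inside
dpotrf/dpotri would be triggered by the number of series, not the number of
observations.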
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------