But why not Silverman’s rule-of-thumb estimator (1.06*sd(X)*n^-0.2)? The
optimal bandwidth *should* be of *order* n^-0.2, but it should also depend on
the spread of X! The exact formula is (R(K)/(\sigma_K^4*R(f'')))^0.2 *
n^-0.2, where R(g) is the integral of g^2 and \sigma_K^2 is the variance of
the kernel K. R(f'') depends on the actual density, hence on the
distribution of the X’s, and hence on their spread.
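To show where the constant 1.06 comes from, here is a short worked derivation. It assumes a Gaussian kernel and a normal density with standard deviation \sigma as the reference; both are assumptions of the normal reference rule, not properties of any particular dataset.

```latex
\begin{aligned}
h^* &= \left(\frac{R(K)}{\sigma_K^4\,R(f'')}\right)^{1/5} n^{-1/5},
\qquad R(g) = \int g(x)^2\,dx,\\
R(K) &= \frac{1}{2\sqrt{\pi}},\quad \sigma_K = 1
\quad\text{(Gaussian kernel)},\\
R(f'') &= \frac{3}{8\sqrt{\pi}\,\sigma^5}
\quad\text{(normal density with standard deviation } \sigma\text{)},\\
h^* &= \left(\frac{4}{3}\right)^{1/5}\sigma\,n^{-1/5}
\approx 1.06\,\sigma\,n^{-1/5}.
\end{aligned}
```

So the n^-0.2 rate is multiplied by a factor proportional to \sigma, which is exactly the scale dependence that a bare n^-0.2 lacks.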
If I have a dataset with, say, length/weight data, it would be very silly
to use the same bandwidth for length in metres and for length in
millimetres: it would over-smooth in the first case and under-smooth
in the second!
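A minimal Python sketch of the units problem, using only the standard library. The data are simulated (hypothetical lengths with mean 1.7 m and sd 0.1 m), and the two function names are made up for this illustration; this is not gretl code.

```python
import random
import statistics

def fixed_bandwidth(x):
    # Scale-free choice h = n^(-1/5): ignores the units of x entirely.
    return len(x) ** -0.2

def normal_reference_bandwidth(x):
    # h = 1.06 * sd(x) * n^(-1/5): rescales together with the data.
    return 1.06 * statistics.stdev(x) * len(x) ** -0.2

random.seed(0)
# Hypothetical lengths: mean 1.7 m, sd 0.1 m (simulated, not real data).
length_m = [random.gauss(1.7, 0.1) for _ in range(500)]
length_mm = [1000.0 * v for v in length_m]

for label, x in (("metres", length_m), ("millimetres", length_mm)):
    sd = statistics.stdev(x)
    # Print each bandwidth expressed in units of the sample standard deviation.
    print(label,
          round(fixed_bandwidth(x) / sd, 4),
          round(normal_reference_bandwidth(x) / sd, 4))
```

With n = 500 the fixed bandwidth is about 0.29 in whatever units the data happen to use, i.e. roughly three standard deviations in metres but only a few thousandths of one in millimetres, whereas the rule-of-thumb bandwidth stays at about 0.31 standard deviations in both cases.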
For reference, see Scott (2015) “Multivariate Density Estimation”, 2nd ed.,
p. 144, “Normal reference rule”.
R implements this rule as bw.nrd. In fact, it guards against outliers
by using 1.06*min(sd(X), IQR(X)/1.34)*n^-0.2. (The default bandwidth in
density(), bw.nrd0, is the same expression with 0.9 in place of 1.06.)
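The outlier guard can be sketched in a few lines of Python (standard library only). The function name bw_nrd_sketch is hypothetical, the data are artificial, and the "inclusive" quantile method is chosen here to mimic R's default quantile type; treat that correspondence as an assumption rather than a guarantee of identical output.

```python
import statistics

def bw_nrd_sketch(x):
    # Robust spread: the smaller of sd(x) and IQR(x) / 1.34.
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    spread = min(statistics.stdev(x), (q3 - q1) / 1.34)
    return 1.06 * spread * len(x) ** -0.2

clean = [float(v) for v in range(1, 101)]  # artificial data: 1, 2, ..., 100
dirty = clean[:-1] + [10000.0]             # same data with one gross outlier

for label, x in (("clean", clean), ("with outlier", dirty)):
    # Compare the plain sd-based rule with the guarded version.
    plain = 1.06 * statistics.stdev(x) * len(x) ** -0.2
    print(label, round(plain, 2), round(bw_nrd_sketch(x), 2))
```

On the clean sample the two bandwidths coincide because sd(X) is the smaller of the two spread measures. With the outlier the standard deviation explodes while the IQR barely moves, so the guarded bandwidth stays in a sensible range.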
Or is anyone in favour of the old rule for the sake of backwards
compatibility?
--
Yours sincerely,
Andreï V. Kostyrka.
http://kostyrka.ru,
http://kostyrka.ru/blog
On 23 May 2018 at 12:03, Sven Schreiber <svetosch(a)gmx.net> wrote:
On 08.12.2017 at 13:24, Sven Schreiber wrote:
>
> the doc for nadarwat says: "a popular choice is n^-0.2". Why not make
> that the default value officially, such that the third/final arg becomes
> optional?
>
>
I think this was never resolved, because the thread got distracted (by
myself). So what about it?
thanks,
sven