On Fri, 18 Jun 2021, Sven Schreiber wrote:
On 18.06.2021 at 15:05, George Matysiak wrote:
> Is it possible to undertake k-fold cross-validation in estimating
> regression equations? Thanks.
>
Hi, the regularized least squares estimators under Model/Other linear
models come with cross-validation. (For GUI access it could be that you
need a post-2021b snapshot, not sure right now.) If you want to stick to
OLS, I guess you could imitate it by using Ridge with an extremely loose
(near-zero) penalty.
Then there are two contributed packages by Artur (Tarassow),
CvDataSplitter and TsCrossValidation - the latter hasn't been updated in
a while, but actually it could be that he is working on a replacement (?).
My advice is to use Artur's excellent code, but if you just want a
quick-n-dirty solution to take inspiration from, here's one:
<hansl>
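set verbose off
open data7-24
list X = const sqft age city
scalar nfolds = 8
# assign each observation to one of nfolds random folds
series fold = randgen(i, 1, nfolds)
series prederr = 0
loop f = 1 .. nfolds
    # d = 1 for the training observations, 0 for the held-out fold
    series d = (fold != $f)
    # 0/1 weights: estimate on the training observations only
    wls d salepric X --quiet
    # keep $uhat for the held-out observations as their prediction errors
    series prederr += d ? 0 : $uhat
endloop
printf "x-validation criterion (%d random folds) = %12.4f\n", nfolds, sum(prederr^2)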
</hansl>
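Put as a formula, and assuming (as the script does) that after wls with
a 0/1 weight series $uhat also holds y_i - x_i'b for the zero-weight
observations, the printed criterion is the sum of squared out-of-fold
prediction errors. The symbols K, F_f and \hat\beta_{(-f)} are just
notation introduced here, not gretl names:

\[
\mathrm{CV} = \sum_{f=1}^{K} \sum_{i \in F_f}
  \left( y_i - x_i' \hat\beta_{(-f)} \right)^2
\]

where K = nfolds, F_f is the set of observations assigned to fold f, and
\hat\beta_{(-f)} is the OLS estimate computed without fold f. Dividing by
the number of observations would turn this into a mean squared
prediction error, which is easier to compare across samples.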
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------