On Wed, 18 Sep 2019, Allin Cottrell wrote:
On Wed, 18 Sep 2019, Sven Schreiber wrote:
> On 18.09.2019 at 21:13, Allin Cottrell wrote:
>
>> But one further point occurs to me: When people run such a function
>> does it meet expectation if the observations/rows are a random subset,
>> but in their original order? Or would users expect the order to be
>> shuffled? (Of course it's easy enough to shuffle the rows of a matrix
>> yourself if you want.)
>
> No, drawing without replacement should not be order-preserving, just
> like doing it with replacement. The last index could be drawn first and
> then should come first, and so on.
OK -- but if the order of the observations doesn't matter, as in a
cross-sectional dataset, it's computationally cheaper to build a random array
of selected rows and then extract them in order.
Anyway, the (still provisional) built-in msample() function now (in git)
shuffles the order by default but takes a third, optional boolean argument to
preserve order: if it has a non-zero value the selected rows appear in their
original order.
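
For illustration only, the two behaviours can be sketched in Python as below.
This is not gretl's implementation of msample(): the function name
sample_rows, its parameters and the toy data are all hypothetical, and the
sketch only assumes that rows are drawn without replacement and that a
boolean switch chooses between draw order and original order.

```python
import random

def sample_rows(rows, k, preserve_order=False, rng=None):
    """Draw k distinct rows from `rows` without replacement (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    # random.sample returns k distinct indices in the order they were drawn,
    # so the last row of the input can perfectly well come first.
    idx = rng.sample(range(len(rows)), k)
    if preserve_order:
        # Sorting the drawn indices restores the original relative order.
        idx.sort()
    return [rows[i] for i in idx]

# Toy 8 x 2 "matrix"; the first column doubles as the original row number.
data = [[i, 10 * i] for i in range(8)]
rng = random.Random(12345)

shuffled = sample_rows(data, 4, rng=rng)                      # draw order
ordered = sample_rows(data, 4, preserve_order=True, rng=rng)  # original order
```

With preserve_order set, the first column of the result is always increasing;
without it, the rows come out in whatever order they happened to be drawn.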
Any other thoughts on this?

Personally, I don't much like having both msample() and resample(),
but I can't see an easy way out of this.
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------