On Wed, 18 Sep 2019, Sven Schreiber wrote:
> On 18.09.2019 at 21:13, Allin Cottrell wrote:
>> But one further point occurs to me: When people run such a function
>> does it meet expectations if the observations/rows are a random subset,
>> but in their original order? Or would users expect the order to be
>> shuffled? (Of course it's easy enough to shuffle the rows of a matrix
>> yourself if you want.)
> No, drawing without replacement should not be order-preserving, just
> like doing it with replacement. The last index could be drawn first and
> then should come first, and so on.
OK -- but if the order of the observations doesn't matter, as in a
cross-sectional dataset, it's computationally cheaper to build a
random array of selected rows and then extract them in their
original order.
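To illustrate why the ordered variant can be cheap: one standard way
to pick k of n rows so that they come out already in order is
selection sampling (Knuth's Algorithm S), which needs a single pass
and no shuffle or sort. The following is only a Python sketch of that
general technique, not the gretl code; the name select_in_order and
its arguments are made up for the example.

```python
import random

def select_in_order(n, k, rng=random):
    """Pick k of the indices 0..n-1 without replacement, ascending.

    Selection sampling: one pass over the indices, each k-subset
    being equally likely. Hypothetical helper, for illustration only.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    chosen = []
    for i in range(n):
        remaining = n - i         # indices not yet examined, including i
        needed = k - len(chosen)  # selections still to be made
        # Accept index i with probability needed / remaining.
        if rng.random() * remaining < needed:
            chosen.append(i)
    return chosen
```

The returned indices can then be used to extract the rows directly,
in their original order.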
Anyway, the (still provisional) built-in msample() function now (in
git) shuffles the order by default but takes a third, optional
boolean argument to preserve order: if it has a non-zero value the
selected rows appear in their original order.
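To make the two behaviours concrete, here is a rough Python sketch of
the semantics just described. It is not the msample() implementation
and does not use its actual signature; sample_rows and the
preserve_order argument are hypothetical names chosen for the
example.

```python
import random

def sample_rows(rows, k, preserve_order=False, rng=random):
    """Draw k rows without replacement (illustrative sketch only).

    By default the rows come back in the order they were drawn;
    with preserve_order set they come back in their original order.
    """
    # k distinct row indices, in random (draw) order
    idx = rng.sample(range(len(rows)), k)
    if preserve_order:
        idx.sort()  # restore the original row order
    return [rows[i] for i in idx]

rows = [[i, 10 * i] for i in range(8)]
print(sample_rows(rows, 3))                       # draw order
print(sample_rows(rows, 3, preserve_order=True))  # original order
```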
Any other thoughts on this?
Allin