[Gretl-users] Re: Resample without replacement (Artur Tarassow)

Sunday, 22 September 2019

On Sat, 21 Sep 2019, Sven Schreiber wrote:

 On 21.09.2019 at 18:13, Allin Cottrell wrote:
> On Sat, 21 Sep 2019, Artur Tarassow wrote:

> Since we've collectively devoted a fair amount of time to this issue
> (the sampling without replacement, not just the timing) I think we
> should try to resolve it as expeditiously as possible.
> 
> Sorry, this is going to be a bit long but I'll try to be concise.

 Thanks, Allin. What I still don't really fully understand is the
 features that are needed (or are likely to be needed). Initially I said
 we don't want ordered draws, but I'm not so sure that that couldn't also
 be a sensible option.

I suppose it could be -- though my original preference for allowing
it was really just based on (what I took to be) its relative 
computational simplicity, something I no longer think is very 
important now that I've figured out an efficient algorithm for doing 
the sampling + scrambling in one pass.
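
For readers following along: the message does not spell out the algorithm, so the following is only a sketch of one well-known way to draw without replacement and get the result in random order in a single pass, namely a partial Fisher-Yates shuffle. It is written in Python rather than gretl code, the function name sample_without_replacement is made up for illustration, and it should not be read as the algorithm referred to above.

```python
import random


def sample_without_replacement(n, k, rng=None):
    """Return k distinct integers from range(n), already in random order.

    Illustrative partial Fisher-Yates shuffle; the name and signature
    are hypothetical and not taken from gretl or R.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    rng = rng or random.Random()
    pool = list(range(n))
    for i in range(k):
        # Choose uniformly among the positions not yet fixed (i .. n-1)
        # and swap the chosen element into slot i.
        j = rng.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    # The first k slots hold the sample; no separate scrambling step is needed.
    return pool[:k]


if __name__ == "__main__":
    print(sample_without_replacement(10, 4, random.Random(123)))
```

The loop does k swaps, but building the pool costs time and memory proportional to n, which is the trade-off that matters when n is large and k is small.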

 Something similar goes for the block length.

I think sampling by blocks without replacement is really problematic 
and probably not to be messed with. Do you have a response to my
diagnosis in the following message?

https://www.mail-archive.com/gretl-users@gretlml.univpm.it/msg14200.html

The second possible approach I mention there would be 
straightforward to implement but I seriously doubt it would have any 
desirable statistical properties.

 Maybe Artur's reference to R's sample function is a hint that we should
 study more closely what that function does?

The signature is

   sample(x, size, replace = FALSE, prob = NULL)

where "size" is the number of draws and "prob" is an optional vector 
of weights. As Artur said, sampling by blocks is not supported. 
There's one more optional boolean argument, useHash, which is 
available only if replace = FALSE, prob = NULL, and size <= n/2.
This indicates that "the hash-version of the algorithm should be 
used" and is recommended for large n. Oddly, I don't see any 
indication of whether useHash is the default when its conditions of 
application are met.
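
To illustrate why a hash-based variant would be tied to a condition like size <= n/2, here is a minimal Python sketch of rejection sampling with a hash set. This is an assumption about the general idea only, not R's actual implementation, and the name sample_hashed is hypothetical. With size <= n/2 each candidate is accepted with probability at least 1/2, so the expected number of draws stays below 2 * size, and memory is proportional to size rather than n.

```python
import random


def sample_hashed(n, k, rng=None):
    """Return k distinct integers from range(n) using a hash set.

    Illustrative rejection sampler; hypothetical name, not R's code.
    Restricted to 2 * k <= n so that rejections stay infrequent.
    """
    if k < 0 or 2 * k > n:
        raise ValueError("need 0 <= k <= n / 2")
    rng = rng or random.Random()
    seen = set()   # values already drawn (hash lookup, O(1) on average)
    out = []       # accepted draws, kept in the order they were drawn
    while len(out) < k:
        candidate = rng.randrange(n)
        if candidate not in seen:
            seen.add(candidate)
            out.append(candidate)
    return out


if __name__ == "__main__":
    print(sample_hashed(1_000_000, 5, random.Random(42)))
```

Unlike a shuffle over the full index vector, this never allocates anything of length n, which fits the recommendation for large n quoted above.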

Allin
