On 21.09.2019 at 21:01, Allin Cottrell wrote:
> On Sat, 21 Sep 2019, Allin Cottrell wrote:
>> On Sat, 21 Sep 2019, Sven Schreiber wrote:
>>
>>> What I still don't really fully understand is the features that are
>>> needed (or are likely to be needed).
>
> Sorry for the bombardment, but one more observation.
> Artur has told us that msample meets his needs (sampling by block not
> required); Ioannis has given us a quite detailed account of his needs
> and my understanding is that msample would meet those too; and Stefan
> has made it plain that bootstrapping (where you might well want sampling
> by block) has no use for sampling without replacement.
> I realize these instances don't necessarily cover all possible uses --
> and I agree that it's worth thinking ahead. But at this point I see a
> fairly strong presumption that msample would be sufficient.
I would very much prefer it if somebody more knowledgeable in this
area than myself took the counter-position. I'm also one of those
guys who stick to the bootstrap 99% of the time, and have much less
exposure to other resampling techniques.
Having said that, for example in section 4.1 of this article there seems
to be subsampling with a block size b: Linton/Maasoumi/Whang (2003),
https://pdfs.semanticscholar.org/9378/bb9e0ae9c7fbf669132ec07f77931a00b1b...
Or Politis & Romano (1994, Annals) see section 3 (stationary time
series...), where they explicitly seem to consider subsampling for
"blocks of size b of the consecutive data...".
(https://projecteuclid.org/euclid.aos/1176325770, open access PDF)
The general idea of the "subsampling for dependent data" method seems to
be: Since we have dependent data, we need to draw blocks. This idea is
similar to a block bootstrap. As we do subsampling, in principle we
would look at *all* possible subsamples (non-overlapping, no
replacement). However, the combinatorics may become too heavy --
implicitly here a lot of data are assumed. Thus instead of looking at
everything, in practice a bunch of random subsamples are drawn (without
replacement). This method works as an approximation given the
established theoretical convergence results.
So if somebody wants to do that, she needs random draws of contiguous
blocks without replacement. Since the aim is not to come close to
exhausting the available data (lots of them by assumption), your
argument (that the number of possible drawn blocks would be random) is
correct but irrelevant.
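
To make the requirement concrete, here is a minimal sketch in Python (not hansl, and not how msample works) of drawing k contiguous blocks of length b from n observations so that no observation is used twice. The names draw_blocks and subsample are hypothetical and exist only for this illustration. The sketch assumes k * b <= n and fixes the number of blocks in advance, so the block count is not random; it picks k distinct "slots" and stretches each into a block, which as far as I can see gives every admissible set of k non-overlapping blocks the same probability.

```python
import random


def draw_blocks(n, b, k, rng=None):
    """Return sorted start indices of k non-overlapping blocks of length b.

    Hypothetical helper for illustration only. Indices are 0-based and
    refer to positions in a sample of n observations.
    """
    rng = rng or random.Random()
    if b < 1 or k < 0 or k * b > n:
        raise ValueError("need b >= 1, k >= 0 and k * b <= n")
    # Shrink every block to a single slot: n - k*(b-1) slots remain.
    # Draw k distinct slots without replacement, in increasing order.
    slots = sorted(rng.sample(range(n - k * (b - 1)), k))
    # Stretch the i-th slot back into a block of length b by shifting it
    # right by the extra b-1 positions taken up by each earlier block.
    return [p + i * (b - 1) for i, p in enumerate(slots)]


def subsample(data, b, k, rng=None):
    """Return k contiguous, non-overlapping blocks of length b from data."""
    starts = draw_blocks(len(data), b, k, rng)
    return [data[s:s + b] for s in starts]


if __name__ == "__main__":
    series = list(range(100))  # stand-in for a long time series
    for block in subsample(series, b=5, k=3, rng=random.Random(123)):
        print(block)
```

With lots of data relative to k * b, as assumed in the subsampling literature above, the constraint k * b <= n is far from binding, which is the point of the argument.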
OK, sorry for any mistakes in this proclamation, I'm just an amateur
browsing the literature there. Given that I'm not using these methods
myself, I can also happily live with gretl functions that do not cover
everything in this area, such as the current msample. But I think in
principle those features would not be absurd.
FWIW,
sven