On 11.12.2024 at 07:57, Artur T. wrote:
On 05.12.24 at 16:05, Sven Schreiber wrote:
> On 05.12.2024 at 15:17, Artur T. wrote:
>>
>> Let me add another point. Currently, we have the following packages:
>>
>> - Friedman.gfn
>> - KruskalWallis.gfn
> Actually, there is a 3rd package by Yi-Nung Yang named "mwu" for
> conducting the "Mann-Whitney U test with group dummy variable".
Good point, thanks for reminding us. I've dug through my email archives,
and for the record here are some points that I believe are useful to know
in this context:
- The Mann-Whitney test is the same thing as the Wilcoxon rank-sum test.
Unfortunately, the help text of mwu is a bit misleading here, since it only
says "Mann-Whitney U test is similar to non-parametric Wilcoxon rank sum
test". (The parametric t-test is mentioned in the same place, so the
statement is not wrong, but it fails to mention that Mann-Whitney and
rank-sum are equivalent.)
- Maybe not universally known: gretl runs the Wilcoxon rank-sum test on
x and y via the command "difftest x y --rank-sum", and the test is also
easily accessible in the menu under Tools/Non-parametric Tests/Difference Test
(where there is an option button "Wilcoxon rank sum test").
- What's the remaining purpose of the mwu.gfn package then? It assumes a
different data layout. Instead of having the two samples in two separate
series x and y, it takes the data to be in one single series, plus a
binary indicator series d which says whether an observation belongs to
the first sample or the second one. Below I'm attaching a hansl script
with a working demo and comparison.
- Maybe we could have such a dummy-based interface as an option in the
standard Difference test dialog window as well? The extension would be
to accept a binary dummy as Variable 2, and if that's the case, then
split Variable 1 into two subsamples and run the test on those, like
mwu.gfn does. Perhaps an additional tick box would be needed to actually
convey the meaning of the dummy, I'm not sure.
- (Technical note: the dummy trick shown below works nicely at the hansl
level with the --rank-sum option, but fails due to missing values with
the --sign and the --signed-rank options. Some more data handling would
be needed there, but that's perhaps slightly off-topic here.)
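To make the equivalence from the first point concrete, here is a minimal
sketch (my own illustration, not taken from mwu.gfn or from gretl's
difftest code) that computes the rank sums W and the U statistics by hand
from the dummy-based layout. The seed value and the variable names are
arbitrary assumptions. The two statistics differ only by a constant that
depends on the group size, U = W - n(n+1)/2, which is why the two tests
are the same thing:

<hansl>
open denmark.gdt --quiet
set seed 20241211               # arbitrary, only for reproducibility
series d = randgen(b, 0.3, 1)   # artificial Bernoulli grouping
series r = ranking(LRM)         # ranks in the pooled sample
scalar n1 = sum(d)              # size of the d == 1 group
scalar n0 = $nobs - n1          # size of the d == 0 group
scalar W1 = sum(d * r)          # Wilcoxon rank sum, d == 1 group
scalar W0 = sum((1 - d) * r)    # Wilcoxon rank sum, d == 0 group
scalar U1 = W1 - n1 * (n1 + 1) / 2   # Mann-Whitney U, d == 1 group
scalar U0 = W0 - n0 * (n0 + 1) / 2   # Mann-Whitney U, d == 0 group
printf "n1 = %d, n0 = %d\n", n1, n0
printf "W1 = %g, U1 = %g\n", W1, U1
# sanity check: the two U statistics must add up to n1*n0
printf "U1 + U0 = %g, n1*n0 = %g\n", U1 + U0, n1 * n0
</hansl>

Which of these quantities a given program prints (W or U, and for which
group) varies, so the raw numbers may differ between tools even when the
test decision is identical.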
cheers
sven
------
<hansl>
# Alternative function mimicking functionality of the mwu package
function void ranksumtest_withdum(series x, series d "indicator dummy")
    errorif(!isdummy(d), "need dummy")
    # split x into two series; observations of the other group become NA
    series group_d1 = d ? x : NA
    series group_d0 = !d ? x : NA
    # built-in Wilcoxon rank-sum test on the two subsamples
    difftest group_d1 group_d0 --rank-sum
end function
open denmark
# artificial grouping for illustration
series dummy = randgen(b, 0.3, 1) # Bernoulli
ranksumtest_withdum(LRM, dummy)
## compare with mwu
include mwu.gfn
mwu(LRM,dummy)
</hansl>
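
For contrast, when the two samples already sit in two separate series, the
built-in command can be called directly without any helper function. A
minimal sketch, where pairing LRM with LRY from the same denmark dataset
is an arbitrary choice made only to have two series at hand:

<hansl>
open denmark.gdt --quiet
# two-series layout: each sample is its own series
difftest LRM LRY --rank-sum
</hansl>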