I was just thinking about it. I vaguely remember a debate we had on one of the two lists at some point (I may be wrong; I tried googling for it, to no avail) about implementing a mode() function, and we got collectively stuck on the case of multimodal data. Plus, diff-ing the output of ecdf() will give you trouble if the mode happens to be the smallest element of the vector.
True, that was a bug. Try this corrected version:
<hansl>
function matrix mode(matrix v)
# v should be a vector
E = ecdf(vec(v))
howmuch = diff(0 | E[,2])[2:] # prepend 0 so that the 1st element is also diffed
where = imaxc(howmuch)
return E[where, 1] | howmuch[where]
end function
</hansl>
but then, in case of multi-modal data, what should be done is debatable.
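One possible convention, sketched here only as an illustration, is to return every value that attains the maximum frequency. The function name modes_all and the 1e-10 tolerance are assumptions made for this sketch, not anything gretl provides:
<hansl>
function matrix modes_all(matrix v)
    # hypothetical helper: v should be a vector
    matrix E = ecdf(vec(v))
    # relative frequency of each distinct value
    matrix d = diff(0 | E[,2])
    matrix howmuch = d[2:]
    # flag all values whose frequency ties with the maximum
    # (the tolerance is an assumption, to absorb rounding error)
    matrix keep = howmuch .> (maxc(howmuch) - 1e-10)
    # one row per mode: value, relative frequency
    return selifr(E[,1] ~ howmuch, keep)
end function

# small bimodal example: 2 and 3 both occur twice
matrix x = {1; 2; 2; 3; 3; 5}
matrix m = modes_all(x)
print m
</hansl>
With unimodal data this reduces to a single row, so a caller that expects exactly one mode would still have to pick a row explicitly.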
This is partly inherited from the imax*/imin* suite of gretl
functions. In 2015 we briefly discussed this on-list. I wrote:
' FWIW, (Python's) Numpy's argmax() and argmin() functions
explicitly note that:
"In case of multiple occurrences of the maximum values, the
indices corresponding to the first occurrence are returned."'
And you answered:
'This makes sense. I'm not sure if in fact we follow this policy,
but if we agree we should I can have a go at the C code to make
sure we do, and of course update the docs.'
It looks as if the gretl docs are still silent on the issue.
Actually, I was surprised that imaxc in your example apparently
gave the last value; I thought it would return the first of the
multiple modes.
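A quick way to see what imaxc does with ties, rather than guessing, is to feed it a vector in which the maximum occurs more than once. This is only a check script with made-up data; it makes no claim about which index gretl reports:
<hansl>
# the maximum (7) occurs in rows 2 and 3
matrix v = {3; 7; 7; 1}
# prints 2 if the first occurrence is reported, 3 if the last
eval imaxc(v)
</hansl>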
I see you're using the mode function as a heuristic criterion to find the data dimensions; maybe we could figure out something else.
I'm happy to hear suggestions. The max line length didn't work because of the file structure.
thanks,
sven