On 27.09.2018 at 11:33, Riccardo (Jack) Lucchetti wrote:

I was just thinking about it. I vaguely remember a debate we had on one of the two lists at some point (I may be wrong; I tried googling for it, to no avail) about implementing a mode() function, and we got collectively stuck on the case of multimodal data. Plus, diff-ing the output of ecdf() will give you trouble if the mode happens to be the smallest element of the vector.

True, that was a bug. Try this corrected version:

<hansl>

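function matrix mode(matrix v)
    # v should be a vector
    E = ecdf(vec(v))
    # prepend 0 so that the 1st element is also diffed
    howmuch = diff(0 | E[,2])[2:]
    # row of the most frequent value
    where = imaxc(howmuch)
    # returns the modal value, then its relative frequency
    return E[where, 1] | howmuch[where]
end function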

</hansl>

but then, in case of multi-modal data, what should be done is debatable.
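
One possible convention, purely as a sketch: return every tied value instead of picking one. The helper name modes() and the 1e-12 tolerance below are assumptions made for illustration, not anything that exists in gretl; the tolerance is there because the frequencies come from differencing cumulative sums, so exact equality between ties is not guaranteed.

<hansl>
function matrix modes(matrix v)
    # hypothetical helper: one row per tied mode,
    # column 1 = value, column 2 = relative frequency
    matrix E = ecdf(vec(v))
    # prepend 0 so the smallest value gets a frequency too
    matrix d = diff(0 | E[,2])
    matrix freq = d[2:]
    # flag all values whose frequency matches the maximum (up to rounding)
    matrix tied = abs(freq - maxc(freq)) .< 1e-12
    return selifr(E[,1] ~ freq, tied)
end function

# 2 and 3 each occur twice, so both rows should come back
eval modes({1; 2; 2; 3; 3})
</hansl>

With unimodal data this returns a single row, so a caller can tell the two cases apart by checking the number of rows.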

This is partly inherited from the imax*/imin* suite of gretl
functions. In 2015 we briefly discussed this on-list. I wrote:

' FWIW, (Python's) Numpy's argmax() and argmin() functions
explicitly note that:

"In case of multiple occurrences of the maximum values, the
indices corresponding to the first occurrence are returned."'

And you answered:

'This makes sense. I'm not sure if in fact we follow this policy,
but if we agree we should I can have a go at the C code to make
sure we do, and of course update the docs.'

It looks as if the gretl docs are still silent on the issue.
Actually, I was surprised that imaxc in your example uno apparently
gave the last value; I thought it would return the first of the
multiple modes.
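
A quick way to check the tie behaviour directly, as a minimal sketch (the test vector is made up, and the result is deliberately not stated here since that is exactly what is undocumented):

<hansl>
# the maximum 3 sits in rows 2 and 4
matrix m = {1; 3; 2; 3}
# 2 would mean the first occurrence is reported, 4 the last
eval imaxc(m)
</hansl>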

I see you're using the mode function as a heuristic criterion to find the data dimensions; maybe we could figure out something else.

I'm happy to hear suggestions. The max line length didn't work because of the file structure.

thanks,

sven