Responding to Allin's suggestion:
> for series x, in which case all the dummies are generated; and
> also support
>
> list L = dummify(x, val)
>
> which treats 'val' as the omitted category. (That is, the second
> argument to dummify() is optional).
> That leaves a question: is it easier/more intuitive to read 'val'
> as denoting the val'th category when the distinct values of x are
> ordered, or as the condition x == val? I tend to think the latter
> is better.
I agree. It is very difficult to ensure that the first option
produces predictable results in a function context when there might be
missing categories. Hence, in practice one would have to adopt a form
such as "list DL = dummify(x, max(x))".
However, without wanting to raise unnecessary difficulties, won't this
imply a change in the use of "dummify(x)" as an argument in,
say, OLS as in "OLS y Z dummify(x)"? At the moment, this
seems to drop one category automatically, so that list Z can contain
const. I assume that this is the backward-incompatible change and
that you intend to let the OLS function deal with linear dependence
between Z and dummify(x).
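
To make both points concrete, here is a rough sketch in plain Python
(not gretl code; the helper names dummies_by_value and
dummies_by_position are made up for illustration and are not part of
any gretl API). It shows that the positional reading of 'val' picks a
different omitted category once a category is missing from the sample,
while the x == val reading does not, and that a full set of dummies
sums to 1 in every row and so duplicates const.

```python
# Illustrative sketch only: plain Python, not gretl's dummify().

def dummies_by_value(x, omit=None):
    """Map each distinct value of x (except `omit`) to its 0/1 column."""
    return {v: [int(xi == v) for xi in x] for v in sorted(set(x)) if v != omit}

def dummies_by_position(x, k):
    """Omit the k'th (1-based) distinct value of x in sorted order."""
    return dummies_by_value(x, omit=sorted(set(x))[k - 1])

full_sample = [1, 2, 3, 4, 2, 3]
sub_sample = [1, 3, 4, 3]  # category 2 happens to be missing here

# Positional reading: "2" omits value 2 in one sample but value 3 in the other.
pos_full = sorted(dummies_by_position(full_sample, 2))
pos_sub = sorted(dummies_by_position(sub_sample, 2))

# Value reading: the omitted category is x == max(x) in both samples.
val_full = sorted(dummies_by_value(full_sample, omit=max(full_sample)))
val_sub = sorted(dummies_by_value(sub_sample, omit=max(sub_sample)))

# A full set of dummies sums to 1 in every row, i.e. it replicates const.
all_cols = dummies_by_value(full_sample)
row_sums = [sum(col[i] for col in all_cols.values())
            for i in range(len(full_sample))]

print(pos_full, pos_sub)   # [1, 3, 4] [1, 4]
print(val_full, val_sub)   # [1, 2, 3] [1, 3]
print(row_sums)            # [1, 1, 1, 1, 1, 1]
```

With the positional reading the retained dummies change meaning between
the two samples, which is the unpredictability I have in mind for
function code.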
Gordon