I believe that the Stata option - i.e. consistency with the ML
estimator - should be the default option because (a) it is then clear
what is going on, and (b) it is possible to compare the results for
the two step estimator with the ML estimator. (I have a general
preference for the ML estimator because the two step estimator often
seems to generate spuriously low standard errors on the
coefficients.) Further, the user should be discouraged from using
the two step estimator with non-matching missing data for the reason
that you give.
But there is an important secondary consideration. Certain types of
survey design can generate systematic patterns of missing data - this
would include the answers to questions that are only asked if someone
is in the labour force or bought certain goods in the last month. In
such cases, exclusion of all observations with missing data can
seriously compromise the possibility of estimating a model reliably
if the pattern of questions asked/answers is correlated with the
selection probability.
There is an alternative way of analysing such data. It is
straightforward to let a user construct their own two step estimator
with different missing data as follows: (a) estimate the probit model
for selection and generate the Mills ratio as a new variable; then
separately (b) estimate the OLS equation including the Mills ratio as
an additional regressor. The corrections to the standard errors are
not so difficult and
anyone following this route explicitly should know what they are
doing and can be warned in the documentation. All that is needed is
an option in the probit model to generate the Mills ratio as a
post-estimation variable. I haven't checked whether it is there
already but it could easily be added.
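
As a rough illustration of the do-it-yourself route in (a) and (b), here is a minimal Python sketch using only NumPy. Everything in it is an assumption made for the example: the simulated data, the variable names (z, x_obs, mills) and the helper functions fit_probit and inverse_mills are hypothetical and not part of any existing package. The quantity computed is the inverse Mills ratio, pdf/cdf of the probit index, and the OLS standard errors from step (b) are deliberately left uncorrected.

```python
import math
import numpy as np


def norm_pdf(v):
    """Standard normal density."""
    return np.exp(-0.5 * v ** 2) / math.sqrt(2.0 * math.pi)


def norm_cdf(v):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(v / math.sqrt(2.0)))


def fit_probit(Z, s, max_iter=50, tol=1e-10):
    """Probit coefficients by Fisher scoring; Z must include a constant."""
    gamma = np.zeros(Z.shape[1])
    for _ in range(max_iter):
        index = Z @ gamma
        p = np.clip(norm_cdf(index), 1e-10, 1.0 - 1e-10)
        d = norm_pdf(index)
        score = Z.T @ (d * (s - p) / (p * (1.0 - p)))
        info = Z.T @ (Z * (d ** 2 / (p * (1.0 - p)))[:, None])
        step = np.linalg.solve(info, score)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma


def inverse_mills(index):
    """Inverse Mills ratio pdf/cdf evaluated at the probit index."""
    return norm_pdf(index) / np.clip(norm_cdf(index), 1e-10, None)


# Simulated survey (illustrative only): z is observed for everyone,
# while x and y are only recorded for selected respondents.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
x = rng.normal(size=n)
u = rng.normal(size=n)                                   # selection error
e = 0.5 * u + math.sqrt(0.75) * rng.normal(size=n)       # outcome error
selected = (0.5 + 1.0 * z + u) > 0
y = np.where(selected, 1.0 + 2.0 * x + e, np.nan)
x_obs = np.where(selected, x, np.nan)   # question only asked if selected

# Step (a): probit for selection on every row where z is available,
# then store the inverse Mills ratio as a new variable.
Z = np.column_stack([np.ones(n), z])
gamma = fit_probit(Z, selected.astype(float))
mills = inverse_mills(Z @ gamma)

# Step (b): OLS on the rows where the outcome data exist, with the
# inverse Mills ratio added as an extra regressor.
use = selected & ~np.isnan(x_obs) & ~np.isnan(y)
X = np.column_stack([np.ones(n), x_obs, mills])[use]
beta = np.linalg.lstsq(X, y[use], rcond=None)[0]

print("probit coefficients (const, z):", gamma)
print("OLS coefficients (const, x, mills):", beta)
```

The point of the design is that the two steps use different samples: the probit uses all rows with a usable z, whereas dropping every row with any missing value would discard all unselected observations and leave nothing to estimate the selection equation from.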
Gordon Hughes