On Fri, 31 May 2019, Summers, Peter wrote:
I can confirm that something unexpected's going on:
Thanks, Peter. I've found two relevant differences between the
Wooldridge and POE4 versions of the Mroz data.
First, the POE4 version doesn't contain an "lwage" series, so to do
the sorting comparison it's necessary to add it first:
series lwage = log(wage) # or similar command
Second, in POE4 the "wage" values for women who are not in the labor
force are given as zero, while in Wooldridge they're given as NA.
In POE4, sorting by wage and sorting by lwage are not going to produce
the same results at all. Sorting by wage, all the zeros come first. But
a zero wage gives an NA for log(wage), and NAs get sorted to the end.
Hence the completely different results you get from the two regressions
in my test script.
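To make the point concrete, here is a minimal sketch on made-up data
(the six wage values are invented for illustration and are not taken
from either version of the Mroz file). It assumes, as described above,
that log() of a zero wage yields NA and that NAs are sorted to the end:

<hansl>
# artificial data: three zero wages, as in the POE4 coding
nulldata 6
matrix w = {0; 2.5; 0; 1.2; 4.0; 0}
series wage = w
series lwage = log(wage)   # zero wage -> NA (expected, per the above)
dataset sortby wage
print wage lwage --byobs   # zero-wage rows should come first
dataset sortby lwage
print wage lwage --byobs   # NA-lwage rows should now come last
</hansl>

If the two printouts show the zero-wage observations at opposite ends,
that is the ordering difference at work.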
If I run Jack's script on the Wooldridge version of mroz.gdt, I get
"check = 0" in all 4 cases.
But this I can't replicate and don't understand. Are you sure? Could
you try this variant of Jack's test:
<hansl>
open <Wooldridge Mroz data>
scalar check = min(diff(wage)) >= 0 && min(diff(lwage)) >= 0
printf "check: %d (unsorted: should be 0)\n", check
dataset sortby wage
scalar check = min(diff(wage)) >= 0 && min(diff(lwage)) >= 0
printf "check: %d (sorted by wage: should be 1)\n", check
if check != 1
printf "min(diff(wage)) = %g\n", min(diff(wage))
printf "min(diff(lwage)) = %g\n", min(diff(lwage))
endif
</hansl>
Allin