On Thu, 28 May 2009, Sebastián Goinheix wrote:
Hello everyone:
I need to convert a database of individuals into a household-level one. How can I do that?
Something like Stata's "egen", for example: egen equivh=sum(equiv), by(correlat anio).
I couldn't find anything in the manual or the user's guide.
OK, this arguably isn't the most intuitive thing to do, but it's possible
and reasonably general if you treat your dataset as a panel, with
households playing the role of the cross-sectional units and individuals
the role of the time periods.
For example, suppose you have a dataset like this:
hhnum  indnum    x    y
    1       1  6.4    1
    1       2  3.3    4
    1       3  8.6    4
    2       1  0.6    3
    3       1  1.1    1
    3       2  6.5    0
where hhnum is the household identifier, indnum is an identifier for each
individual within the household and x and y are your variables of
interest. This is what you can do:
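<script>
# keep track of the original observations (the panel may get padded)
genr origndx = time
# households act as panel units, individuals as "time periods"
setobs hhnum indnum --panel-vars
# number of individuals in each household
series hhn = pnobs(origndx)
list indvars = x y
list hhvars = hhn
# household mean of each individual-level variable
loop foreach i indvars
    series hh$i = pmean($i)
    list hhvars += hh$i
endloop
# keep one row per household and save the result
smpl indnum=1 --restrict
store hh.gdt hhvars
</script>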
A few comments: households differ in size, so the resulting panel is
unbalanced, and you may need a variable for keeping track of which
observations were in the original dataset and which weren't. In the
example, I called this "origndx". The basic idea is that first you create
a variable holding the number of individuals in the household (hhn). Then,
you define which variables you want to aggregate in the "indvars" list and
you create the corresponding household means via a loop. As the
household-level variables are created, they're added to the "hhvars" list.
Finally, you drop redundant observations via the "smpl" command, keeping
one row per household, and save your results. If you want household totals
rather than means (as in your egen example), note that the total is simply
the mean times the number of valid observations. This is what you'd get:
hhn  hhx  hhy
  3  6.1  3.0
  1  0.6  3.0
  2  3.8  0.5
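
If you want to cross-check these numbers independently of the script, the
same group-wise aggregation can be sketched in a few lines of plain Python.
This is only an illustration on the six example rows, not part of the
original recipe: the function name "aggregate_by_household" and the
dictionary keys are made up for the purpose, and missing values are assumed
away.

```python
from collections import defaultdict

# Example data from the post: (hhnum, indnum, x, y)
rows = [
    (1, 1, 6.4, 1),
    (1, 2, 3.3, 4),
    (1, 3, 8.6, 4),
    (2, 1, 0.6, 3),
    (3, 1, 1.1, 1),
    (3, 2, 6.5, 0),
]


def aggregate_by_household(rows):
    """Hypothetical helper: collapse individual rows to one record per household.

    Returns a dict keyed by hhnum holding the household size (hhn),
    the household means (hhx, hhy) and the household totals (sumx, sumy).
    Assumes no missing values.
    """
    groups = defaultdict(list)
    for hhnum, _indnum, x, y in rows:
        groups[hhnum].append((x, y))

    result = {}
    for hhnum, members in sorted(groups.items()):
        n = len(members)
        sum_x = sum(x for x, _ in members)
        sum_y = sum(y for _, y in members)
        result[hhnum] = {
            "hhn": n,
            "hhx": sum_x / n,  # household mean of x
            "hhy": sum_y / n,  # household mean of y
            "sumx": sum_x,     # household total of x (mean times hhn)
            "sumy": sum_y,     # household total of y
        }
    return result


households = aggregate_by_household(rows)
for hhnum, rec in households.items():
    print(hhnum, rec["hhn"], round(rec["hhx"], 1), round(rec["hhy"], 1))
```

The "sumx" and "sumy" entries correspond to what egen's sum() would give for
each household.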
The script above works with the current version of gretl. However, we might
consider adding a function to automate this to some extent. Allin?
Riccardo (Jack) Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche
r.lucchetti(a)univpm.it
http://www.econ.univpm.it/lucchetti