On Sat, 3 Apr 2010, Leandro Zipitria wrote:
> Dear Riccardo,
> thanks for your answer. I managed to export from the database a csv file with
> the same content I reported in my previous mail. So now I can open the file
> in Gretl. Now the difficult part is to transform the database in the
> adequate way in order to obtain a panel data set.
> Thanks for the suggestions about how to design the data set. We know we are
> dealing with a difficult conceptual task. I think I will need additional
> dummies, one for each supermarket, in order to capture specific effects.
> But still, I need to work around the issue of creating the panel.

I'm going back to your original design (sheet 2 of your attached
xls file), where you indicated (if I understood correctly) that
you'd like to treat the supermarket as the cross-sectional "unit",
and each product-price as a distinct variable. Your panel would
then comprise T daily observations for each of n supermarkets (T*n
rows) with m columns holding prices for m goods. (Of course there
may be many missing values if not all supermarkets stock all
commodities.)
Given the structure of the original data this is not particularly
easy to achieve, but it can be done in gretl. Since the task is
complex, I suggest breaking it into two parts:
1) Create a daily time-series for each supermarket (all of the
same length, T), then
2) Stick these time series blocks together to form a panel.
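
For the second step, here is a minimal sketch of how the stacking might go. It is not the attached script, and everything in it is made up for illustration: the block matrices A and B, the series names price1 and price2, and the toy dimensions (n = 2 supermarkets, T = 3 days, m = 2 goods). With real data the blocks would come from step 1.

```hansl
# Hypothetical toy dimensions: n = 2 supermarkets, T = 3 days, m = 2 goods.
# Start with an empty dataset of T*n = 6 rows.
nulldata 6

# A and B stand in for the per-supermarket T x m price blocks from step 1.
matrix A = {1.0, 2.0; 1.1, 2.1; 1.2, 2.2}
matrix B = {0.9, 2.5; 1.0, 2.5; 1.0, 2.6}

# Vertical concatenation: supermarket 1's T rows, then supermarket 2's.
matrix P = A | B

# Copy each column of the stacked matrix into a series.
series price1 = P[,1]
series price2 = P[,2]

# Declare the panel structure: blocks of T = 3 consecutive rows per unit.
setobs 3 1:1 --stacked-time-series
print price1 price2 --byobs
```

The point of the design is that every block has the same length T, so that setobs can infer the unit boundaries from the block size alone.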
I'm attaching an illustrative gretl script that deals with the
first of these tasks. I'm assuming that
a) You can sort your spreadsheet data by year and month, and read
off the "global" starting and ending months (in your sample file
these are 2007/3 and 2009/9).
b) You then sort the spreadsheet data by supermarket, year and
month (in that order), import the data into gretl, and save the
data as a gretl matrix named pdata.mat:

    open supermarkets.xls     # or whatever it's called
    matrix m = {dataset}      # all series in the dataset, as matrix columns
    mwrite(m, "pdata.mat")    # write the matrix to pdata.mat

With that done, you are just about ready to run the attached
script (which may of course need editing to work with your full
data file; it works with your sample file). Look for the comment
    # choose your supermarket here
and set supermkt_index to the index number of the supermarket
for which you want a time series on the current run. One more
thing: I'm using a newly added function, monthlen, which is in
gretl CVS and the snapshots for Windows and OS X, but not in the
1.8.7 release.
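
As an illustration of what monthlen is good for, here is a minimal sketch (again, not the attached script) that counts the number of daily observations, T, between the global starting and ending months. It assumes the 2007/3 to 2009/9 range of the sample file and a 7-day week; if the prices are not recorded on weekends the third argument would be 5 or 6 instead. The names T, yr, mo, m0 and m1 are just for this example.

```hansl
# Assumptions: sample runs from 2007/3 to 2009/9, and prices may be
# observed on all 7 days of the week.
scalar T = 0
loop yr = 2007..2009
    scalar m0 = (yr == 2007) ? 3 : 1     # first month to count in this year
    scalar m1 = (yr == 2009) ? 9 : 12    # last month to count in this year
    loop mo = m0..m1
        # add the number of days in month mo of year yr
        T += monthlen(mo, yr, 7)
    endloop
endloop
printf "T = %d days\n", T
```

With these assumptions the total should come to 945 (306 days in 2007, 366 in 2008 and 273 in 2009), which is the common length each supermarket's series would need to have.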
Allin Cottrell