Re: [Gretl-devel] The mysterious stack() function

Thursday, 27 September 2018

On 27.09.2018 at 02:00, Allin Cottrell wrote:
...
 On Wed, 26 Sep 2018, Sven Schreiber wrote:

> OK, sorry for the delay. I'm attaching an attempt to read in the data 
> from such a file (as you posted, imf-gdp.csv) and convert it to a 
> gretl matrix right away. This string processing is probably not very 
> efficient, but OTOH it wouldn't be done often, so top speed is not 
> really an issue I guess.
>
> What still has to be done is to shuffle around the various blocks of 
> the matrix to match the wanted dataset layout, but that shouldn't be 
> so difficult.

 Nice work! My only question is: how sure are we that the IMF data 
 files are consistent in their presentation, to the extent that your 
 function will work on any example? Well, also: are there other data 
 sources where panel data are provided in a similar manner (time series 
 running horizontally) and if so, would they also be amenable to your 
 approach? 
The orientation of the data is a matter of the to-be-written second (or 
top-level) function. Probably the user would have to specify whether 
time is in rows or columns and so on.
This present function just reads in a rectangular data array (matrix) 
from a text file, automatically discarding rows that for whatever reason 
differ in their length.
(These could be empty rows or descriptive stuff at the bottom, or rows 
with just the variable names in between.)
This should even cover missing data (unbalanced panels, for example) as 
long as the layout is coherent, in that a missing value is either (a) 
marked by some string code or (b) left empty, with the cell as such still 
signalled by the correct number of column separators. But this hasn't 
been tested yet.
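
To make the row-filtering idea concrete, here is a minimal sketch in Python. It is not the attached hansl function, whose internals are not shown in this message. The name read_rectangular, the list of missing-value codes, and the rule of taking the most common field count as the width of the data block are all assumptions made for illustration.

```python
import csv
import io
import math
from collections import Counter

# Hypothetical missing-value codes; a real file may use others.
NA_CODES = ("", "NA", "n/a", "...")

def read_rectangular(text, delimiter=",", na_codes=NA_CODES):
    """Parse delimited text into a rectangular list of rows of floats.

    Assumption: the most common field count is the width of the data
    block. Rows of any other length (blank lines, titles, footnotes)
    are discarded. Empty cells and cells matching na_codes become NaN.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    width = Counter(len(r) for r in rows).most_common(1)[0][0]

    def to_number(cell):
        cell = cell.strip()
        # float() raises ValueError on any other non-numeric text
        return math.nan if cell in na_codes else float(cell)

    return [[to_number(c) for c in r] for r in rows if len(r) == width]

# Made-up sample: a title line, a blank line and a footnote around the data.
sample = (
    "GDP by country\n"
    "1.0,2.0,3.0\n"
    "4.0,,6.0\n"
    "\n"
    "7.0,n/a,9.0\n"
    "Source: IMF\n"
)
data = read_rectangular(sample)  # three rows of three values, two of them NaN
```

Note that a header row with the same number of fields as the data rows would survive the length filter and then fail in float(), so row and column labels would need separate handling.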

So right now I'm somewhat optimistic that it would work.

...
 One other thought: IMO the most desirable replacement for the existing 
 stack() should probably also be able to handle (I suppose, via some 
 option) the (simpler) case where we have a dataset with one or more 
 blocks of N time-series of length T, where time goes vertically but 
 the individuals are arrayed horizontally, in separate series, and we'd 
 like to stack the N series into one panel series.

 For example, we open a CSV file which has N columns holding GDP 
 1990-2017 for N countries, and we want to panelize GDP. This may be a 
 "join" task; I haven't thought it through yet, but I suspect that even 
 if it's doable via join there may be an advantage in automating it via 
 some pstack() variant. 
If I understand correctly, the resulting matrix from my present function 
would be TxN in this case, and since gretl's native panel format is 
stacked time series (right?), doing vec(M) should then be enough, as in 
"series GDP = vec(M)", possibly preceded by "nulldata nelem(M) 
--preserve" and "setobs 1 1:1 --stacked-time-series" or something.
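
Why vec(M) gives the stacked-time-series order can be checked with a small sketch, again in Python rather than hansl. The vec() helper below only mimics column stacking on a list of lists, and the 3x2 matrix is made-up data (T = 3 periods, N = 2 units), not anything from the IMF file.

```python
def vec(matrix):
    """Stack the columns of a list-of-rows matrix, first column first."""
    n_cols = len(matrix[0])
    return [row[j] for j in range(n_cols) for row in matrix]

# Hypothetical T x N matrix: rows are periods, columns are units.
M = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]
T = len(M)

stacked = vec(M)

# (unit, period) pair implied by each position in the stacked vector
index = [(k // T + 1, k % T + 1) for k in range(len(stacked))]

for (unit, period), value in zip(index, stacked):
    print(unit, period, value)
```

All T observations for unit 1 come first, then all T for unit 2, which is the stacked-time-series layout. As far as I can tell from the setobs documentation, the first argument in the panel case is the block size, so it would presumably be T rather than 1 here.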

cheers,
sven
