[Gretl-devel] "join" and MIDAS data

Wednesday, 17 January 2018

Thanks to Sven for getting me going on this. The "join" command now 
supports the "spreading" of high-frequency series (as wanted by our 
MIDAS apparatus) in a single operation. This requires use of the 
--aggr option with parameter "spread". There are two acceptable 
forms of usage, illustrated below. (AWM is a quarterly dataset, and 
hamilton monthly.)

open AWM.gdt -q
join hamilton.gdt PC6IT --aggr=spread

open AWM.gdt -q
join hamilton.gdt PCI --data=PC6IT --aggr=spread

In the first case MIDAS series PC6IT_m3, PC6IT_m2 and PC6IT_m1 are 
added. In the second case "PCI" is used as the base name for the 
imported series, giving PCI_m3, PCI_m2 and PCI_m1.

Only one high-frequency series can be imported in a given "join" 
invocation with --aggr=spread, since a single such import already 
writes several series to the lower-frequency dataset.
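
As a rough sketch of what this restriction means in practice, 
importing two monthly series MIDAS-style takes two separate "join" 
commands. Note the assumptions: PC6US is taken to be a second 
monthly series in hamilton.gdt (check the file before relying on 
this), and the base names PCI and PCU are arbitrary.

<hansl>
open AWM.gdt -q
# one join invocation per high-frequency series
join hamilton.gdt PCI --data=PC6IT --aggr=spread
# PC6US is assumed (not confirmed above) to exist in hamilton.gdt
join hamilton.gdt PCU --data=PC6US --aggr=spread
# each call should have added three series: *_m3, *_m2, *_m1
print PCI_m* PCU_m* -o
</hansl>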

An important point to note: this "--aggr=spread" thing (where we map 
from one higher-frequency series to a set of lower-frequency ones) 
relies on finding a known, reliable time-series structure in the 
"outer" data file. Native gretl data files (gdt, gdtb) will be OK, 
and also well-formed gretl-friendly CSV files, but not arbitrary CSV 
files.

One of the pertinent features of "join" is that in general it 
assumes almost nothing about the structure of the outer data file. 
It just crawls across the rows of the file looking for matches, so 
it can extract time-series data from a file that looks nothing like 
"proper" time series. Here's a case in point:

$ cat thing.csv
thing1,thing2,Z
-1,0,1
999,999,2
1981,1,3
1980,1,4
1982,1,5
3556,14,6

$ cat join-thing.inp
open data9-7
series yr = $obsmajor
series qtr = $obsminor
join thing.csv Z --ikey=yr,qtr --okey=thing1,thing2
print QNC Z -o

Running join-thing.inp works fine: join plods through the nonsense 
in thing.csv, finds three matches (in random order) and puts the "Z" 
data in the right places in the working dataset.
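
One way to check the result, offered as a sketch rather than as part 
of the original script: count the non-missing values of Z and list 
only the matched rows. It assumes thing.csv is in the working 
directory exactly as shown above.

<hansl>
open data9-7 -q
series yr = $obsmajor
series qtr = $obsminor
join thing.csv Z --ikey=yr,qtr --okey=thing1,thing2
# number of observations at which Z received a value
printf "matched observations: %d\n", nobs(Z)
# restrict the sample to the rows where a key match was found
smpl ok(Z) --restrict
print yr qtr Z -o
</hansl>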

If you have difficulty importing data MIDAS-style from a given CSV 
file using --aggr=spread, you might want to drop back to a more 
agnostic, piece-wise approach (agnostic in the sense of assuming 
less about gretl's ability to detect any time-series structure that 
might be present). Here's an example of what I mean:

<hansl>
open hamilton.gdt
# create month-of-quarter series for filtering
series mofq = ($obsminor - 1) % 3 + 1
# write example CSV file: the first column holds, e.g. "1973M01"
store test.csv PC6IT mofq
open AWM.gdt -q
# import monthly components one at a time, using a filter
join test.csv PCI_m3 --data=PC6IT --tkey=",%YM%m" --filter="mofq==3"
join test.csv PCI_m2 --data=PC6IT --tkey=",%YM%m" --filter="mofq==2"
join test.csv PCI_m1 --data=PC6IT --tkey=",%YM%m" --filter="mofq==1"
list PCI = PCI_*
setinfo PCI --midas
print PCI_m* -o
</hansl>
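
The month-of-quarter arithmetic behind the filter can be checked in 
isolation. The following sketch builds an empty monthly dataset (the 
start date 1973:01 is chosen arbitrarily for illustration) and 
prints mofq for one year:

<hansl>
# twelve monthly observations, purely for illustration
nulldata 12
setobs 12 1973:01 --time-series
# $obsminor is the month (1 to 12); map it to position within quarter
series mofq = ($obsminor - 1) % 3 + 1
print mofq -o
</hansl>

The values should cycle 1, 2, 3 within each quarter, so the filter 
"mofq==3" picks out the last month of each quarter, which is what 
the _m3 series is meant to hold.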

The example is artificial in that a time-series CSV file written by 
gretl itself should work OK without special treatment. In fact this 
will work fine:

<hansl>
open hamilton.gdt
store hamilton.csv
open AWM.gdt -q
join hamilton.csv PC6IT --aggr=spread
</hansl>

But you may have to add "helper" columns to a third-party CSV file 
to enable a piece-wise MIDAS join via filtering.

Allin
