On Fri, 23 Aug 2013, Sven Schreiber wrote:
1) For the 'realtime_start_date' and 'realtime_end_date' columns,
transform the entries from ISO format (%Y-%m-%d) to a number string
(%Y%m%d), removing the hyphens, for example "1983-02-03" ->
"19830203".
2) Also for these columns, replace dots (".", missing values) by
"99999999".
Note that your alfred_test.inp code is "not clever" enough, because
AFAICS it also replaces dots in the value column (in your case, with
"_latest_"), which is unwanted I think. (In the reduced example file,
this didn't occur, but it can in general.)
Also, it's not as simple as just removing all the hyphens in the file;
that would be easy in hansl. But the 'observation_date' column must stay
in ISO date format, as I wrote in yet another message, since specifying
'--time="%Y%m%d"' didn't work.
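
To make the two steps and the caveats above concrete, here is a minimal
Python sketch (not the hansl code discussed in this thread). It assumes a
tab-delimited file with a header row; the delimiter, the "GDP" value
column and the helper names are assumptions for illustration only. Only
the two realtime columns are touched, so dots in the value column and the
ISO dates in 'observation_date' are left alone.

```python
import csv
import io

# Column names taken from the thread; everything else is hypothetical.
REALTIME_COLS = ("realtime_start_date", "realtime_end_date")
MISSING_SENTINEL = "99999999"


def convert_realtime(value):
    """Turn '1983-02-03' into '19830203', and a lone '.' into the sentinel."""
    value = value.strip()
    if value == ".":
        return MISSING_SENTINEL
    return value.replace("-", "")


def preprocess(text, delimiter="\t"):
    """Rewrite only the realtime columns; all other columns pass through."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in REALTIME_COLS:
            row[col] = convert_realtime(row[col])
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row: missing value in the data column, open-ended realtime end.
sample = ("observation_date\tGDP\trealtime_start_date\trealtime_end_date\n"
          "1983-01-01\t.\t1983-02-03\t.\n")
result = preprocess(sample)
print(result)
```

Working column by column rather than with a global search-and-replace is
what keeps the dot in the value column and the hyphens in
'observation_date' intact.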
Here's my take at doing it in hansl (untested), but let's not forget
that the goal is (IMHO) to make the preprocessing unnecessary
altogether, by enabling 'join' to do smaller/greater comparisons on ISO
date strings!
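
As a side note on why comparisons on ISO date strings are feasible at
all: zero-padded %Y-%m-%d strings sort lexicographically in the same
order as the dates they represent. The Python sketch below only checks
that property on a few made-up pairs; it says nothing about how 'join'
is or would be implemented.

```python
from datetime import date

# Made-up date pairs, including a year rollover and an equal pair.
pairs = [
    ("1983-02-03", "1983-11-30"),
    ("1999-12-31", "2000-01-01"),
    ("2013-08-23", "2013-08-23"),
]


def same_ordering(a, b):
    """True if plain string comparison agrees with real date comparison."""
    da, db = date.fromisoformat(a), date.fromisoformat(b)
    return (a < b) == (da < db) and (a == b) == (da == db)


all_agree = all(same_ordering(a, b) for a, b in pairs)
print(all_agree)

# A lone "." (missing value) sorts before any digit in ASCII, so it would
# compare as "earliest", not "latest", unless it is handled separately.
dot_sorts_first = "." < "1983-02-03"
print(dot_sorts_first)
```

The last check is the reason a missing end date still needs special
treatment even with string comparisons: left as ".", it would rank below
every real date instead of above them.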
I agree, it was just a matter of curiosity. And strategy, in a way:
because the fundamental question here is: would it be a good idea to
enhance hansl to the point where you could use it to do things on text
files for which one would normally use tools like perl, python, awk or a
bash script (with the standard unix tools: grep, sed, join, cut, etc.)?
On one hand, this would be very cool; plus, when you write a script like
yours, dependency on python clearly reduces portability somewhat (yeah, I
know...). But on the other hand, one of the strengths of hansl is (IMO)
its being tailored to the needs of the applied economist and not even
trying to be the Swiss army knife. Besides, that's precisely why we
invented the "foreign" block.
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------