Function to get the powerset
by Henrique Andrade
Dear Gretl Community,
I am really stuck trying to define a function that gives the power set of a
set. Suppose I have a set S:
S = {"A", "B", "C"}
The associated power set, P(S), is:
P(S) = {{ }, {"A"}, {"B"}, {"C"}, {"A", "B"}, {"A", "C"}, {"B", "C"},
{"A", "B", "C"}}
All that I have come up with so far (shame on me!) is this:
strings S = defarray("A", "B", "C")
scalar P_S_len = 2^nelem(S) # the size of the power set
strings P_S = array(P_S_len) # an array with 8 spaces.
Does anyone have any ideas?
Best,
Henrique Andrade
6 years, 2 months
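For the power set question above, one common approach is to let each integer from 0 to 2^n - 1 act as a bit mask that selects the members of one subset. The following is only a sketch of that idea, written in Python rather than hansl; the function name `power_set` is made up for illustration, and the subsets are sorted by size so that they appear in the same order as in the post.

```python
def power_set(items):
    """Return every subset of items, ordered by size and then by position."""
    n = len(items)
    # Bit i of mask decides whether items[i] belongs to the subset.
    index_sets = [[i for i in range(n) if (mask >> i) & 1]
                  for mask in range(2 ** n)]
    index_sets.sort(key=lambda idx: (len(idx), idx))
    return [[items[i] for i in idx] for idx in index_sets]


S = ["A", "B", "C"]
P_S = power_set(S)
print(len(P_S))  # 2 ** len(S) subsets
for subset in P_S:
    print(subset)
```

The same loop structure should carry over to a hansl function that fills an array of 2^nelem(S) elements, but that translation is not shown here.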
warning: gdt-reading bug
by Allin Cottrell
We've just noticed that a bug was introduced into our code for reading
native gretl .gdt data files in August of this year. The bug should be
triggered only rarely, but we thought it wise to issue a warning.
Description of bug: If a gdt file contains "subnormal" values (that
is, floating point values that are too close to zero to be represented
with the usual precision), then when such a file is read on Linux, the
first subnormal value to be found on a given row (observation) will be
incorrectly copied into the remaining columns (series) on that row.
Example: A gdt file containing 10 series has a subnormal for series
number 5 on row 25. Then when the file is read on Linux, that
subnormal will replace the correct values for series 6 to 10 for
observation 25.
Comment: This won't affect the reading of "primary" data (actual
micro- or macroeconomic measurements), which will never contain
subnormal values (we're talking about absolute values less than 10 to
the minus 307). And the bug is not triggered on MS Windows. However,
subnormal values may be produced by some data transformations (such as
squaring very small numbers, or computing the normal CDF of very big
negative values).
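To make the notion of a subnormal value concrete, here is a small Python sketch (not gretl code) showing how the two transformations mentioned above can push a result below the smallest normal double. The helper names `is_subnormal` and `normal_cdf` are illustrative assumptions, and the threshold used is `sys.float_info.min`, roughly 2.2e-308.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest positive normal double


def is_subnormal(x):
    """True if x is nonzero but smaller in magnitude than any normal double."""
    return x != 0.0 and abs(x) < SMALLEST_NORMAL


def normal_cdf(z):
    """Standard normal CDF computed via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


tiny = 1e-160                 # an ordinary (normal) double
squared = tiny * tiny         # about 1e-320, below the normal range
print(is_subnormal(tiny), is_subnormal(squared))
print(normal_cdf(-38.0))      # far-left tail, below the normal range
```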
Fix: This is now fixed in the git source for gretl and also the
current snapshots. And we will put out a new release soon, gretl
2015d.
Diagnostic: If you think a dataset may suffer from this problem,
you can run the script checkdata.inp, from
http://ricardo.ecn.wfu.edu/pub/gretl/checkdata.inp
First load the dataset in question. Then open checkdata.inp and run
it. An affected dataset may produce something like this:
<script-output>
Total number of values examined: 164122
Check for subnormal floating-point values
-----------------------------------------
Total number found: 138
Longest (row) sequence: 138
(occurs at obs 210, starting series ID 461)
Number of sequences (of length >= 2): 1
</script-output>
The symptom of a problem is that we find a consecutive sequence of
subnormal values on one or more rows of the dataset. This could occur
for "natural" reasons but it may indicate corruption. Isolated
subnormals don't indicate the bug. And again, most datasets should
contain no subnormal values.
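The symptom described above, a run of consecutive subnormals along one row, can be searched for as follows. This is a hedged Python sketch of the idea only; it is not the content of checkdata.inp, and the function names and the toy data are assumptions.

```python
import sys


def is_subnormal(x):
    """True if x is nonzero but smaller in magnitude than any normal double."""
    return x != 0.0 and abs(x) < sys.float_info.min


def longest_subnormal_run(row):
    """Return (length, start_index) of the longest run of consecutive subnormals."""
    best_len, best_start = 0, -1
    run_len, run_start = 0, -1
    for j, x in enumerate(row):
        if is_subnormal(x):
            if run_len == 0:
                run_start = j
            run_len += 1
            if run_len > best_len:
                best_len, best_start = run_len, run_start
        else:
            run_len = 0
    return best_len, best_start


def suspicious_rows(data, min_len=2):
    """List (row_index, run_length, start_column) for rows with a long enough run."""
    found = []
    for i, row in enumerate(data):
        length, start = longest_subnormal_run(row)
        if length >= min_len:
            found.append((i, length, start))
    return found


data = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 3e-320, 3e-320, 3e-320],   # run of three, as the bug would produce
    [5e-320, 2.0, 3.0, 4.0],         # isolated subnormal, not a symptom
]
print(suspicious_rows(data))
```

Since the bug copies the first subnormal into the remaining columns, a run that consists of identical values and extends to the end of the row is the pattern most worth a closer look.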
Allin Cottrell
8 years, 2 months
removing nan and inf from a matrix
by Logan Kelly
Hello,
I need to take the log difference of a matrix, i.e. log(M[2:rows(M),] ./ M[1:rows(M)-1,]). Unfortunately, M has elements equal to zero, so I need to replace the resulting NaNs and infs with 0s. This almost works:
M = isnan(M) ? 0 : M
but it does not remove the infs. Any suggestions?
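The distinction at issue is between testing for NaN only and testing for any non-finite value. The sketch below illustrates it in Python rather than hansl, using plain nested lists as a stand-in for a matrix; the function names are made up for the example.

```python
import math


def zero_nan_only(matrix):
    """Replace NaN entries with 0.0; infinities are left untouched."""
    return [[0.0 if math.isnan(x) else x for x in row] for row in matrix]


def zero_nonfinite(matrix):
    """Replace NaN, +inf and -inf entries with 0.0; keep finite entries."""
    return [[x if math.isfinite(x) else 0.0 for x in row] for row in matrix]


M = [[0.5, float("nan")],
     [float("inf"), float("-inf")]]
print(zero_nan_only(M))    # infinities survive
print(zero_nonfinite(M))   # only the finite entry survives
```

In hansl the same effect needs a test that is true for finite values only, in place of isnan().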
8 years, 6 months
Holt-Winters package
by Raul Gimeno
Hello
I've been using the Holt-Winters package but I cannot replicate my
Excel-calculation results with this package.
The starting value from the package for the trend is 245; mine is 166.396. By
running a regression on the full sample I get completely different results
for these starting values, although the same methodology as described in the
help description has been used.
For replication purposes I am sending my Excel spreadsheet, and I would be glad to
understand how these starting values have been effectively calculated.
Thank you for your help
Raul Gimeno
8 years, 7 months
About (Unstable regression results)
by Hélio Guilherme
As a work in progress, we have regression tests (in the software development
sense of the term) being set up at
https://github.com/HelioGuilherme66/gretl/tree/master/gretl-tests/test-gr...
.
Some gretl files are missing (which I hope Allin can provide), there
is no detection of missing applications, and the structure is designed only for
Linux.
Of these 900 gretl test scripts, only 34 fail for me (some due to
missing files and a missing Stata installation). There are some minor numeric differences
when comparing the outputs with the test results Allin provided.
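Minor numeric differences of this kind are often handled by comparing the numbers in two outputs up to a tolerance instead of comparing the text byte for byte. The following Python sketch shows one way to do that; it is not part of the test suite mentioned above, and the function name, the regular expression and the default tolerances are all assumptions.

```python
import math
import re

# Matches integers, decimals and scientific notation such as 1.07e-095.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def outputs_match(actual, expected, rel_tol=1e-6, abs_tol=1e-12):
    """Compare two output texts, allowing small differences in numeric tokens."""
    a_nums = NUMBER.findall(actual)
    e_nums = NUMBER.findall(expected)
    if len(a_nums) != len(e_nums):
        return False
    # The non-numeric text must agree exactly once the numbers are masked out.
    if NUMBER.sub("#", actual) != NUMBER.sub("#", expected):
        return False
    return all(
        math.isclose(float(a), float(e), rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(a_nums, e_nums)
    )


run_a = "dLAT1km 2.68357 1.30054"
run_b = "dLAT1km 2.68356 1.30054"
print(outputs_match(run_a, run_b, rel_tol=1e-5))  # within a loose tolerance
print(outputs_match(run_a, run_b, rel_tol=1e-7))  # outside a strict tolerance
```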
I would like to have more people running the tests, but first I would like to
fix the missing files and create a common testing structure for
Windows, Mac OS X and Linux.
In Linux you will have to do some minor edits:
- correct paths in bin/sitevars
- edit BASE2 variable in bin/refactor_data
I am posting to the users list because I suspect that my emails to the devel
list are being blocked.
Best Regards,
Hélio
8 years, 8 months
question - Unstable regression results
by Mikael Postila
Hi
We've been experiencing a problem where exactly the same data yield different
regression results depending on A) which computer is used and B) when the
regression is run on the same computer.
This is the third time the problem has arisen (the first time was in February, the
next in March and the third now in April).
The issue typically arises in the location-specific variable(s) and affects the
last digit of the coefficient.
An example is given below.
Here the variable dLAT1km gets the value 2.68357. Using the same data on the next
day, the coefficient is 2.68356. Everything else is the same.
This particular error has now been replicated on two computers running
regressions today and yesterday!
My colleague is doing a test run later this evening to see if the
problem arises between two runs during the same day.
                 coefficient    std. error    t-ratio   p-value
----------------------------------------------------------------------
const            9.61714        0.0383124     251.0     0.0000     ***
size             -0.0360853     0.00170093    -21.22    1.07e-095  ***
size2            0.000328265    2.48074e-05   13.23     2.61e-039  ***
size3            -1.04849e-06   1.14382e-07   -9.167    6.99e-020  ***
age              -0.00931640    0.00120263    -7.747    1.14e-014  ***
age2             -0.000252806   4.43133e-05   -5.705    1.23e-08   ***
age3             4.79820e-06    4.62571e-07   10.37     5.91e-025  ***
D_c_erinom       0.0248313      0.00777210    3.195     0.0014     ***
D_c_tyyd         -0.0711856     0.00418330    -17.02    3.68e-063  ***
D_c_huonot       -0.147710      0.0139738     -10.57    7.66e-026  ***
apartment_sauna  0.0611526      0.00529751    11.54     1.94e-030  ***
Q_7              -0.00308946    0.00694088    -0.4451   0.6563
Q_6              -0.000603071   0.00697798    -0.08642  0.9311
Q_5              -0.0102530     0.00680023    -1.508    0.1317
Q_4              0.00770398     0.00652315    1.181     0.2377
Q_3              0.00443441     0.00680937    0.6512    0.5149
Q_2              -0.00412167    0.00661880    -0.6227   0.5335
Q_1              0.00559893     0.00676218    0.8280    0.4077
dLAT1km          2.68357        1.30054       2.063     0.0391     **
dLON1km          -6.45504       3.12653       -2.065    0.0390     **
DP_84            -0.119379      0.0860273     -1.388    0.1653
lot_ownership    -0.0894216     0.00785090    -11.39    1.09e-029  ***
d1200            -0.300451      0.00921419    -32.61    1.54e-211  ***
The problem
We run the regressions monthly on two different computers (we verify our
valuation by duplicating the process).
Both computers run the same 64-bit Gretl version and the same version of
Windows 7 (Pro). We have now tested the runs on
subsequent days, and the same problem appears even when using the same computer. Each
run consists of 480 regressions for different properties (this exercise is
done for property valuation) and the problem occurs in c. 1-6 regressions,
which are not the same each time, i.e. the problem is rather small. At
portfolio level it was 7,246e-7 % this time, but at property level the error
can be as large as 0,2%.
The monetary value is not the issue; the real problem is that we can't reproduce
the estimations.
In principle all these regressions ought to be solvable in closed form. We are just
wondering whether one of the following could be the reason:
- some algorithm is used to make the calculations faster
- a random number generator is used somewhere in the Gretl code
- some rounding rule depends on the computer's internal clock (odd/even date)
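As general background to the first guess in the list, floating-point addition is not associative, so any code path that accumulates the same terms in a different order can change the last bits of a result even when the formula is closed-form. Whether this is what happens in the runs described here is not established; the Python sketch below only demonstrates the mechanism.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, added in two different groupings.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)
print(repr(left_to_right), repr(right_to_left))

# math.fsum returns the correctly rounded sum, so it does not depend on order.
print(math.fsum(values) == math.fsum(reversed(values)))
```

A difference confined to the last bits can still change a printed digit when the value sits close to a rounding boundary of the displayed precision.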
Kind Regards
Mikael Postila, MRICS
Head of Analysis
Orava Funds plc
t. +358 (0)50 347 2373
e. Mikael.Postila(a)oravafunds.com
a. Fabianinkatu 14B, FI-00100 Helsinki, FINLAND
8 years, 8 months
Features request
by Andreas Zervas
(Sven:)
Right, I should have thought of that. Still I would be interested to see
a direct comparison between the AB approach there and a competing C
model doing the same thing. (It's a bit off-topic here, but if you have
literature pointers they would be welcome.)
I think Caldara - Kamps (ECB WP, 857 I think) is what you would like to see for the US.
Andreas
8 years, 8 months
Feature request
by Andreas Zervas
Am 25.04.2016 um 12:44 schrieb Andreas Zervas:
> coding, while adding identities in systems seems to me that it only
> extends a capability already present in the FIML estimator. But of
> course the programmers know better.
But why do you want to have identities for those estimators where they
are irrelevant, given a situation where the simulation apparatus is not
yet here?
Answer: For two reasons. First, one can, for example, simulate a basic macroeconomic model and do the transformations needed to impose the accounting identities, e.g. the GDP identity. Second, I would like to be able to add debt to a fiscal VAR as an exogenous variable (see Favero - Giavazzi AEJEP 2012) and simulate forecasts where debt follows the path dictated by the identity governing its evolution; I can do it on my own for a couple of models, but a general solution is preferable.
(Sven:)
> Personally I'm not a big fan of AB models, because I never saw a
> convincing case where you absolutely had to distribute your
> contemporaneous restrictions over two matrices.
>
(Andreas:)
> Answer: I work in fiscal policy issues - there the workhorse is the
> Blanchard - Perotti (2002) identification restrictions, that
> correspond to an AB model.
Right, I should have thought of that. Still I would be interested to see
a direct comparison between the AB approach there and a competing C
model doing the same thing. (It's a bit off-topic here, but if you have
literature pointers they would be welcome.)
Answer: As the most common C models are Cholesky restrictions, placing government spending first as is customary does not make a big difference with AB models for fiscal policy. But it matters a lot for taxes.
cheers,
sven
All best,
Andreas
8 years, 8 months
Features request
by Andreas Zervas
Am 24.04.2016 um 16:12 schrieb Andreas Zervas:
>
> I was wondering whether it would be possible to add some features in
> Gretl that some of us will find extremely useful (provided I guess it is
> not such a big burden to code them).
That's a lot of different features, maybe it would help if you named
some priorities. Apart from that, feel free to create items/tickets on
the Sourceforge feature request tracker for gretl so that these things
are not forgotten.
>
> The first is related to systems: I was wondering whether it would be
> possible to add the ability to include identities in systems estimated
> with estimators other than FIML; it would be very useful to have such an
> ability for forecasting / simulation purposes, even if estimation is not
> fully efficient; in any case OLS / IV estimation of systems should give
> more robust results.
This is actually directly connected to your next point, because as you
say identities for OLS/IV etc. are only needed for forecasting or
simulation, not for estimation.
Answer: Indeed, they are connected. But I think mimicking the EViews model structure would be much more demanding in terms of coding, while adding identities to systems seems to me only to extend a capability already present in the FIML estimator. But of course the programmers know better.
>
> A related extension would be to develop a framework to imitate the
> EViews model functionality: add single (estimated or not) equations or
> systems of equations to a bigger model and simulate this structure. I
> understand that this is much more complex from a computational and
> programming point of view than the previous suggestion.
Yes I suggested something like this to Allin and Jack myself some time
ago. There has been some underlying work in gretl's foundations to make
this feasible at some point at the level of Hansl programming/scripting.
(For example making more use of the 'bundle' collection datatype.)
However, there is only so much that Allin and Jack can do.
Note that if forecasting or simulation is done in an extra function
package then the specification of identities would not be necessary in
gretl's 'system' estimation command. Instead you would provide the
information about the identities in the system to the extended
forecasting package, and that package (not yet existing...) might
internally call gretl's system estimation. Of course, it's all a
question about the user interface of that package / add-on.
>
> Perhaps you could start with systems having recursive structure so as to
> initially avoid the problem of finding the solution satisfying all
> equations in all periods.
Maybe.
>
> A third very useful feature in my opinion relates to graphs: it is very
> useful to have subplots in the same graph, as in matlab / octave.
> graphpg is a substitute with limitations, not least in the number of
> subplots and the way you arrange them.
You mean whether it's 2-by-4 or 4-by-2 for example?
Answer: Or 3-by-3 etc. The 2-by-4 limit seems to me like a logical, yet arbitrary limit that assumes an A4 page with portrait orientation. Gretl itself allows more subplots (4-by-4 I think) in the case of its native SVAR IRF plotting implementation using the GUI.
> Secondary graph request: is it possible to allow for other linestyles
> like the usual dashes, dots etc in Gretl without editing gnuplot files?
This feature is approaching, there has been some discussion already (on
the devel list I think). It depends on the underlying gnuplot version.
Answer: Very nice - eager to see it.
>
> Finally (to Riccardo only) : please, when it is possible add some
> remaining functionalities to SVAR package. First, there is no reason
> (and should not be very difficult) to have AB models for VECMs (as
> eventually these are also VARs).
Happy to hear that there's demand for the SVAR package. Your request is
fair enough, but at least this variant is not very common I think.
Personally I'm not a big fan of AB models, because I never saw a
convincing case where you absolutely had to distribute your
contemporaneous restrictions over two matrices.
Answer: I work in fiscal policy issues - there the workhorse is the Blanchard - Perotti (2002) identification restrictions, that correspond to an AB model. You can do it with IV estimation, but it is useful to have the structural factorization. Nevertheless, in general you are right.
Secondly, please try to add long run
> restrictions on permanent shocks. Essentially, would it be possible to
> allow more general identification restrictions like in Jmulti or Warne's
> SVAR?
Assuming we're talking about the permanent shocks in a cointegrated SVEC
model, yes there were some issues to be resolved about long-run
restrictions if I remember correctly. I think Jack had a preliminary
paper about something related. This is work-in-progress in principle,
but I will openly admit that progress isn't very quick there.
Answer: Ok, yet it will be great help when it is available.
> A related thing is that it would be nice to have a function to simply
> estimate the structural factorization. I think that the necessary
> ingredients to do that are the variance-covariance matrix from the VAR /
> VECM and a pattern matrix holding restrictions and free elements; could
> you have a public function to do just that, or how is it possible to do
> that in the current state of the SVAR package? I tried to do it myself,
> but the code is so complicated that I gave up.
Here I don't quite understand what you mean. Once you estimate the SVAR,
you always get the estimated A or B or C. What else do you mean?
Answer: Think of a Bayesian VAR with a natural conjugate prior - IRFs etc come from simulation. Starting from a draw of the variance covariance matrix of residuals, the only feasible SVAR is one using a Cholesky decomposition. But having a public function to estimate the contemporaneous matrices without needing the full VAR estimation would make it possible to do this kind of analysis.
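The Cholesky case mentioned in the answer needs nothing beyond the residual covariance matrix itself: the lower-triangular factor L with L L' = Sigma serves as the recursive impact matrix. The following Python sketch shows that computation for a single draw of Sigma; it is not the SVAR package's API, and the function name and the 2-by-2 example matrix are made up for illustration.

```python
import math


def cholesky(sigma):
    """Lower-triangular L with L L' = sigma, for symmetric positive definite sigma."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = sigma[i][i] - partial
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (sigma[i][j] - partial) / L[j][j]
    return L


sigma = [[4.0, 2.0],
         [2.0, 5.0]]          # hypothetical draw of the residual covariance
impact = cholesky(sigma)
for row in impact:
    print(row)
```

More general A, B or C restrictions need a numerical optimisation over the free elements of the pattern matrix, which is the part the requested public function would have to expose.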
>
> Thank you for all the good stuff you have given us so far. I hope these
> are not too much to ask.
>
It all depends on your time-frame and patience ;-)
cheers,
sven
Hope I made myself clearer.
Thank you all,
Andreas
8 years, 8 months
Features request
by Andreas Zervas
Dear Allin and Riccardo,
I was wondering whether it would be possible to add some features in Gretl that some of us will find extremely useful (provided I guess it is not such a big burden to code them).
The first is related to systems: I was wondering whether it would be possible to add the ability to include identities in systems estimated with estimators other than FIML; it would be very useful to have such an ability for forecasting / simulation purposes, even if estimation is not fully efficient; in any case OLS / IV estimation of systems should give more robust results.
A related extension would be to develop a framework to imitate the EViews model functionality: add single (estimated or not) equations or systems of equations to a bigger model and simulate this structure. I understand that this is much more complex from a computational and programming point of view than the previous suggestion.
Perhaps you could start with systems having recursive structure so as to initially avoid the problem of finding the solution satisfying all equations in all periods.
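With a recursive structure, each period can be solved equation by equation, with no simultaneous solver. The toy Python sketch below illustrates this with one behavioural equation and one accounting identity; the model, the coefficient values and the function name are invented for the example and do not correspond to any existing gretl feature.

```python
def simulate(y0, investment, government, a=10.0, b=0.5):
    """Simulate C_t = a + b * Y_{t-1} together with the identity Y_t = C_t + I_t + G_t."""
    path = []
    y_prev = y0
    for inv, gov in zip(investment, government):
        c = a + b * y_prev       # behavioural equation (made-up coefficients)
        y = c + inv + gov        # accounting identity, imposed exactly
        path.append({"C": c, "I": inv, "G": gov, "Y": y})
        y_prev = y               # lagged income feeds the next period
    return path


for period in simulate(100.0, [20.0, 20.0], [30.0, 30.0]):
    print(period)
```

Because consumption depends only on lagged income, the ordering of the two equations within a period is unambiguous; a model with contemporaneous feedback would instead require solving the equations jointly in every period.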
A third very useful feature in my opinion relates to graphs: it is very useful to have subplots in the same graph, as in matlab / octave. graphpg is a substitute with limitations, not least in the number of subplots and the way you arrange them.
Secondary graph request: is it possible to allow for other linestyles like the usual dashes, dots etc in Gretl without editing gnuplot files?
Finally (to Riccardo only) : please, when it is possible add some remaining functionalities to SVAR package. First, there is no reason (and should not be very difficult) to have AB models for VECMs (as eventually these are also VARs). Secondly, please try to add long run restrictions on permanent shocks. Essentially, would it be possible to allow more general identification restrictions like in Jmulti or Warne's SVAR?
A related thing is that it would be nice to have a function to simply estimate the structural factorization. I think that the necessary ingredients to do that are the variance-covariance matrix from the VAR / VECM and a pattern matrix holding restrictions and free elements; could you have a public function to do just that, or how is it possible to do that in the current state of the SVAR package? I tried to do it myself, but the code is so complicated that I gave up.
Thank you for all the good stuff you have given us so far. I hope these are not too much to ask.
Andreas Zervas
8 years, 8 months