The mysterious stack() function
by Sven Schreiber
Hi,
a colleague pointed me to the stack() function for handling a specific
case of panel-data import in section 4.5 of the guide. I must admit I
had never been aware of that function, and I have one or two questions
here.
First, it does not appear in the function index (Gretl Command Reference
/ Functions proper, or in the built-in function documentation). Is this
an oversight, or is there a deeper reason?
Secondly, it is well documented in guide section 4.5, but it appears
to be a strange beast: it is not a gretl command (it takes function
arguments in parentheses, for example), and yet there are double-dash
options such as --offset or --length. I don't remember having seen
anything like this in gretl (or hansl) before.
I guess the story here is some path dependence from the early days, but I
wonder if this area could be cleaned up somehow?
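For context, the usage documented in section 4.5 looks schematically like this (the series names and option values here are invented, so treat it as a sketch rather than a tested call):

<hansl>
# schematic only: stack the columns x1..x5 into one long series,
# taking 10 observations from each block, skipping the first 4
series longvar = stack(x1..x5) --length=10 --offset=4
</hansl>

which is exactly the hybrid form in question: function-style arguments combined with command-style double-dash options.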
thanks,
sven
5 years, 10 months
a very strange case of arima (poor) convergence
by oleg_komashko@ukr.net
Dear all,
The script below illustrates the problem.
Findings: an extremely bad lnl in comparison to --x-12-arima;
zero values of the last two parameters
and of their gradients at all iterations;
a strangely large scaling factor.
Note that --x-12-arima gives
nice polynomial roots and an excellent Ljung-Box Q'.
Also note that modtest --autocorr with the default lag
order ($pd) obviously fails because of zero df.
<hansl>
open bad_data.gdt # attached
smpl 1 194
# note strange zeros for b[y_one] and b[y_two]
arima 3 0 0; 1 0 0; diff_series const y_one y_two
lnl1 = $lnl
modtest --autocorr 5
# compare
arima 3 0 0; 1 0 0; diff_series const y_one y_two --x-12-arima
lnl2 = $lnl
modtest --autocorr 5
eval lnl2 - lnl1
arima 3 0 0; 1 0 0; diff_series const y_one y_two --verbose
# Scaling y by 2.18989e+018 !!!
</hansl>
/*
Iteration 1: loglikelihood = -8292.74884180
Parameters: -2.8152e+016 0.65496 -0.090036 -0.16211 -0.44328 0.00000
0.00000
Gradients: 4.7608e-018 4.0881 1.2538 3.3817 -1.2644 0.00000
0.00000 (norm 7.59e-001)
Iteration 2: loglikelihood = -8292.71644305 (steplength = 0.0016)
Parameters: -2.8152e+016 0.66150 -0.088030 -0.15670 -0.44531 0.00000
0.00000
Gradients: 4.5624e-018 1.3507 -1.5003 1.0847 -1.4772 0.00000
0.00000 (norm 5.32e-001)
*/
# etc., etc.
Oleh
6 years
global series and package functions with the same name
by oleg_komashko@ukr.net
Dear all,
In the "shadowing" thread, Allin (correctly) noticed
that there is a situation where it is difficult to
distinguish between series_name(int_lag) and the function call series_name(int_argument).
1) My suggested answer is incorrect, since we can
have
function series name (int i)
and
function list name (int i)
In this case it is impossible to distinguish
between name(int_lag) and name(int_arg).
Worse (note that I obtained the same on Windows 10
with 2018c and on Ubuntu 18.04 with today's Git):
<hansl>
include lp-mfx.gfn
open keane.gdt -q
# the following line simulates the
# situation when a data file
# already has an 'mlogit_mfx' series
series mlogit_mfx = normal()
smpl (year==87) --restrict
logit status 0 educ exper expersq black --multinomial -q
bundle b = mlogit_mfx(status, $xlist, $coeff, $vcv, $sample)
lp_mfx_print(&b)
catch list z = mlogit_mfx(-1 to -2)
ser = mlogit_mfx
eval mean(ser(-1))
</hansl>
########### output
? open keane.gdt -q
Read datafile /usr/local/share/gretl/data/misc/keane.gdt
? series mlogit_mfx = normal()
Generated series mlogit_mfx (ID 19)
? smpl (year==87) --restrict
Full data set: 12723 observations
Current sample: 1738 observations
? logit status 0 educ exper expersq black --multinomial -q
? bundle b = mlogit_mfx(status, $xlist, $coeff, $vcv, $sample)
? lp_mfx_print(&b)
Multinomial logit marginal effects
(evaluated at means of regressors)
note: dp/dx based on discrete change for black
Outcome 1: (status = 1, Pr = 0.0304)
dp/dx s.e. z pval xbar
educ 0.010826 0.0018373 5.8924 3.8069e-09 12.549
exper -0.020829 0.0054490 -3.8225 0.00013211 3.4403
expersq 0.0019936 0.00073506 2.7122 0.0066841 17.199
black -0.011001 0.0076838 -1.4318 0.15221 0.37973
Outcome 2: (status = 2, Pr = 0.1434)
dp/dx s.e. z pval xbar
educ -0.045462 0.0039535 -11.499 1.3298e-30 12.549
exper -0.11360 0.012550 -9.0517 1.4080e-19 3.4403
expersq 0.0076209 0.0016081 4.7392 2.1456e-06 17.199
black 0.065872 0.019682 3.3468 0.00081740 0.37973
Outcome 3: (status = 3, Pr = 0.8263)
dp/dx s.e. z pval xbar
educ 0.034636 0.0042808 8.0911 5.9137e-16 12.549
exper 0.13443 0.013542 9.9268 3.1829e-23 3.4403
expersq -0.0096146 0.0017275 -5.5654 2.6150e-08 17.199
black -0.054870 0.020909 -2.6243 0.0086834 0.37973
? catch list z = mlogit_mfx(-1 to -2)
> z = mlogit_mfx(-1 to
The symbol 'to' is undefined
? ser = mlogit_mfx
Generated series ser (ID 20)
? eval mean(ser(-1))
-0.017113513
We see that both calling the package functions
and 'ser = mlogit_mfx' worked OK.
But in 'list z = mlogit_mfx(-1 to -2)',
'mlogit_mfx' is interpreted as a function.
In my opinion this case can easily be
handled, since no function can have '-1 to -2' as
an argument.
Much worse is mlogit_mfx(-1).
A possible solution is deprecating mlogit_mfx(-1)
in favor of mlogit_mfx(-1 to -1).
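As a side note, with current syntax one can already sidestep the ambiguity in this particular case: the lags() function takes the series as a plain argument, where it cannot be mistaken for a function call (a sketch, untested here):

<hansl>
# unambiguous alternative: generate lags 1 and 2 of the series
list z = lags(2, mlogit_mfx)
</hansl>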
By contrast:
<hansl>
open keane.gdt -q
series lp_mfx_print = normal()
include lp-mfx.gfn
smpl (year==87) --restrict
logit status 0 educ exper expersq black --multinomial -q
bundle b = mlogit_mfx(status, $xlist, $coeff, $vcv, $sample)
lp_mfx_print(&b)
</hansl>
Note the difference between the two scripts:
in the first, a series has the same name as mlogit_mfx();
in the second, a series has the same name as lp_mfx_print().
The rest is identical.
Also note that the confusingly named series is not used anywhere.
Here I have
? open keane.gdt -q
Read datafile /usr/local/share/gretl/data/misc/keane.gdt
? series lp_mfx_print = normal()
Generated series lp_mfx_print (ID 19)
? include lp-mfx.gfn
/home/oleh/.gretl/functions/lp-mfx.gfn
lp-mfx 0.4, 2016-11-10 (Allin Cottrell)
? smpl (year==87) --restrict
Full data set: 12723 observations
Current sample: 1738 observations
? logit status 0 educ exper expersq black --multinomial -q
? bundle b = mlogit_mfx(status, $xlist, $coeff, $vcv, $sample)
? lp_mfx_print(&b)
> lp_mfx_print(&b)
Incomplete expression
Syntax error
Error executing script: halting
> lp_mfx_print(&b)
Maybe it's the pointer argument?
So we need at least two corrections:
1) some package functions stop working;
2) something should be done about series_name(int_lag).
Oleh
6 years
again processor-detection-dependent behavior in arima
by oleg_komashko@ukr.net
Dear all,
1) script
eval $sysinfo
open bad_data.gdt #attached
smpl 1 194
series sty=diff_series/sd(diff_series)
list zli = y_one y_two
y = sty+6.48
arima 3 0 0; 1 0 0; y 0 zli --verbose
Note: on the same PC and OS, the 64-bit build reports blascore = Prescott while the 32-bit build reports blascore = Atom.
2) pc info
Motherboard:
CPU Type QuadCore Intel Pentium N3540, 2666 MHz (32 x 83)
Motherboard Name Lenovo B50-10
Motherboard Chipset Intel Bay Trail-M
System Memory 3978 MB
DIMM1: SK hynix HMT451S6BFR8A-PB 4 GB DDR3-1600 DDR3 SDRAM (11-11-11-28 @ 800 MHz) (10-10-10-27 @ 761 MHz) (9-9-9-24 @ 685 MHz) (8-8-8-22 @ 609 MHz) (7-7-7-19 @ 533 MHz) (6-6-6-16 @ 457 MHz) (5-5-5-14 @ 380 MHz)
BIOS Type Unknown (04/14/2015)
Communication Port Serial port (COM1)
3) Output 1: the system-installed version
# Output 1, 2018d-git, wordlen = 64
gretl version 2018d-git
Current session: 2018-10-26 14:48
? eval $sysinfo
bundle anonymous:
nproc = 4
blascore = Prescott
hostname = DESKTOP-DE5ESQO
os = windows
mpi = 0
blas = openblas
omp_num_threads = 4
omp = 1
blas_parallel = OpenMP
mpimax = 4
wordlen = 64
? open bad_data.gdt
Read datafile C:\Users\Lenovo\Documents\gretl\bad_data.gdt
periodicity: 4, maxobs: 204
observations range: 1950:1 to 2000:4
Listing 4 variables:
0) const 1) diff_series 2) y_one 3) y_two
? smpl 1 194
Full data range: 1950:1 - 2000:4 (n = 204)
Current sample: 1950:1 - 1998:2 (n = 194)
? series sty=diff_series/sd(diff_series)
Generated series sty (ID 4)
? list zli = y_one y_two
Generated list zli
? y = sty+6.48
Generated series y (ID 5)
? arima 3 0 0; 1 0 0; y 0 zli --verbose
NLS: failed to converge after 1605 iterations
Error executing script: halting
> arima 3 0 0; 1 0 0; y 0 zli --verbose
4) Output 2: same PC and OS, 2018c portable
gretl version 2018c
Current session: 2018-10-26 14:50
? eval $sysinfo
bundle anonymous:
nproc = 4
blascore = Atom
hostname = DESKTOP-DE5ESQO
os = windows
mpi = 0
blas = openblas
omp_num_threads = 4
omp = 1
blas_parallel = OpenMP
mpimax = 4
wordlen = 32
? open bad_data.gdt
Read datafile C:\Users\Lenovo\Documents\gretl\bad_data.gdt
periodicity: 4, maxobs: 204
observations range: 1950:1 to 2000:4
Listing 4 variables:
0) const 1) diff_series 2) y_one 3) y_two
? smpl 1 194
Full data range: 1950:1 - 2000:4 (n = 204)
Current sample: 1950:1 - 1998:2 (n = 194)
? series sty=diff_series/sd(diff_series)
Generated series sty (ID 4)
? list zli = y_one y_two
Generated list zli
? y = sty+6.48
Generated series y (ID 5)
? arima 3 0 0; 1 0 0; y 0 zli --verbose
ARMA initialization: using nonlinear AR model
Iteration 1: loglikelihood = -136.938951717
Parameters: 6.4637 0.58059 -0.11046 -0.082556 -0.093679 0.10601
-1.9068
Gradients: 9.2712 10.471 1.3218 -2.0031 -2.1470 3.9715
-2.8401 (norm 3.22e+000)
Iteration 2: loglikelihood = -136.689823002 (steplength = 0.0016)
Parameters: 6.4785 0.59734 -0.10835 -0.085761 -0.097114 0.11237
-1.9113
Gradients: 3.7130 3.7662 -2.9409 -3.1750 -1.7423 -1.1818
-2.5919 (norm 2.14e+000)
Iteration 3: loglikelihood = -136.635783208 (steplength = 0.0016)
Parameters: 6.4839 0.60234 -0.11626 -0.092855 -0.10056 0.10774
-1.9166
Gradients: 1.9617 4.8733 1.0840 1.5885 1.2310 3.6716
-0.80868 (norm 1.60e+000)
Iteration 4: loglikelihood = -136.600859241 (steplength = 0.008)
Parameters: 6.4784 0.61437 -0.13175 -0.090207 -0.091373 0.11082
-1.9254
Gradients: 3.9294 3.0305 2.2406 1.5760 0.43245 2.7506
-0.97313 (norm 2.07e+000)
Iteration 5: loglikelihood = -136.590682758 (steplength = 0.008)
Parameters: 6.4844 0.61344 -0.13942 -0.082111 -0.082786 0.11045
-1.9365
Gradients: 1.7762 4.1148 3.1480 0.21233 -0.75300 3.1392
-1.2209 (norm 1.57e+000)
Iteration 6: loglikelihood = -136.579257175 (steplength = 0.008)
Parameters: 6.4832 0.61301 -0.13869 -0.079072 -0.080908 0.11078
-1.9511
Gradients: 2.1798 3.8598 2.7328 -0.11127 -0.24403 3.3903
-0.55423 (norm 1.62e+000)
Iteration 7: loglikelihood = -136.558904481 (steplength = 0.04)
Parameters: 6.4846 0.59869 -0.13673 -0.098319 -0.057675 0.13257
-1.9918
Gradients: 1.7338 3.3469 1.7212 0.78241 -1.7498 3.2189
-0.55661 (norm 1.47e+000)
Iteration 8: loglikelihood = -136.521260314 (steplength = 0.04)
Parameters: 6.4876 0.56033 -0.15574 -0.11922 -0.052945 0.20112
-2.0607
Gradients: 0.32695 2.4407 0.16140 -1.6302 -2.7718 -1.1861
-1.7788 (norm 1.05e+000)
Iteration 9: loglikelihood = -136.495894655 (steplength = 1)
Parameters: 6.4896 0.58297 -0.15220 -0.11017 -0.066710 0.18134
-2.0469
Gradients: -0.43210 -0.65107 -0.17364 0.15633 0.59707 0.32961
0.36414 (norm 7.63e-001)
Iteration 10: loglikelihood = -136.490592511 (steplength = 1)
Parameters: 6.4886 0.57213 -0.15608 -0.11690 -0.061996 0.19671
-2.0648
Gradients: -0.097512 -0.22807 -0.23333 -0.23077 -0.058053 -0.28839
-0.074467 (norm 3.86e-001)
Iteration 11: loglikelihood = -136.490080201 (steplength = 1)
Parameters: 6.4884 0.56792 -0.15787 -0.11988 -0.060088 0.20265
-2.0736
Gradients: -0.026271 -0.042462 -0.11705 -0.17211 -0.13359 -0.25929
-0.11138 (norm 2.74e-001)
Iteration 12: loglikelihood = -136.489973117 (steplength = 1)
Parameters: 6.4883 0.56748 -0.15809 -0.12036 -0.059698 0.20303
-2.0754
Gradients: 0.014754 0.015626 -0.015933 -0.033397 -0.049127 -0.037086
-0.030232 (norm 1.62e-001)
Iteration 13: loglikelihood = -136.489964277 (steplength = 1)
Parameters: 6.4883 0.56727 -0.15829 -0.12056 -0.059753 0.20348
-2.0763
Gradients: -0.0068714 -0.0023991 6.6724e-006 0.00028924 0.0033536 -0.0024461
0.00092363 (norm 8.33e-002)
Iteration 14: loglikelihood = -136.489964105 (steplength = 1)
Parameters: 6.4883 0.56731 -0.15825 -0.12052 -0.059747 0.20339
-2.0762
Gradients: 0.0010810 -8.4809e-005 1.7192e-005 0.00019304 0.00019005 0.00043713
0.00024676 (norm 3.32e-002)
Iteration 15: loglikelihood = -136.489964103 (steplength = 1)
Parameters: 6.4883 0.56731 -0.15825 -0.12053 -0.059747 0.20339
-2.0762
Gradients: -0.00012818 5.0844e-005 2.7890e-005 2.0559e-005 -3.7774e-005 2.5363e-005
-3.0714e-005 (norm 1.16e-002)
Iteration 15: loglikelihood = -136.489964103 (steplength = 1)
Parameters: 6.4883 0.56731 -0.15826 -0.12053 -0.059747 0.20339
-2.0762
Gradients: -0.00012818 5.0844e-005 2.7890e-005 2.0559e-005 -3.7774e-005 2.5363e-005
-3.0714e-005 (norm 1.16e-002)
--- FINAL VALUES:
loglikelihood = -136.489964103 (steplength = 5.12e-007)
Parameters: 6.4883 0.56731 -0.15826 -0.12053 -0.059747 0.20339
-2.0762
Gradients: -0.00012818 5.0844e-005 2.7890e-005 2.0559e-005 -3.7774e-005 2.5363e-005
-3.0714e-005 (norm 1.16e-002)
Function evaluations: 47
Evaluations of gradient: 15
Model 1: ARMAX, using observations 1950:1-1998:2 (T = 194)
Estimated using AS 197 (exact ML)
Dependent variable: y
Standard errors based on Hessian
coefficient std. error z p-value
---------------------------------------------------------
const 6.48832 0.0467055 138.9 0.0000 ***
phi_1 0.567313 0.159434 3.558 0.0004 ***
phi_2 −0.158255 0.107503 −1.472 0.1410
phi_3 −0.120526 0.130267 −0.9252 0.3549
Phi_1 −0.0597470 0.121899 −0.4901 0.6240
y_one 0.203395 0.233933 0.8695 0.3846
y_two −2.07615 0.389196 −5.334 9.58e-08 ***
Mean dependent var 6.480000 S.D. dependent var 1.000000
Mean of innovations −0.001117 S.D. of innovations 0.488410
Log-likelihood −136.4900 Akaike criterion 288.9799
Schwarz criterion 315.1228 Hannan-Quinn 299.5659
Real Imaginary Modulus Frequency
-----------------------------------------------------------
AR
Root 1 1.0476 -1.1562 1.5602 -0.1328
Root 2 1.0476 1.1562 1.5602 0.1328
Root 3 -3.4083 0.0000 3.4083 0.5000
AR (seasonal)
Root 1 -16.7372 0.0000 16.7372 0.5000
-----------------------------------------------------------
Oleh
6 years
missing values being forward-filled in lagged variables
by Pozdeev, Igor
Hi all,
This looks like a feature but is a bug by my standards: missing values are forward-filled when lags of a variable are taken. In the attached screenshot, the leftmost panel is the original variable, with two missing values; the central panel is the same variable shifted by one period; and the third is the same variable shifted by two periods. Observe how the value 7.1500 fills in the gaps.
Is there a reason for this behavior?
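For what it's worth, a minimal script along these lines should reproduce the issue with artificial data (a sketch, in case the screenshot does not come through on the list):

<hansl>
# minimal reproduction sketch with artificial data
nulldata 10
setobs 1 1 --time-series
series x = normal()
x[4] = NA
x[5] = NA
series x_1 = x(-1)  # lag 1
series x_2 = x(-2)  # lag 2
print x x_1 x_2 --byobs
</hansl>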
Best,
Igor
Igor Pozdeev
Visiting Scholar
NYU Stern School of Business
44 West 4th Street, 9-66
New York, NY 10012
+1-917-657-1120
www.igorpozdeev.me<http://www.igorpozdeev.me/>
6 years
bread() issue with empty list
by Artur T.
Dear all,
I am currently using gretl 2018d-git (2018-10-11) on Win10 (the same
happens on Linux). Trying to read a bundle in which some list is empty
results in an error:
<hansl>
clear
open denmark.gdt -q
bundle b = null
list L = LRM
b.L = L
list X = null # putting "LRY" into the list would work
b.X = X
print b
bwrite(b, "foob")
bundle b2 = bread("foob")
print b2 # results in: 'X': got NULL data value
</hansl>
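Until this is fixed, a possible user-level workaround is to store a list in the bundle only when it is non-empty, e.g.:

<hansl>
# workaround sketch: skip empty lists when building the bundle
list X = null
if nelem(X) > 0
    b.X = X
endif
</hansl>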
Best,
Artur
6 years
Adding the EIA database via the API
by Johannes Lips
Hi all,
I just stumbled across the fact that the EIA provides some or
most of their data through an API. [1] I noticed that you need to
register, but perhaps there is a way around that if we get in
touch with them and explain the possible use case and the options to
them.
I don't know whether there's a proper process for adding new
databases to the gretl database offerings, but I wanted to explore
whether we could access their API with gretl.
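As a quick experiment (not a proper database backend), gretl's curl() and jsonget() functions might already be enough to pull a series from their REST API; note that the URL pattern, series id and key below are placeholders, not tested values:

<hansl>
# hypothetical sketch: fetch one EIA series over HTTP
bundle req = null
req.URL = "https://api.eia.gov/series/?api_key=YOUR_KEY&series_id=SOME.SERIES.ID"
curl(&req)
# the JSON reply lands in req.output; pull a field out with jsonget()
string sname = jsonget(req.output, "$.series[0].name")
</hansl>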
All the best
Johannes
[1] https://www.eia.gov/opendata/register.php
6 years
reserved words as names in foreign datasets
by oleg_komashko@ukr.net
Dear all,
Below is the content of the attached data file:
index,ltm,x1,x2,x3,ols
1,0.098527655736295,0.14198429261587,-0.191116784992981,0.22224443607502,-1.13946634223632
Of course, trying to open it generates:
using delimiter ','
longest line: 92 characters
first field: 'index'
number of columns = 6
number of variables: 6
number of non-blank lines: 2
scanning for variable names...
line: index,ltm,x1,x2,x3,ols
'ols' is a reserved word
For .csv I can easily change 'ols' anywhere.
For .xls(x) I have LibreOffice, if
I use Ubuntu or if I do not want to be a pirate.
How about Stata files, etc.?
I think gretl could change such names itself.
I mean only reserved words: this would open
files that currently fail to open.
So a new possibility, and no backward-incompatible changes.
If I substitute ols_ for ols, everything works.
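In the meantime, for delimited text files a crude workaround is to rewrite the header before opening; this blindly replaces every occurrence of 'ols', so it is only safe when the string appears nowhere else (a sketch, with an invented filename):

<hansl>
# workaround sketch for a CSV with a reserved word in its header
string s = readfile("mydata.csv")
s = strsub(s, "ols", "ols_")  # crude: assumes 'ols' occurs only as a column name
outfile "fixed.csv" --write
    printf "%s", s
end outfile
open fixed.csv
</hansl>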
Oleh
6 years
"shadowing" over-diagnostics
by oleg_komashko@ukr.net
Dear all,
If we create a function whose name coincides
with the name of a local variable in a package,
then calling a package function prints:
In regard to function function_name (package package_name):
Warning: 'some_name' shadows a function of the same name
Example
<hansl>
include lp-mfx.gfn
function void den(scalar x)
eval floor(x)
end function
open keane.gdt -q
smpl (year==87) --restrict
logit status 0 educ exper expersq black --multinomial -q
bundle b = mlogit_mfx(status, $xlist, $coeff, $vcv, $sample)
</hansl>
The cause:
The package function mlogit_pj() has
a local variable named "den"
Oleh
6 years
constant mis-specification and arima convergence problem
by oleg_komashko@ukr.net
Dear all,
########## redundant constant
open denmark.gdt
# note a decent model:
arima 2 1 0; LRM --nc
modtest --autocorr 4
#
# Ljung-Box Q' = 2.40961,
# with p-value = P(Chi-square(2) > 2.40961) = 0.2997
# Real Imaginary Modulus Frequency
# -----------------------------------------------------------
# AR
# Root 1 -1.5828 0.0000 1.5828 0.5000
# Root 2 1.4501 0.0000 1.4501 0.0000
# -----------------------------------------------------------
# so models below
# are seasonally over-differenced
set bfgs_toler default
catch arima 1 1 0; 0 1 1; LRM
err = $error
eval errmsg(err)
arima 1 1 0; 0 1 1; LRM --nc
ma = 10^-5|$coeff
tol = 10^-9
set bfgs_toler tol
set initvals ma
arima 1 1 0; 0 1 1; LRM --verbose
arima 1 1 0; 0 1 1; LRM --x-12-arima
# Note
# Scaling y by 93455.8
set bfgs_toler default
catch arima 1 1 0; 1 1 0; LRM
err = $error
eval errmsg(err)
# arima 1 1 0; 1 1 0; LRM --verbose
# Scaling y by 93455.8
ols diff(LRM) 0 -s
arima 1 1 0; 1 1 0; LRM --nc
ma = 10^-5|$coeff
set bfgs_toler default
set initvals ma
arima 1 1 0; 1 1 0; LRM
arima 1 1 0; 1 1 0; LRM --x-12-arima
# here we have almost the same estimates
# and log-lik
# pol. roots are very decent
#### the main indicator is a high p-value on const
# So another situation with bad
# convergence is when we have a const
# with a very high p-value,
# i.e. a redundant const.
# Some indicators: slowly growing lnl
# and small gradients
############# missing constant
open greene5_1.gdt
logs *
# a decent model
arima 0 1 2; l_realcons
# looking at p-value for theta_1 (0.9356)
# select
matrix qvec = {2}
arima 0 1 qvec; l_realcons
modtest --autocorr
# Test for autocorrelation up to order 4
#
# Ljung-Box Q' = 1.34035,
# with p-value = P(Chi-square(3) > 1.34035) = 0.7196
# pol. roots are well behaved
arima 0 1 qvec; l_realcons --nc
modtest --autocorr
# no problem with convergence but autocorrelation
# We consider
arima 1 1 1; l_realcons
modtest --autocorr
# well-behaved roots but autocorrelation
# Note p-value on const is 1.05e-029
set bfgs_toler default
scalar reallyhuge = 2*$huge
set bfgs_maxgrad reallyhuge
arima 1 1 1; l_realcons --nc
arima 1 1 1; l_realcons --nc --x-12-arima
# some indicators: a large gradient norm
# and roots nearly on the unit circle
Oleh
6 years