On Fri, 22 Jan 2016, Sven Schreiber wrote:
On 21.01.2016 at 16:10, Allin Cottrell wrote:
> Your example shows that recursion is a _lot_ faster in julia; so now we
> want a case where recursion is actually needed.
>
One more thought on this: what about the "omit --auto" command? I think
it can be viewed as a recursive procedure, as in this pseudocode:
function <reduced-equation> omit_one_by_one(<estimated-equation>)
    if min(<signif>) < threshold
        eliminate(coeff_where_min(<signif>))
        return omit_one_by_one(<equ_reduced_by_one>)
    else
        return <estimated-equation>
    endif
end function
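The pseudocode above can be turned into running code; here is a minimal sketch in Python, using numpy-based OLS and absolute t-statistics as the significance measure. The names (`omit_one_by_one`, `t_min`) and the threshold convention are illustrative assumptions, not gretl's actual implementation of "omit --auto":

```python
# Minimal sketch of recursive backward elimination, assuming plain OLS
# and |t|-statistics; illustrative only, not gretl's "omit --auto" code.
import numpy as np

def t_stats(X, y):
    """Return OLS estimates and their absolute t-statistics."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, np.abs(beta / se)

def omit_one_by_one(X, y, cols, t_min=2.0):
    """Recursively drop the regressor with the smallest |t| until every
    surviving |t|-statistic is at least t_min; return surviving names."""
    if X.shape[1] == 0:          # safety: nothing left to test
        return cols
    _, t = t_stats(X, y)
    worst = int(np.argmin(t))
    if t[worst] >= t_min:
        return cols              # base case: nothing more to omit
    keep = [j for j in range(X.shape[1]) if j != worst]
    return omit_one_by_one(X[:, keep], y, [cols[j] for j in keep], t_min)

# Demo: y depends on x1 but not x2, so x2 should typically be eliminated.
rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
y = 1.0 + 2.0 * x1 + rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1, x2])
kept = omit_one_by_one(X, y, ["const", "x1", "x2"])
```

Note that the recursion here is trivially rewritten as a loop, which is part of the point under discussion: nothing in the algorithm requires a function calling itself.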
If somebody is already proficient enough in Julia (or some other
JIT-compiled language), I think it would be interesting to compare the
speed to gretl's performance there.
Many things are such that they _can_ be done by recursion (in the
sense of a function calling itself), or they can be done by a
non-recursive iteration (as in gretl's "omit --auto"), or possibly
by a simple closed-form calculation (Fibonacci numbers).
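To make the three routes concrete, here they are for the Fibonacci case, sketched in Python (function names are illustrative):

```python
# Three ways to compute Fibonacci numbers: recursion, iteration, closed form.
import math

def fib_rec(n):
    """Direct recursion: exponential time without memoisation."""
    return n if n < 2 else fib_rec(n - 1) + fib_rec(n - 2)

def fib_iter(n):
    """Non-recursive iteration: linear time."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed(n):
    """Binet's closed form: constant time, exact up to float precision."""
    phi = (1 + math.sqrt(5)) / 2
    return round(phi ** n / math.sqrt(5))
```

All three agree for small n (e.g. fib_rec(12) == fib_iter(12) == fib_closed(12) == 144), but only the first exercises a language's function-call machinery heavily, which is presumably where julia's advantage would show up.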
I was suggesting that we might try to think of calculations relevant
to econometrics that are _best_ solved via recursion, given julia's
huge advantage in that area; I kinda doubt whether auto-omission
falls in that category, though if anyone cares to try that would be
nice.
Another thing to consider: julia is amazingly fast at "general
computation" (almost as fast as C) but once you start using packages
-- such as GLM for regression -- you pay a big cost in set-up time,
and the package code may not be anything like as efficient as the
built-in functions. Here's a trivial example, compounded of examples
from the julia GLM documentation:
<julia>
using GLM, RDatasets
form = dataset("datasets","Formaldehyde")
lm1 = fit(LinearModel, OptDen ~ Carb, form)
cycle = dataset("datasets", "LifeCycleSavings")
fm2 = fit(LinearModel, SR ~ Pop15 + Pop75 + DPI + DDPI, cycle)
</julia>
Running this on my i7 machine takes around 5.8 seconds (the "real"
value from the unix "time" program). Then here's the gretl
equivalent (after having used R to write out the two datasets as
.dta files):
<hansl>
open formaldehyde.dta -q
ols optden 0 carb
open lifecycle.dta -q
ols sr 0 pop15 pop75 dpi ddpi
</hansl>
Running time: 0.017 seconds, or 340 times faster.
We may suppose that there's a big fixed cost in the julia case, so
my next step was to wrap each estimation function/command in a
loop of 100000 replications (and eliminate the printing). That gave:
julia: 30.928s
gretl: 0.747s
OK, so now gretl is only 40 times as fast. What about a million
replications?
julia: 4m21.023s
gretl: 0m6.138s
Still roughly 40× faster, so it's by no means all to do with a fixed
set-up cost.
Once again, I don't doubt there _are_ computations we could
outsource to julia with advantage, but it seems clear that running
regressions via GLM is not one of them.
Allin