On Fri, 24 Feb 2012, Riccardo (Jack) Lucchetti wrote:
On Thu, 23 Feb 2012, Allin Cottrell wrote:
> I know we discussed gretl 2.0 issues a while back, and I know I need to
> re-read what was said then, so that previous work is not just discarded.
> But in the meantime here are a few thoughts off the top of my head.
>
> One general question: do we want to make a big deal of gretl 2.0, or do we
> want to "do a Linus"?
[ ... ]
I don't think we are in the position to gallantly ignore the psychological
and image implications of a major version number change like Linus did.
Compared to Linux, I'd venture to say that gretl is a slightly less
recognisable brand.
Well, you have a point there ;-)
> * Major new functionality: Well, if we're talking C code, then at present
> that means stuff that Jack and I will produce. I put my view on this at the
> 2011 gretl conference: I think we now have a good enough baseline that
> people ought to be able to add functionality to gretl in the form of
> function packages and "addons". I certainly stand ready to fix bugs and
> tweak the C code (including the GUI code and the "gretl server"
> infrastructure) to make that easier. But right now I myself have no plans
> to add major econometric functionality in C form. Jack has been working on
> substantial new stuff, but in the form of (brilliant) hansl code rather
> than C.
Thanks for the kind words, but in my experience hansl is absolutely and by
far the best language to work in from an applied econometrician's viewpoint,
so producing nice hansl code for doing even hard stuff is surprisingly easy.
You may think that mine is a slightly biased opinion ...
Not at all, I agree entirely! But seriously, I take your point,
which strengthens the case for the fabled hansl guide. I've thought
about that too, but haven't yet managed to make a start on it.
There are only three areas in which I see the necessity of low-level code
work (but maybe I'm missing something):
a) Massive parallelisation is definitely the future of scientific
computation. So far, we have cautiously explored some possibilities, but the
time will come when properly parallelising the internals of gretl will become
unavoidable. But that's not for 2.0; I see it more as a 3.0 thing.
Agreed, on both counts.
b) Both hansl and gretl (heh) may strongly benefit from setting up an
infrastructure for managing data sets like Stata does. That is, do the things
that, ideally, you'd use an RDBMS for, but you can't ask an applied economist
to study SQL, can you? I'm talking about dataset merging, splitting, sorting,
variable/cases keeping/dropping, etcetera. Anybody who's ever worked with
large micro data bases knows exactly what I'm talking about. Stata is, to my
knowledge, the only econometrics package that attempts to do this and does
it, in my opinion, badly.
Interesting. We do have _some_ such functionality already: what
would you see as most important to add?
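
For concreteness, the operations listed under (b) (merging, sorting,
keeping/dropping variables and cases) can be sketched as below. This is
only an illustration in Python on plain lists of dicts: every function
name and all the sample data are made up for the example, and none of it
is gretl's or Stata's actual interface.

```python
# Hypothetical helpers: these names are illustrative assumptions,
# not part of gretl, hansl or Stata.

def keep_vars(rows, names):
    """Keep only the named variables in every observation."""
    return [{k: r[k] for k in names} for r in rows]

def drop_cases(rows, predicate):
    """Drop the observations for which predicate(row) is true."""
    return [r for r in rows if not predicate(r)]

def sort_by(rows, key):
    """Return the observations sorted in ascending order of one variable."""
    return sorted(rows, key=lambda r: r[key])

def merge(left, right, key):
    """Left join: each row of `left` gains the variables of the matching
    row of `right`; rows with no match get None for those variables."""
    lookup = {r[key]: r for r in right}
    right_vars = [k for k in right[0] if k != key] if right else []
    out = []
    for row in left:
        match = lookup.get(row[key])
        merged = dict(row)
        for v in right_vars:
            merged[v] = match[v] if match is not None else None
        out.append(merged)
    return out

# Made-up micro data: households, plus a regional lookup table.
households = [
    {"hh_id": 2, "income": 31.0, "region": 1},
    {"hh_id": 1, "income": 45.5, "region": 2},
    {"hh_id": 3, "income": 12.0, "region": 1},
]
regions = [{"region": 1, "unemp": 7.2}, {"region": 2, "unemp": 4.9}]

result = sort_by(merge(households, regions, "region"), "hh_id")
result = drop_cases(result, lambda r: r["income"] < 20)
result = keep_vars(result, ["hh_id", "income", "unemp"])
```

The point of the sketch is only that each operation takes a dataset and
returns a dataset, so they chain naturally; a real implementation would
also have to deal with duplicate keys, missing values and memory use on
large files.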
c) There may be a case for extending the way data are stored in
gretl from a double-only representation to a more general one.
This would enable us to have string and int variables. Allin and I
talked a little about this in Toruń, but this is HUGE. The project
currently contains about 400,000 lines of C code, and my
guesstimate is that at least half of this would have to be
thoroughly revised, if not rewritten. Allin already has done some
rationalisation work in libgretl which makes this a little easier,
but it's a loooooooong way away.
Yes, not for 2.0, I think. But I have thought about this a bit more,
and I'll share those thoughts before long.
> * Purge of bugs and update/completion of documentation: Here I can really
> get on board. One conception of gretl 2.0 is that it has achieved a degree
> of maturity where we have squashed as many bugs as we can find over an
> extended period of testing, and have documented in a reasonably
> comprehensible and cross-referenced form all that the program can do.
Agree.
I think we may have something like a consensus that reasonably
complete documentation (plus whatever goodies we manage to introduce
between now and that point) would be a good excuse for moving the
pointer to 2.0.
Allin