On Sun, 20 Mar 2016, Riccardo (Jack) Lucchetti wrote:
On Sun, 20 Mar 2016, Sven Schreiber wrote:
>
> Am 20.03.2016 um 00:50 schrieb Allin Cottrell:
>>
>> OK, I take Sven to be saying (and also Artur) that if there's
>> any chance that renaming a series within a loop could lead to
>> mischief, the general ban on using the "rename" command within
>> loops should be maintained until further notice.
>
> It's not so much the renaming as such. So if the solution is to
> say "we'll just allow it" I guess I wouldn't mind. But I thought
> you were saying that some other checks have to be introduced in
> order to distinguish allowed from forbidden uses. And my worry is
> about the side effects of those new checks or extra treatments.
[Note: I've put the devel-list in cc, because I guess this
discussion may get quite technical, and that we may want to
continue it there, so sorry for the cross-posting.]
I've begun to think that we may want to introduce an option to the
loop command to distinguish between a "safe" and a "hi-perf" mode.
Let me explain: initially, the loop .. endloop construct used to
imply a simple repetition of whatever was inside, with only a few
specific caveats and checks: e.g., some commands were disallowed on
the grounds of common sense (e.g., nulldata), some others would
partially change their meaning (e.g., "print" in progressive mode),
etc.
This has been silently changing for a while: as Hansl has grown in
ambitions, it's become evident that sometimes you must use loops
for specialised tasks which involve very few different
instructions, but should ideally run as fast as possible.
Typically, those instructions are the various incarnations of
"genr" and, possibly, basic estimation commands such as ols or
var. Think for example of implementing a bootstrap procedure or
MCMC; you'll never need the full array of Hansl syntax inside
those: genr and user-written functions are normally all you need,
but you'll want those instructions to run as fast as possible.
In the past year or so, Allin has been working on "compiling"
loops, that is, changing the way hansl handles them, going from mere
repetition of their contents to a more efficient treatment. This
is very nice, but very complex, since it's very hard to optimise a
few things the hansl language provides (eg string substitution)
without risking complete breakage.
Therefore, I'm now thinking about the possibility of introducing
something like a --hi-perf option to "loop", which would disallow
some instructions but would make it possible to run the allowed
ones at full speed without having to bother with too many checks.
What do you guys think?

Definitely worth considering, though I'm not sure at this point that
it's the best way to go. (I'm confining this reply to the "devel"
list because, as Jack says, it gets a bit technical.)
In hansl, there are basically three things that can, in principle,
foul up our attempts at compilation (or quasi-compilation) of
statements within loops:
(1) string substitution (using "@"- or "$"- variables that can
change their content from one iteration to the next);
(2) the facility for destroying variables using the "delete"
command; and
(3) the facility for renaming series via "rename".
Point (1) is, I think, taken care of already: if a given statement
within a loop uses string substitution (something we can detect
quite easily) we do not attempt to "compile" it; it always gets
evaluated from scratch each time it comes up for execution. (Note
the implication for writers of hansl scripts: avoid string
substitution if you want your code to run as fast as possible.)
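
To make the handling of point (1) concrete, here is a minimal C sketch of the idea, not gretl's actual code. The names uses_string_subst, loop_stmt, loop_stmt_init and loop_stmt_exec are hypothetical, and the detection rule (an "@" or "$" followed by a letter) is a deliberately crude assumption: it would also flag "$" accessors, which only costs speed, never correctness.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the statement contains '@' or '$'
   immediately followed by a letter. This over-approximates string
   substitution (it also catches '$' accessors), which is safe here
   because a false positive only disables the optimisation. */
int uses_string_subst(const char *stmt)
{
    assert(stmt != NULL);
    for (const char *p = stmt; *p != '\0'; p++) {
        if ((*p == '@' || *p == '$') && isalpha((unsigned char) p[1])) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical record for one statement inside a loop */
typedef struct {
    const char *text;   /* statement source */
    int compilable;     /* 1 if no substitution was detected */
    int n_parses;       /* how many times the text has been parsed */
} loop_stmt;

void loop_stmt_init(loop_stmt *s, const char *text)
{
    s->text = text;
    s->compilable = !uses_string_subst(text);
    s->n_parses = 0;
}

/* Simulates one execution of the statement within the loop:
   a compilable statement is parsed once and then reused, while
   any other statement is parsed afresh on every pass. */
void loop_stmt_exec(loop_stmt *s)
{
    if (!s->compilable || s->n_parses == 0) {
        s->n_parses += 1;   /* stand-in for a real parse step */
    }
    /* ... evaluation of the parsed statement would go here ... */
}
```

Under these assumptions, a statement such as "genr y = x + 1" is parsed once however many times the loop runs, while "genr y_$i = x + 1" is parsed on every iteration.
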
In regard to point (2): in compiled languages such as C it is not
possible to "delete" a variable once you have declared it "on the
stack". It will continue to exist until it "passes out of scope"
according to the rules of the language; there's no way to hasten its
demise.
However (still in C), for objects that use memory allocated "on the
heap" (using functions such as malloc) it is indeed possible to
relinquish that memory, via the free() function. Trying to make use
of a variable after using free() on its associated memory is a
surefire way to crash a C program. C is for grown-ups ("consenting
adults"), but for a language such as hansl we want to ensure that
there's no way a user can crash its implementation. So we have to be
very careful to limit usage of "delete" in hansl such that it cannot
lead to referencing free'd memory. I think that is assured at
present: we have guards in place to disallow "delete" within a loop
when deleting a variable might lead to accessing free'd memory.
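
The kind of guard described above might look roughly like the following C sketch. It is an illustration under stated assumptions, not the actual gretl implementation: the user_var type, the loop_refs counter and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-level variable whose values live on the heap */
typedef struct {
    double *data;    /* memory obtained via calloc */
    size_t n;        /* number of values */
    int loop_refs;   /* compiled loop statements still holding 'data' */
} user_var;

/* Allocates n zero-initialised values; returns 0 on success, 1 on failure */
int user_var_alloc(user_var *v, size_t n)
{
    v->data = calloc(n, sizeof *v->data);
    v->n = (v->data != NULL) ? n : 0;
    v->loop_refs = 0;
    return (v->data != NULL) ? 0 : 1;
}

/* Guarded "delete": returns 0 if the memory was released, or 1 if the
   request was refused because a compiled statement still refers to it. */
int user_var_delete(user_var *v)
{
    assert(v != NULL);
    if (v->loop_refs > 0) {
        return 1;    /* refusing avoids any later use of free'd memory */
    }
    free(v->data);
    v->data = NULL;  /* no dangling pointer is left behind */
    v->n = 0;
    return 0;
}
```

The point of the design is that the refusal happens before free() is called, so the user gets an error message rather than a crash.
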
That leaves point (3), renaming of series in a loop. I think that
can be brought under control much as with "delete" (perhaps a bit
more easily).
I'm not dismissing Jack's suggestion. If a writer of hansl marked a
loop as "hi-perf" that could be understood as (say) committing the
user NOT to use string-substitution, "delete" or "rename" within the
loop, so if any of these things appeared we could immediately flag
an error. I'm just not sure that this is the best way to proceed, IF
we're able reliably to work around the potentially threatening
features.
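
For what it's worth, the up-front check that a "hi-perf" mode would imply can be sketched in C as below. This is only a sketch: the HP_* codes and the functions hiperf_check_line and hiperf_check_body are hypothetical, and treating any "@" or "$" as string substitution is a crude assumption, since a real check would have to tell "$" accessors apart from loop-index substitution.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical result codes for the check */
enum { HP_OK = 0, HP_SUBST, HP_DELETE, HP_RENAME };

/* Returns 1 if the first whitespace-delimited word of line is word */
static int first_word_is(const char *line, const char *word)
{
    size_t n = strlen(word);

    while (isspace((unsigned char) *line)) {
        line++;
    }
    return strncmp(line, word, n) == 0 &&
        (line[n] == '\0' || isspace((unsigned char) line[n]));
}

/* Classifies one line of a loop body. Any '@' or '$' is treated as
   string substitution: a crude assumption that also rejects accessors. */
int hiperf_check_line(const char *line)
{
    assert(line != NULL);
    if (strchr(line, '@') != NULL || strchr(line, '$') != NULL) {
        return HP_SUBST;
    }
    if (first_word_is(line, "delete")) {
        return HP_DELETE;
    }
    if (first_word_is(line, "rename")) {
        return HP_RENAME;
    }
    return HP_OK;
}

/* Returns the index of the first offending line, or -1 if the
   whole body is acceptable under the hypothetical hi-perf rules. */
int hiperf_check_body(const char *const *lines, int n)
{
    for (int i = 0; i < n; i++) {
        if (hiperf_check_line(lines[i]) != HP_OK) {
            return i;
        }
    }
    return -1;
}
```

With such a check the error would be flagged once, when the loop is read, rather than on each iteration.
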
Allin