On Sat, 26 Apr 2014, Andreï | Андрей Викторович wrote:
 I noticed that the behaviour of a script depends heavily on how the loop
 variable is addressed, that is, on whether the dollar sign is used.
 Consider the following file [...] 
[I'm pushing the example down to insert my comments here.]
The issue raised here is the difference in execution speed between 
variants of a hansl script in which the value of a loop index (say, j) is 
accessed (variant 1) as simply "j" or (variant 2) in the dollar-form, 
"$j". Andrei remarks that it "seems more correct" to use
"$j", yet it 
turns out that this slows things down substantially.
Short answer: Never use the dollar-form for a loop index unless you really 
need string substitution. It is always "more correct" to use the plain 
variable name if possible. Suppose the current value of the index j is 5. 
Then if you put plain "j" into a command or function call you get the 
numerical value 5 when the line is evaluated, while if you use "$j" the 
string "5" is substituted for "$j" in the line in question before it is 
evaluated (a "macro", so to speak).
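Here's a trivial illustration of the two routes (just a sketch; both 
assignments produce the same numbers, only the mechanism differs):
<hansl>
loop j=1..3 --quiet
    scalar a = 2*j    # the numerical value of j is used directly
    scalar b = 2*$j   # the text "1", "2", "3" is pasted in, then evaluated
    printf "a = %g, b = %g\n", a, b
endloop
</hansl>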
Consider one of your calls:
   y = cum(mnormal(T, 1))+5*j
What you want here is just the numerical value of j; it would be a 
roundabout procedure to substitute the string representation of j's
value into the line of hansl first, and then evaluate the line.
And here's why it slows things down substantially. In the context of loops 
we attempt to "compile" (in a limited sense) assignments so they'll run 
faster. But if we find that a given line contains string substitution we 
abandon the attempt, since in principle the final form of the line could 
change in an arbitrary, unknown way from one loop iteration to another.
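If you want to see the effect in isolation, something along these lines 
should do it (the timings will of course vary by machine; the gap between 
the two loops is the point):
<hansl>
set stopwatch
loop i=1..100000 --quiet
    scalar x = sqrt(i)    # no string substitution: the line can be "compiled"
endloop
printf "plain index: %f s\n", $stopwatch
loop i=1..100000 --quiet
    scalar x = sqrt($i)   # string substitution: re-parsed at each iteration
endloop
printf "dollar form: %f s\n", $stopwatch
</hansl>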
Briefly, here's a case where you might want to use $j:
   series foo_$j = <some expression that depends on j>
This gives you a way of defining an arbitrary number of series foo_1, 
foo_2, and so on, in a loop, something which could not be done 
conveniently in any other way at present (although we're working on a 
better mechanism).
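For concreteness, a minimal self-contained sketch of that usage:
<hansl>
nulldata 50
loop j=1..3 --quiet
    series foo_$j = normal() * j   # creates foo_1, foo_2, foo_3
endloop
print foo_1 foo_2 foo_3 --byobs
</hansl>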
Allin Cottrell
 <hansl>
 # Code for DFtaumu.inp
 # Dickey-Fuller's tau_mu distribution
 set stopwatch
 nulldata 10000
 scalar T = 100
 scalar tranches=10
 scalar iters=10000
 matrix y = zeros(T,1)
 matrix DFtaumu = zeros(iters,1)
 matrix RES
 matrix VARB
 loop j=1..tranches --quiet
    loop i=1..iters --quiet
        y = cum(mnormal(T, 1))+5*j # <--- That is the one!
         DFtaumu[i] = (mols(y[3:], ones(T-2,1)~y[2:T-1], &RES, &VARB)[2]-1)/VARB[2,2]^0.5
    endloop
    sprintf fn "%02d", $j
     mwrite(DFtaumu, "DFTM-@fn.mat")
    printf "Tranche %d of %d: %f s elapsed\n", $j, tranches, $stopwatch
 endloop
 </hansl>
 I am studying the influence of the order of magnitude of the constant on
 the DF \tau_\mu distribution. The numbers are relatively small, for
 demonstration purposes. When I run it through gretlcli or gretlcli-mpi, the
 average running time is 0.2 s per pass of the *j* loop (one tranche).
 gretlcli-mpi DFtaumu.inp
 However, it seems more correct to use *$j* instead of just *j* in the
 loop, so I modified the 14th line by adding a $:
        y = cum(mnormal(T, 1))+5*$j
 It seems more correct, yet performance fell abruptly: the average running
 time increased by 50%. When the sample size is increased to *nulldata
 100000* and *scalar iters=100000*, the average execution time for one
 tranche increases from 2.01 seconds (without the $ sign) to 2.85 seconds
 (with the $ sign).
 Why does such a slight change cause such a tremendous fall in performance,
 with seemingly no effect on the properties of the output? (I compared the
 distributions of DFtaumu and found no difference.) What kind of further
 optimisation and improvement should be undertaken in order to avoid
 such pitfalls?