Right now I'm very pressed for time, but I'm putting my test rig at
http://ricardo.ecn.wfu.edu/~cottrell/testing/test-gretl.tar.xz
so anyone can take a look. It exercises almost 20000 scripts; the testing mechanism uses "make" and shell scripts. Each directory contains "output" and "newout" subdirectories. The basic idea is that "make" in a given directory populates "newout" with fresh output files, then runs "diff" on "newout" against "output" (as well as reporting any failures). So "output" is supposed to contain "known good" results.
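For anyone who wants the gist before unpacking the tarball, here is a rough shell sketch of that per-directory cycle. It is not the rig's actual code: the function name run_dir_tests, the ".inp" and ".out" file naming, and the way the runner command is passed in are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the per-directory cycle: regenerate results into
# newout/, then diff them against the known-good results in output/.
# Usage (assumed): run_dir_tests DIR RUNNER [RUNNER-ARGS...]

run_dir_tests() {
    if [ $# -lt 2 ]; then
        echo "usage: run_dir_tests DIR RUNNER [ARGS...]" >&2
        return 2
    fi
    dir=$1
    shift                       # the remaining arguments form the runner command
    failures=0
    mkdir -p "$dir/newout"

    for script in "$dir"/*.inp; do
        [ -e "$script" ] || continue          # directory has no scripts
        name=$(basename "$script" .inp)
        # Run one script, capturing stdout and stderr as its new output.
        if ! "$@" "$script" > "$dir/newout/$name.out" 2>&1; then
            echo "FAILED: $name"
            failures=$((failures + 1))
        fi
    done

    # Compare known-good results with the fresh ones.
    if diff -r "$dir/output" "$dir/newout"; then
        echo "no differences"
    else
        echo "differences found"
    fi
    echo "failures: $failures"
}
```

Something like "run_dir_tests . gretlcli -b" would be the obvious way to call it, though that runner invocation is my guess rather than what the Makefiles in the tarball necessarily do.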
You can run everything by typing "make test-all" in the top-level directory. This may take a while; it will produce a composite diff of all new output against all previous output.
Anyone running this on their own system should first go into the "bin" directory (under "test-gretl") and edit the file named "sitevars".