On Sat, 12 Dec 2020, Sven Schreiber wrote:
On 12.12.2020 at 02:30, Allin Cottrell wrote:
> On Fri, 11 Dec 2020, Sven Schreiber wrote:
>> BTW, I also observed that the display/printing of the string within
>> gretl in the output window was fine. The phenomenon only occurred when
>> saving to a file.
>
> OK, I'll take a look at that.
Here's a slight extension of the previous script, where newtest.txt has
the contents: "no\nsleep\ntill\nBrooklyn", and according to what I see
in the SciTE editor it has CRLF line endings. Counting CRLF as two
characters there are 25 chars in the file (ignoring EOF).
<hansl>
s = readfile("newtest.txt")
eval nelem(s) # gives 25
eval strlen(s) # 25
set verbose off
outfile newtest2.inp
print s
end outfile
s2 = readfile("newtest2.inp")
eval nelem(s2) # 30
eval strlen(s2) # 30
</hansl>
The file gained 5 characters, and closer inspection shows this content:
noCRCRLFsleepCRCRLFtillCRCRLFBrooklynCRLF
This time there's no email involved, just gretl I think!
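The arithmetic above is consistent with a text-mode write on Windows, where every LF byte going out is expanded to CR LF while an existing CR passes through untouched. The following is only a sketch of that mechanism, not code from gretl: the function name text_mode_write is hypothetical, and it assumes that "print s" appends a single LF to the string.

```c
#include <stddef.h>
#include <string.h>

/* Simulate what a Windows text-mode stream does on output: every LF
   byte is written as CR LF, and any CR already present in the data
   passes through unchanged. Returns the number of bytes placed in dst
   (not counting the terminating NUL). dst must have room for
   2 * strlen(src) + 1 bytes. Hypothetical helper, for illustration. */
size_t text_mode_write(const char *src, char *dst)
{
    size_t n = 0;

    for (; *src != '\0'; src++) {
        if (*src == '\n') {
            dst[n++] = '\r';   /* the CR inserted by the translation */
        }
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

Fed the 25 bytes of newtest.txt plus one trailing LF (26 bytes in all), this produces 30 bytes: each of the four LFs picks up an extra CR, which gives CR CR LF wherever the input already had CR LF.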
This whole business is an annoyance and stumbling-block for coders
almost as severe as Microsoft's persistence with an unholy mixture
of 8-bit "system codepage" and UTF-16. They persist in using two
bytes to indicate what everyone else indicates in one byte (start a
new line), because in the stone age this required two operations of
a teletype machine: crank the paper up a line, and return the
print-head to the left.
Getting this "right" in a cross-platform program is a horror-show.
That said, it might now be better in current git and snapshot.
The keyword is "binary mode": the "b" flag in the mode string passed
to the C library function fopen() for reading or writing. It does
nothing except when you're handling text files on Windows. But when
should you use it, and when not?
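For what it's worth, here is a minimal sketch of the binary-mode case, using only standard C library calls. The helper names write_verbatim and count_bytes are made up for illustration, and this is not a claim about how gretl handles files. It shows one possible answer to the question: when a string already carries its own line endings, writing it with "wb" leaves them alone.

```c
#include <stdio.h>
#include <string.h>

/* Write the bytes of s to path exactly as given. The "b" in the mode
   string disables newline translation on Windows and has no effect on
   POSIX systems. Returns 0 on success, -1 on failure.
   Hypothetical helper, for illustration only. */
int write_verbatim(const char *path, const char *s)
{
    FILE *fp = fopen(path, "wb");
    size_t len = strlen(s);
    int err = 0;

    if (fp == NULL) {
        return -1;
    }
    if (fwrite(s, 1, len, fp) != len) {
        err = -1;
    }
    if (fclose(fp) != 0) {
        err = -1;
    }
    return err;
}

/* Count the bytes in path, reading in binary mode so that CR LF is
   seen as two bytes. Returns -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long n = 0;

    if (fp == NULL) {
        return -1;
    }
    while (fgetc(fp) != EOF) {
        n++;
    }
    fclose(fp);
    return n;
}
```

Writing the 25-byte string "no\r\nsleep\r\ntill\r\nBrooklyn" with write_verbatim() and then calling count_bytes() on the result should give 25 on any platform, since neither call translates line endings.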
Allin