Re: [Gretl-users] R (foreign language): non-Ascii chars in gretl.loadmat on Windows

Tuesday, 28 August 2018

On 28.08.2018 at 04:06, Allin Cottrell wrote:
...

 Sorry to go on about this, but actually I now see why it _might_ not 
 be considered a bug. The UTF-16 sequence corresponding to "Anastasia" 
 in Greek letters contains no embedded nul byte, since each of the 
 Greek letters requires 2 non-empty bytes for its representation. But 
 the appended ASCII characters will each be represented by a single 
 "active" byte followed by a nul. (UTF-16 requires at least two bytes 
 for each character, and pads with nuls as needed.)

 So I think what R's error message is trying to say is that the result 
 of conversion doesn't qualify as a string, where "string" means a 
 sequence of bytes _terminated_ by a nul byte.

But wouldn't that imply that R considers all UTF-16 strings as invalid 
as long as there are some "simple" characters in there? If that's the 
case, it would very much defeat the purpose of Unicode being a superset 
of more restrictive encodings. So it still sounds like a bug, no?

cheers,
sven
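
For readers who want to see the byte layout discussed above, here is a minimal sketch in Python (not R or gretl). It assumes little-endian UTF-16 without a byte-order mark, and the appended "abc" is only an illustrative stand-in for the ASCII characters mentioned in the quoted message. It shows the bytes only and says nothing about how R itself handles the conversion.

```python
# "Anastasia" in Greek letters, written with escapes so the code points are explicit.
greek = "\u0391\u03bd\u03b1\u03c3\u03c4\u03b1\u03c3\u03af\u03b1"
# Hypothetical ASCII suffix, standing in for the "appended ASCII characters".
mixed = greek + "abc"


def utf16le_bytes(text):
    """Encode text as UTF-16 little-endian without a byte-order mark."""
    return text.encode("utf-16-le")


greek_bytes = utf16le_bytes(greek)
mixed_bytes = utf16le_bytes(mixed)
utf8_bytes = mixed.encode("utf-8")

# Each Greek letter occupies two non-zero bytes, e.g. U+0391 becomes 91 03.
print(greek_bytes.hex(" "))
# Each ASCII letter becomes one "active" byte followed by 00, e.g. 61 00.
print(mixed_bytes.hex(" "))

# Membership test on a bytes object: is there any zero byte in the sequence?
print("nul in Greek-only UTF-16:", 0 in greek_bytes)
print("nul in mixed UTF-16:     ", 0 in mixed_bytes)
print("nul in mixed UTF-8:      ", 0 in utf8_bytes)
```

The Greek-only sequence has no zero byte, while the mixed one has a zero byte after every ASCII letter. The same mixed text encoded as UTF-8 contains no zero byte at all, so code that treats a nul as the end of a string can handle it.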
