On 28.08.2018 at 04:06, Allin Cottrell wrote:
Sorry to go on about this, but actually I now see why it _might_
not be considered a bug. The UTF-16 sequence corresponding to
"Anastasia" in Greek letters contains no embedded nul byte, since
each of the Greek letters requires 2 non-empty bytes for its
representation. But the appended ASCII characters will each be
represented by a single "active" byte followed by a nul. (UTF-16
requires at least two bytes for each character, and pads with nuls
as needed.)
So I think what R's error message is trying to say is that the
result of conversion doesn't qualify as a string, where "string"
means a sequence of bytes _terminated_ by a nul byte.
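The byte layout described above can be checked with a small Python sketch. It is only an illustration under stated assumptions: the Greek spelling of the name, the ASCII suffix "abc", and the little-endian, BOM-free encoding are all picked for the example, and the helper nul_positions is hypothetical. It says nothing about what R does internally.

```python
# Assumptions: the exact Greek spelling and the suffix "abc" are
# illustrative; "utf-16-le" (little-endian, no BOM) is chosen so that
# byte offsets are easy to read.
greek = "\u0391\u03bd\u03b1\u03c3\u03c4\u03b1\u03c3\u03af\u03b1"  # "Anastasia" in Greek letters
suffix = "abc"


def nul_positions(text, encoding="utf-16-le"):
    """Return the offsets of all zero bytes in the encoded text."""
    data = text.encode(encoding)
    return [i for i, b in enumerate(data) if b == 0]


# Every Greek letter here encodes to two non-zero bytes.
greek_nuls = nul_positions(greek)

# Each appended ASCII character contributes one zero byte.
mixed_nuls = nul_positions(greek + suffix)

# A reader that treats the bytes as a nul-terminated string
# stops at the first zero byte.
mixed_bytes = (greek + suffix).encode("utf-16-le")
c_view = mixed_bytes[:mixed_bytes.index(0)]

print(greek_nuls)
print(mixed_nuls)
print(len(mixed_bytes), len(c_view))
```

With these inputs the Greek-only text has no zero bytes, while the mixed text has one zero byte per ASCII character, so a nul-terminated reading ends right after the first byte of the first ASCII character.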
But wouldn't that imply that R considers any UTF-16 string invalid
as soon as it contains some "simple" (ASCII) characters? If
that's the case, it would very much defeat the purpose of Unicode
being a superset of more restrictive encodings. So it still sounds
like a bug, no?
cheers,
sven