On 11.03.18 at 15:42, Allin Cottrell wrote:
On Sun, 11 Mar 2018, Sven Schreiber wrote:
On checking the current code, I'm reminding myself that strlen()
does in fact count UTF-8 code points, so "a" and "<a-umlaut>"
will both give 1.
If we enable nelem() for strings, it would add value if it counted
bytes instead (internally, use the C library's strlen rather than
g_utf8_strlen). One thing we do internally when trying to get
certain strings to line up correctly under translation is compare
strlen and g_utf8_strlen. It's conceivable that package writers
(or at least addon writers) might also want to do
Well, I hope that nelem("") wouldn't return 1 then because of a weird C-style \0 byte?
I can see the case for value added, but wouldn't it be strange if strlen in hansl did
exactly _not_ what strlen in C does?
I'd be in favor of a strlen variant (optional switch...) that mimics C's strlen with byte
counting.
As for nelem then being redundant, the same is true for substr compared to string
slicing.
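
To make the byte vs. code point distinction concrete, here is a minimal sketch in C. Note that utf8_code_points() is a hypothetical helper written just for this illustration; it only stands in for what g_utf8_strlen does and is not gretl or GLib code. It assumes valid, NUL-terminated UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not gretl or GLib code): counts UTF-8 code
   points by skipping continuation bytes, i.e. bytes of the form
   10xxxxxx. Assumes s is valid, NUL-terminated UTF-8. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* every byte that is not a continuation byte starts a code point */
        if (((unsigned char) *s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}
```

For "a" both C's strlen and the helper give 1. For a-umlaut encoded as "\xC3\xA4", strlen gives 2 while the helper gives 1. For "" both give 0, since strlen does not count the terminating \0 byte.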
cheers
Sven