octave: scripts/strings/unicode2native.m comparison

comparison scripts/strings/unicode2native.m @ 32072:f7206b6577c2 stable

unicode2native: Fix conversion to UTF-16 (bug #64139).

* liboctave/wrappers/uniconv-wrappers.c (octave_u8_conv_to_encoding_intern):
Avoid appending a zero-byte when converting to UTF-* to avoid having to strip
a varying number of bytes after the conversion.

* scripts/strings/unicode2native.m: Add test for conversion to UTF-16.
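
The "varying number of bytes" mentioned in the commit message can be illustrated with a small sketch. This is not the code in uniconv-wrappers.c: Python's built-in codecs stand in for the C conversion routine, and the helper names `encoded_with_terminator` and `terminator_width` are hypothetical. It only shows that a NUL character converted together with the text takes a different number of bytes in each target encoding, so removing it afterwards needs an encoding-dependent length.

```python
# Illustration only: Python's codecs stand in for the C conversion routine.
TEXT = "abcde"


def encoded_with_terminator(text, encoding):
    """Encode text plus a trailing NUL character in one conversion call."""
    return (text + "\0").encode(encoding)


def terminator_width(encoding):
    """Number of bytes the NUL character occupies in the target encoding."""
    return len("\0".encode(encoding))


for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    raw = encoded_with_terminator(TEXT, enc)
    width = terminator_width(enc)
    # Stripping exactly `width` bytes recovers the payload; stripping a fixed
    # single byte would leave stray zero bytes for UTF-16 and UTF-32.
    assert raw[:-width] == TEXT.encode(enc)
    print(enc, "terminator bytes:", width, "payload bytes:", len(raw) - width)
```

Not appending the terminator in the first place, as the commit describes, sidesteps the question of how many bytes to strip.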

author	Markus Mützel <markus.muetzel@gmx.de>
date	Wed, 03 May 2023 20:43:36 +0200
parents	470134b3fc28
children	fab3e312a0b4

comparison

-:bc46d7c2768f
+:f7206b6577c2
 %!assert <*60480> (unicode2native (''), uint8 ([]))
 # short character arrays with invalid UTF-8
 %!testif HAVE_ICONV <*63930>
 %! assert (unicode2native (char (230), 'windows-1252'), uint8 (63));
+%!testif HAVE_ICONV <*63930>
 %! assert (unicode2native (char (249), 'windows-1252'), uint8 (63));
+%!testif HAVE_ICONV <*63930>
 %! assert (unicode2native (char (230:231), 'windows-1252'), uint8 ([63, 63]));
+%!testif HAVE_ICONV <*63930>
 %! assert (unicode2native (char (230:234), 'windows-1252'),
 %!         uint8 ([63, 63, 63, 63, 63]));
+%!testif HAVE_ICONV <*63930>
 %! assert (unicode2native (char ([230, 10]), 'windows-1252'),
 %!         uint8 ([63, 10]));
+# target encoding with surrogates larger than a byte
+%!testif HAVE_ICONV <*64139>
+%! assert (typecast (unicode2native ('abcde',
+%!                                   ['utf-16', nthargout(3, 'computer'), 'e']),
+%!                   'uint16'),
+%!         uint16 (97:101));
 %!error <Invalid call> unicode2native ()
 %!error <called with too many inputs> unicode2native ('a', 'ISO-8859-1', 'test')
 %!error <UTF8_STR must be a character vector> unicode2native (['ab'; 'cd'])
 %!error <UTF8_STR must be a character vector> unicode2native ({1 2 3 4})
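
The new UTF-16 test picks the byte order of the running machine (the third output of `computer` is the endianness, 'L' or 'B') and then reinterprets the converted bytes as 16-bit code units. A rough Python analogue of that check is sketched below. It is an assumption-laden illustration, not part of the changeset: `sys.byteorder` plays the role of `nthargout (3, 'computer')` and `struct.unpack` plays the role of `typecast (..., 'uint16')`.

```python
import struct
import sys

# Choose the UTF-16 variant that matches the host byte order, mirroring
# ['utf-16', nthargout(3, 'computer'), 'e'] in the Octave test.
encoding = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"

raw = "abcde".encode(encoding)

# Reinterpret the bytes as native-endian unsigned 16-bit integers.
# "=" selects native byte order with the standard 2-byte size for "H".
code_units = struct.unpack(f"={len(raw) // 2}H", raw)

# With no stray terminator bytes, the result is exactly the five code points.
assert code_units == tuple(range(97, 102))
```

If an extra zero byte were left at the end of the converted data, the byte count would be odd and the reinterpretation as 16-bit units would fail, which is the kind of regression the added test guards against.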
