Mercurial > octave
diff libinterp/corefcn/regexp.cc @ 27537:7dc31256c5e4
Document that regexp* functions need UTF-8 encoded input (bug #35910).
* regexp.cc (Fregexp, Fregexpi, Fregexprep): Document that the input strings
must be UTF-8 encoded.
* NEWS: Announce support for UTF-8 encoded strings in regexp* functions.
author | Markus Mützel <markus.muetzel@gmx.de> |
---|---|
date | Thu, 17 Oct 2019 20:41:03 +0200 |
parents | 94d490815aa8 |
children | 74173f04d2a3 |
--- a/libinterp/corefcn/regexp.cc Mon Oct 21 11:50:20 2019 -0400
+++ b/libinterp/corefcn/regexp.cc Thu Oct 17 20:41:03 2019 +0200
@@ -662,8 +662,8 @@
 @deftypefnx {} {[@dots{}] =} regexp (@var{str}, @var{pat}, "@var{opt1}", @dots{})
 Regular expression string matching.
 
-Search for @var{pat} in @var{str} and return the positions and substrings of
-any matches, or empty values if there are none.
+Search for @var{pat} in UTF-8 encoded @var{str} and return the positions and
+substrings of any matches, or empty values if there are none.
 
 The matched pattern @var{pat} can include any of the standard regex operators,
 including:
@@ -1195,9 +1195,9 @@
 
 Case insensitive regular expression string matching.
 
-Search for @var{pat} in @var{str} and return the positions and substrings of
-any matches, or empty values if there are none. @xref{XREFregexp,,regexp},
-for details on the syntax of the search pattern.
+Search for @var{pat} in UTF-8 encoded @var{str} and return the positions and
+substrings of any matches, or empty values if there are none.
+@xref{XREFregexp,,regexp}, for details on the syntax of the search pattern.
 @seealso{regexp}
 @end deftypefn */)
 {
@@ -1396,6 +1396,8 @@
 The pattern is a regular expression as documented for @code{regexp}.
 @xref{XREFregexp,,regexp}.
 
+All strings must be UTF-8 encoded.
+
 The replacement string may contain @code{$i}, which substitutes for the ith set
 of parentheses in the match string. For example,
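
To illustrate what the new "UTF-8 encoded" requirement means for a caller, here is a minimal sketch of a well-formedness check. It is not taken from regexp.cc and is not how Octave or its regex backend validates input. The helper name `is_valid_utf8` is hypothetical and exists only for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Octave: returns true if S is well-formed
// UTF-8.  Rejects stray or missing continuation bytes, overlong forms,
// surrogates, and code points above U+10FFFF.
bool is_valid_utf8 (const std::string& s)
{
  const std::size_t n = s.size ();
  std::size_t i = 0;

  while (i < n)
    {
      const unsigned char c = static_cast<unsigned char> (s[i]);
      std::size_t len = 0;       // total bytes in this sequence
      unsigned long cp = 0;      // decoded code point
      unsigned long min_cp = 0;  // smallest code point allowed for LEN

      if (c < 0x80)
        { len = 1; cp = c; min_cp = 0; }
      else if ((c & 0xE0) == 0xC0)
        { len = 2; cp = c & 0x1F; min_cp = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { len = 3; cp = c & 0x0F; min_cp = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { len = 4; cp = c & 0x07; min_cp = 0x10000; }
      else
        return false;  // stray continuation byte or invalid lead byte

      if (i + len > n)
        return false;  // sequence is cut off at the end of the string

      for (std::size_t k = 1; k < len; k++)
        {
          const unsigned char cc = static_cast<unsigned char> (s[i + k]);
          if ((cc & 0xC0) != 0x80)
            return false;  // expected a continuation byte (10xxxxxx)
          cp = (cp << 6) | (cc & 0x3F);
        }

      if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;  // overlong form, out of range, or surrogate

      i += len;
    }

  return true;
}
```

In this sketch the two-byte sequence `"\xC3\xBC"` (the UTF-8 encoding of "ü") is accepted, while the single byte `"\xFC"` (the same character in Latin-1) is rejected. Strings of the latter kind are what the updated docstrings warn against.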