diff doc/interpreter/strings.txi @ 3294:bfe1573bd2ae

[project @ 1999-10-19 10:06:07 by jwe]
author jwe
date Tue, 19 Oct 1999 10:08:42 +0000
parents
children 4f40efa995c1
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/interpreter/strings.txi	Tue Oct 19 10:08:42 1999 +0000
@@ -0,0 +1,484 @@
+@c Copyright (C) 1996, 1997 John W. Eaton
+@c This is part of the Octave manual.
+@c For copying conditions, see the file gpl.texi.
+
+@node Strings, Data Structures, Numeric Data Types, Top
+@chapter Strings
+@cindex strings
+@cindex character strings
+@opindex "
+@opindex '
+
+A @dfn{string constant} consists of a sequence of characters enclosed in
+either double-quote or single-quote marks.  For example, both of the
+following expressions
+
+@example
+@group
+"parrot"
+'parrot'
+@end group
+@end example
+
+@noindent
+represent the string whose contents are @samp{parrot}.  Strings in
+Octave can be of any length.
+
+Since the single-quote mark is also used for the transpose operator
+(@pxref{Arithmetic Ops}) but double-quote marks have no other purpose in
+Octave, it is best to use double-quote marks to denote strings.
+
+@c XXX FIXME XXX -- this is probably pretty confusing.
+
+@cindex escape sequence notation
+Some characters cannot be included literally in a string constant.  You
+represent them instead with @dfn{escape sequences}, which are character
+sequences beginning with a backslash (@samp{\}).
+
+One use of an escape sequence is to include a double-quote
+(single-quote) character in a string constant that has been defined
+using double-quote (single-quote) marks.  Since a plain double-quote
+would end the string, you must use @samp{\"} to represent a single
+double-quote character as a part of the string.  The backslash character
+itself is another character that cannot be included normally.  You must
+write @samp{\\} to put one backslash in the string.  Thus, the string
+whose contents are the two characters @samp{"\} may be written
+@code{"\"\\"} or @code{'"\\'}.  Similarly, the string whose contents are
+the two characters @samp{'\} may be written @code{'\'\\'} or @code{"'\\"}.
+
+Another use of backslash is to represent unprintable characters
+such as newline.  While there is nothing to stop you from writing most
+of these characters directly in a string constant, they may look ugly.
+
+Here is a table of all the escape sequences used in Octave.  They are
+the same as those used in the C programming language.
+
+@table @code
+@item \\
+Represents a literal backslash, @samp{\}.
+
+@item \"
+Represents a literal double-quote character, @samp{"}.
+
+@item \'
+Represents a literal single-quote character, @samp{'}.
+
+@item \a
+Represents the ``alert'' character, control-g, ASCII code 7.
+
+@item \b
+Represents a backspace, control-h, ASCII code 8.
+
+@item \f
+Represents a formfeed, control-l, ASCII code 12.
+
+@item \n
+Represents a newline, control-j, ASCII code 10.
+
+@item \r
+Represents a carriage return, control-m, ASCII code 13.
+
+@item \t
+Represents a horizontal tab, control-i, ASCII code 9.
+
+@item \v
+Represents a vertical tab, control-k, ASCII code 11.
+
+@c We don't do octal or hex this way yet.
+@c
+@c @item \@var{nnn}
+@c Represents the octal value @var{nnn}, where @var{nnn} are one to three
+@c digits between 0 and 7.  For example, the code for the ASCII ESC
+@c (escape) character is @samp{\033}.@refill
+@c 
+@c @item \x@var{hh}@dots{}
+@c Represents the hexadecimal value @var{hh}, where @var{hh} are hexadecimal
+@c digits (@samp{0} through @samp{9} and either @samp{A} through @samp{F} or
+@c @samp{a} through @samp{f}).  Like the same construct in @sc{ansi} C,
+@c the escape 
+@c sequence continues until the first non-hexadecimal digit is seen.  However,
+@c using more than two hexadecimal digits produces undefined results.  (The
+@c @samp{\x} escape sequence is not allowed in @sc{posix} @code{awk}.)@refill
+@end table
+
+Strings may be concatenated using the notation for defining matrices.
+For example, the expression
+
+@example
+[ "foo" , "bar" , "baz" ]
+@end example
+
+@noindent
+produces the string whose contents are @samp{foobarbaz}.  @xref{Numeric
+Data Types} for more information about creating matrices.
+
+@menu
+* Creating Strings::            
+* Searching and Replacing::     
+* String Conversions::          
+* Character Class Functions::   
+@end menu
+
+@node Creating Strings, Searching and Replacing, Strings, Strings
+@section Creating Strings
+
+@deftypefn {Function File} {} blanks (@var{n})
+Return a string of @var{n} blanks.
+@end deftypefn
+
+@deftypefn {Function File} {} int2str (@var{n})
+@deftypefnx {Function File} {} num2str (@var{x})
+Convert a number to a string.  These functions are not very flexible,
+but are provided for compatibility with @sc{Matlab}.  For better control
+over the results, use @code{sprintf} (@pxref{Formatted Output}).
+@end deftypefn
+
+@deftypefn {Built-in Function} {} setstr (@var{x})
+Convert a matrix to a string.  Each element of the matrix is converted
+to the corresponding ASCII 
+character.  For example,
+
+@example
+@group
+setstr ([97, 98, 99])
+     @result{} "abc"
+@end group
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} strcat (@var{s1}, @var{s2}, @dots{})
+Return a string containing all the arguments concatenated.  For example,
+
+@example
+@group
+s = [ "ab"; "cde" ];
+strcat (s, s, s)
+     @result{} "ab ab ab "
+        "cdecdecde"
+@end group
+@end example
+@end deftypefn
+
+@defvr {Built-in Variable} string_fill_char
+The value of this variable is used to pad all strings in a string matrix
+to the same length.  It should be a single character.  The default value
+is @code{" "} (a single space).  For example,
+
+@example
+@group
+string_fill_char = "X";
+[ "these"; "are"; "strings" ]
+     @result{} "theseXX"
+        "areXXXX"
+        "strings"
+@end group
+@end example
+@end defvr
+
+@deftypefn {Function File} {} str2mat (@var{s_1}, @dots{}, @var{s_n})
+Return a matrix containing the strings @var{s_1}, @dots{}, @var{s_n} as
+its rows.  Each string is padded with blanks in order to form a valid
+matrix.
+
+@strong{Note:}
+This function is modelled after @sc{Matlab}.  In Octave, you can create
+a matrix of strings by @code{[@var{s_1}; @dots{}; @var{s_n}]} even if
+the strings are not all the same length.
+@end deftypefn
+
+@deftypefn {Built-in Function} {} isstr (@var{a})
+Return 1 if @var{a} is a string.  Otherwise, return 0.
+@end deftypefn
+
+@node Searching and Replacing, String Conversions, Creating Strings, Strings
+@section Searching and Replacing
+
+@deftypefn {Function File} {} deblank (@var{s})
+Removes the trailing blanks from the string @var{s}. 
+@end deftypefn
+
+@deftypefn {Function File} {} findstr (@var{s}, @var{t}, @var{overlap})
+Return the vector of all positions in the longer of the two strings
+@var{s} and @var{t} where an occurrence of the shorter of the two starts.
+If the optional argument @var{overlap} is nonzero, the returned vector
+can include overlapping positions (this is the default).  For example,
+
+@example
+findstr ("ababab", "a")
+     @result{} [ 1, 3, 5 ]
+findstr ("abababa", "aba", 0)
+     @result{} [ 1, 5 ]
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} index (@var{s}, @var{t})
+Return the position of the first occurrence of the string @var{t} in the
+string @var{s}, or 0 if no occurrence is found.  For example,
+
+@example
+index ("Teststring", "t")
+     @result{} 4
+@end example
+
+@strong{Note:}  This function does not work for arrays of strings.
+@end deftypefn
+
+@deftypefn {Function File} {} rindex (@var{s}, @var{t})
+Return the position of the last occurrence of the string @var{t} in the
+string @var{s}, or 0 if no occurrence is found.  For example,
+
+@example
+rindex ("Teststring", "t")
+     @result{} 6
+@end example
+
+@strong{Note:}  This function does not work for arrays of strings.
+@end deftypefn
+
+@deftypefn {Function File} {} split (@var{s}, @var{t})
+Divides the string @var{s} into pieces separated by @var{t}, returning
+the result in a string array (padded with blanks to form a valid
+matrix).  For example,
+
+@example
+split ("Test string", "t")
+     @result{} "Tes "
+        " s  "
+        "ring"
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} strcmp (@var{s1}, @var{s2})
+Compares two strings, returning 1 if they are the same, and 0 otherwise.
+
+@strong{Note:}  For compatibility with @sc{Matlab}, Octave's strcmp
+function returns 1 if the strings are equal, and 0 otherwise.  This is
+just the opposite of the corresponding C library function.
+@end deftypefn
+
+@deftypefn {Function File} {} strrep (@var{s}, @var{x}, @var{y})
+Replaces all occurrences of the substring @var{x} of the string @var{s}
+with the string @var{y}.  For example,
+
+@example
+strrep ("This is a test string", "is", "&%$")
+     @result{} "Th&%$ &%$ a test string"
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} substr (@var{s}, @var{beg}, @var{len})
+Return the substring of @var{s} which starts at character number
+@var{beg} and is @var{len} characters long.  For example,
+
+@example
+substr ("This is a test string", 6, 9)
+     @result{} "is a test"
+@end example
+
+@quotation
+@strong{Note:}
+This function is patterned after AWK.  You can get the same result by
+@code{@var{s} (@var{beg} : (@var{beg} + @var{len} - 1))}.  
+@end quotation
+@end deftypefn
+
+@node String Conversions, Character Class Functions, Searching and Replacing, Strings
+@section String Conversions
+
+@deftypefn {Function File} {} bin2dec (@var{s})
+Return a decimal number corresponding to the binary number
+represented as a string of zeros and ones.  For example,
+
+@example
+bin2dec ("1110")
+     @result{} 14
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} dec2bin (@var{n})
+Return a binary number corresponding the nonnegative decimal number
+@var{n}, as a string of ones and zeros.  For example,
+
+@example
+dec2bin (14)
+     @result{} "1110"
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} dec2hex (@var{n})
+Return the hexadecimal number corresponding to the nonnegative decimal
+number @var{n}, as a string.  For example,
+
+@example
+dec2hex (2748)
+     @result{} "abc"
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} hex2dec (@var{s})
+Return the decimal number corresponding to the hexadecimal number stored
+in the string @var{s}.  For example,
+
+@example
+hex2dec ("12B")
+     @result{} 299
+hex2dec ("12b")
+     @result{} 299
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} str2num (@var{s})
+Convert the string @var{s} to a number.
+@end deftypefn
+
+@deftypefn {Function File} {} toascii (@var{s})
+Return ASCII representation of @var{s} in a matrix.  For example,
+
+@example
+@group
+toascii ("ASCII")
+     @result{} [ 65, 83, 67, 73, 73 ]
+@end group
+
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} tolower (@var{s})
+Return a copy of the string @var{s}, with each upper-case character
+replaced by the corresponding lower-case one; nonalphabetic characters
+are left unchanged.  For example,
+
+@example
+tolower ("MiXeD cAsE 123")
+     @result{} "mixed case 123"
+@end example
+@end deftypefn
+
+@deftypefn {Function File} {} toupper (@var{s})
+Return a copy of the string @var{s}, with each  lower-case character
+replaced by the corresponding upper-case one; nonalphabetic characters
+are left unchanged.  For example,
+
+@example
+@group
+toupper ("MiXeD cAsE 123")
+     @result{} "MIXED CASE 123"
+@end group
+@end example
+@end deftypefn
+
+@deftypefn {Built-in Function} {} undo_string_escapes (@var{s})
+Converts special characters in strings back to their escaped forms.  For
+example, the expression
+
+@example
+bell = "\a";
+@end example
+
+@noindent
+assigns the value of the alert character (control-g, ASCII code 7) to
+the string variable @code{bell}.  If this string is printed, the
+system will ring the terminal bell (if it is possible).  This is
+normally the desired outcome.  However, sometimes it is useful to be
+able to print the original representation of the string, with the
+special characters replaced by their escape sequences.  For example,
+
+@example
+octave:13> undo_string_escapes (bell)
+ans = \a
+@end example
+
+@noindent
+replaces the unprintable alert character with its printable
+representation.
+@end deftypefn
+
+@defvr {Built-in Variable} implicit_num_to_str_ok
+If the value of @code{implicit_num_to_str_ok} is nonzero, implicit
+conversions of numbers to their ASCII character equivalents are
+allowed when strings are constructed using a mixture of strings and
+numbers in matrix notation.  Otherwise, an error message is printed and
+control is returned to the top level. The default value is 0.  For
+example,
+
+@example
+@group
+[ "f", 111, 111 ]
+     @result{} "foo"
+@end group
+@end example
+@end defvr
+
+@defvr {Built-in Variable} implicit_str_to_num_ok
+If the value of @code{implicit_str_to_num_ok} is nonzero, implicit
+conversions of strings to their numeric ASCII equivalents are allowed.
+Otherwise, an error message is printed and control is returned to the
+top level.  The default value is 0.
+@end defvr
+
+@node Character Class Functions,  , String Conversions, Strings
+@section Character Class Functions
+
+Octave also provides the following character class test functions
+patterned after the functions in the standard C library.  They all
+operate on string arrays and return matrices of zeros and ones.
+Elements that are nonzero indicate that the condition was true for the
+corresponding character in the string array.  For example,
+
+@example
+@group
+isalpha ("!Q@@WERT^Y&")
+     @result{} [ 0, 1, 0, 1, 1, 1, 1, 0, 1, 0 ]
+@end group
+@end example
+
+@deftypefn {Mapping Function} {} isalnum (@var{s})
+Return 1 for characters that are letters or digits (@code{isalpha
+(@var{a})} or @code{isdigit (@var{})} is true).
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isalpha (@var{s})
+Return true for characters that are letters (@code{isupper (@var{a})}
+or @code{islower (@var{})} is true). 
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isascii (@var{s})
+Return 1 for characters that are ASCII (in the range 0 to 127 decimal).
+@end deftypefn
+
+@deftypefn {Mapping Function} {} iscntrl (@var{s})
+Return 1 for control characters.
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isdigit (@var{s})
+Return 1 for characters that are decimal digits.
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isgraph (@var{s})
+Return 1 for printable characters (but not the space character).
+@end deftypefn
+
+@deftypefn {Mapping Function} {} islower (@var{s})
+Return 1 for characters that are lower case letters.
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isprint (@var{s})
+Return 1 for printable characters (including the space character).
+@end deftypefn
+
+@deftypefn {Mapping Function} {} ispunct (@var{s})
+Return 1 for punctuation characters.
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isspace (@var{s})
+Return 1 for whitespace characters (space, formfeed, newline,
+carriage return, tab, and vertical tab).
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isupper (@var{s})
+Return 1 for upper case letters.
+@end deftypefn
+
+@deftypefn {Mapping Function} {} isxdigit (@var{s})
+Return 1 for characters that are hexadecimal digits.
+@end deftypefn