annotate doc/interpreter/strings.txi @ 7984:bbaa5d7d0143
Some documentation updates
author | David Bateman <dbateman@free.fr> |
---|---|
date | Mon, 28 Jul 2008 15:47:40 +0200 |
parents | b2fbb393a072 |
children | 6f2d95255911 |
rev | line source |
---|---|
7018 | 1 @c Copyright (C) 1996, 1997, 1999, 2000, 2002, 2003, 2004, 2005, |
2 @c 2006, 2007 John W. Eaton | |
3 @c | |
4 @c This file is part of Octave. | |
5 @c | |
6 @c Octave is free software; you can redistribute it and/or modify it | |
7 @c under the terms of the GNU General Public License as published by the | |
8 @c Free Software Foundation; either version 3 of the License, or (at | |
9 @c your option) any later version. | |
10 @c | |
11 @c Octave is distributed in the hope that it will be useful, but WITHOUT | |
12 @c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
13 @c FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
14 @c for more details. | |
15 @c | |
16 @c You should have received a copy of the GNU General Public License | |
17 @c along with Octave; see the file COPYING. If not, see | |
18 @c <http://www.gnu.org/licenses/>. | |
3294 | 19 |
4167 | 20 @node Strings |
3294 | 21 @chapter Strings |
22 @cindex strings | |
23 @cindex character strings | |
24 @opindex " | |
25 @opindex ' | |
26 | |
27 A @dfn{string constant} consists of a sequence of characters enclosed in | |
28 either double-quote or single-quote marks. For example, both of the | |
29 following expressions | |
30 | |
31 @example | |
32 @group | |
33 "parrot" | |
34 'parrot' | |
35 @end group | |
36 @end example | |
37 | |
38 @noindent | |
39 represent the string whose contents are @samp{parrot}. Strings in | |
40 Octave can be of any length. | |
41 | |
42 Since the single-quote mark is also used for the transpose operator | |
43 (@pxref{Arithmetic Ops}) but double-quote marks have no other purpose in | |
44 Octave, it is best to use double-quote marks to denote strings. | |
45 | |
6554 | 46 @cindex escape sequence notation |
47 In double-quoted strings, the backslash character is used to introduce | |
6623 | 48 @dfn{escape sequences} that represent other characters. For example, |
6554 | 49 @samp{\n} embeds a newline character in a double-quoted string and |
50 @samp{\"} embeds a double quote character. | |
3294 | 51 |
6554 | 52 In single-quoted strings, backslash is not a special character. |
53 | |
54 Here is an example showing the difference: | |
3294 | 55 |
6554 | 56 @example |
6556 | 57 @group |
6554 | 58 toascii ("\n") |
6570 | 59 @result{} 10 |
6554 | 60 toascii ('\n') |
6570 | 61 @result{} [ 92 110 ] |
6556 | 62 @end group |
6554 | 63 @end example |
3294 | 64 |
6554 | 65 You may also insert a single quote character in a single-quoted string |
66 by using two single quote characters in succession. For example, | |
67 | |
68 @example | |
69 'I can''t escape' | |
6570 | 70 @result{} I can't escape |
6554 | 71 @end example |
3294 | 72 |
73 Here is a table of all the escape sequences used in Octave. They are | |
74 the same as those used in the C programming language. | |
75 | |
76 @table @code | |
77 @item \\ | |
78 Represents a literal backslash, @samp{\}. | |
79 | |
80 @item \" | |
81 Represents a literal double-quote character, @samp{"}. | |
82 | |
83 @item \' | |
84 Represents a literal single-quote character, @samp{'}. | |
85 | |
3893 | 86 @item \0 |
4946 | 87 Represents the ``nul'' character, control-@@, ASCII code 0. |
3893 | 88 |
3294 | 89 @item \a |
90 Represents the ``alert'' character, control-g, ASCII code 7. | |
91 | |
92 @item \b | |
93 Represents a backspace, control-h, ASCII code 8. | |
94 | |
95 @item \f | |
96 Represents a formfeed, control-l, ASCII code 12. | |
97 | |
98 @item \n | |
99 Represents a newline, control-j, ASCII code 10. | |
100 | |
101 @item \r | |
102 Represents a carriage return, control-m, ASCII code 13. | |
103 | |
104 @item \t | |
105 Represents a horizontal tab, control-i, ASCII code 9. | |
106 | |
107 @item \v | |
108 Represents a vertical tab, control-k, ASCII code 11. | |
109 | |
110 @c We don't do octal or hex this way yet. | |
111 @c | |
112 @c @item \@var{nnn} | |
113 @c Represents the octal value @var{nnn}, where @var{nnn} are one to three | |
114 @c digits between 0 and 7. For example, the code for the ASCII ESC | |
115 @c (escape) character is @samp{\033}.@refill | |
116 @c | |
117 @c @item \x@var{hh}@dots{} | |
118 @c Represents the hexadecimal value @var{hh}, where @var{hh} are hexadecimal | |
119 @c digits (@samp{0} through @samp{9} and either @samp{A} through @samp{F} or | |
120 @c @samp{a} through @samp{f}). Like the same construct in @sc{ansi} C, | |
121 @c the escape | |
122 @c sequence continues until the first non-hexadecimal digit is seen. However, | |
123 @c using more than two hexadecimal digits produces undefined results. (The | |
124 @c @samp{\x} escape sequence is not allowed in @sc{posix} @code{awk}.)@refill | |
125 @end table | |
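
As a quick check of the table above, the following sketch converts a few escape sequences to their numeric codes. It uses @code{double} for the conversion; the variable names are illustrative only.

```octave
# Double-quoted string: each escape sequence becomes one character.
codes = double ("\a\b\t\n\v\f\r");

# Single-quoted string: the backslash is kept as an ordinary character,
# so the result has two elements (backslash and the letter n).
raw = double ('\n');

disp (codes)
disp (raw)
```

The first call yields the codes 7 through 13 in order, matching the table entries for alert, backspace, tab, newline, vertical tab, formfeed, and carriage return.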
126 | |
127 Strings may be concatenated using the notation for defining matrices. | |
128 For example, the expression | |
129 | |
130 @example | |
131 [ "foo" , "bar" , "baz" ] | |
132 @end example | |
133 | |
134 @noindent | |
135 produces the string whose contents are @samp{foobarbaz}. @xref{Numeric | |
3402 | 136 Data Types}, for more information about creating matrices. |
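
Because concatenation uses ordinary matrix notation, the result can be inspected like any other row vector. A minimal sketch, with illustrative variable names:

```octave
s = [ "foo", "bar", "baz" ];   # horizontal concatenation of three strings
n = length (s);                # number of characters in the result
t = [ s, "!" ];                # an existing string can be extended the same way

disp (s)
disp (n)
disp (t)
```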
3294 | 137 |
138 @menu | |
6624 | 139 * Creating Strings:: |
140 * Comparing Strings:: | |
141 * Manipulating Strings:: | |
3294 | 142 * String Conversions:: |
143 * Character Class Functions:: | |
144 @end menu | |
145 | |
4167 | 146 @node Creating Strings |
3294 | 147 @section Creating Strings |
148 | |
6623 | 149 The easiest way to create a string is, as illustrated in the introduction, |
150 to enclose text in double-quotes or single-quotes. It is, however, | |
151 possible to create a string without actually writing any text. The | |
152 function @code{blanks} creates a string of a given length consisting | |
153 only of blank characters (ASCII code 32). | |
154 | |
3361 | 155 @DOCSTRING(blanks) |
3294 | 156 |
6623 | 157 The string representation used by Octave is an array of characters, so |
158 the result of @code{blanks(10)} is actually a row vector of length 10 | |
159 containing the value 32 in all places. This lends itself to the obvious | |
160 generalisation to character matrices. Using a matrix of characters, it | |
161 is possible to represent a collection of same-length strings in one | |
162 variable. The convention used in Octave is that each row in a | |
163 character matrix is a separate string, but letting each column represent | |
164 a string is equally possible. | |
165 | |
166 The easiest way to create a character matrix is to put several strings | |
167 together into a matrix. | |
168 | |
169 @example | |
170 collection = [ "String #1"; "String #2" ]; | |
171 @end example | |
172 | |
173 @noindent | |
174 This creates a 2-by-9 character matrix. | |
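
The following sketch shows the row-per-string convention in practice, and confirms that @code{blanks} returns characters with code 32. The variable names are illustrative only.

```octave
collection = [ "String #1"; "String #2" ];

dims   = size (collection);     # number of rows and columns
second = collection(2, :);      # the second row is the second string
codes  = double (blanks (3));   # numeric codes of three blank characters

disp (dims)
disp (second)
disp (codes)
```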
175 | |
176 One relevant question is, what happens when a character matrix is | |
177 created from strings of different length. The answer is that Octave | |
178 puts blank characters at the end of strings shorter than the longest | |
179 string. A character other than the blank can be chosen as padding with | |
180 the @code{string_fill_char} function, but the padding itself exposes a | |
181 limitation of character matrices: they cannot represent strings of | |
182 different lengths. The solution is to use a cell | |
183 array of strings, which is described in @ref{Cell Arrays of Strings}. | |
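
The padding behaviour can be seen with @code{char}, which builds a character matrix from strings of different lengths. This is a minimal sketch; the variable names are illustrative, and @code{deblank} is used only to show that the padding consists of trailing blanks.

```octave
names = char ("one", "three");   # rows are padded to a common length
dims  = size (names);            # 2 rows, 5 columns
first = names(1, :);             # "one" followed by two blanks
tidy  = deblank (first);         # trailing blanks removed again

disp (dims)
printf ("[%s]\n", first);
printf ("[%s]\n", tidy);
```

Note that @code{deblank} cannot tell padding from blanks that were part of the original string, which is one reason a cell array is the safer choice.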
184 | |
4358 | 185 @DOCSTRING(char) |
186 | |
3361 | 187 @DOCSTRING(strcat) |
3294 | 188 |
6502 | 189 @DOCSTRING(strvcat) |
190 | |
191 @DOCSTRING(strtrunc) | |
192 | |
3361 | 193 @DOCSTRING(string_fill_char) |
3294 | 194 |
3361 | 195 @DOCSTRING(str2mat) |
3294 | 196 |
4535 | 197 @DOCSTRING(ischar) |
198 | |
6502 | 199 @DOCSTRING(mat2str) |
200 | |
201 @DOCSTRING(num2str) | |
3294 | 202 |
6623 | 203 @DOCSTRING(int2str) |
204 | |
205 @node Comparing Strings | |
206 @section Comparing Strings | |
207 | |
208 Since a string is a character array, comparisons between strings work | |
209 element by element, as the following example shows. | |
210 | |
211 @example | |
212 GNU = "GNU's Not UNIX"; | |
213 spaces = (GNU == " ") | |
7031 | 214 @result{} spaces = |
215 0 0 0 0 0 1 0 0 0 1 0 0 0 0 | |
6623 | 216 @end example |
217 | |
218 @noindent | |
219 To determine if two strings are identical it is therefore necessary | |
220 to use the @code{strcmp} or @code{strncmp} functions. Similar | |
7001 | 221 functions exist for doing case-insensitive comparisons. |
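
A short sketch contrasting the element-by-element operator with the comparison functions documented below. The variable names are illustrative only.

```octave
a = "hello";
b = "Hello";

elementwise = (a == b);                 # one result per character
same        = strcmp (a, b);            # whole-string test, case-sensitive
same_nocase = strcmpi (a, b);           # whole-string test, ignoring case
same_prefix = strncmp (a, "help", 3);   # compares only the first 3 characters

disp (elementwise)
disp ([same, same_nocase, same_prefix])
```

The @code{==} operator also requires both strings to have the same length, whereas @code{strcmp} simply reports a mismatch.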
6623 | 222 |
223 @DOCSTRING(strcmp) | |
224 | |
225 @DOCSTRING(strcmpi) | |
226 | |
227 @DOCSTRING(strncmp) | |
228 | |
229 @DOCSTRING(strncmpi) | |
230 | |
7984 | 231 @DOCSTRING(validstring) |
232 | |
6623 | 233 @node Manipulating Strings |
234 @section Manipulating Strings | |
235 | |
236 Octave supports a wide range of functions for manipulating strings. | |
237 Since a string is just a matrix, simple manipulations can be accomplished | |
238 using standard operators. The following example shows how to replace | |
239 all blank characters with underscores. | |
240 | |
241 @example | |
7081 | 242 quote = ... |
243 "First things first, but not necessarily in that order"; | |
6623 | 244 quote( quote == " " ) = "_" |
7081 | 245 @result{} quote = |
246 First_things_first,_but_not_necessarily_in_that_order | |
6623 | 247 @end example |
248 | |
249 For more complex manipulations, such as searching, replacing, and | |
7001 | 250 general regular expressions, the following functions come with Octave. |
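
As an illustration, the sketch below applies three of the functions documented in this section to the same string. The variable names are illustrative only.

```octave
s = "hello world";

replaced = strrep (s, "world", "Octave");   # literal substring replacement
where    = strfind (s, "o");                # start index of every match
masked   = regexprep (s, "[aeiou]", "*");   # regular-expression replacement

disp (replaced)
disp (where)
disp (masked)
```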
3294 | 251 |
3361 | 252 @DOCSTRING(deblank) |
3294 | 253 |
3361 | 254 @DOCSTRING(findstr) |
3294 | 255 |
3361 | 256 @DOCSTRING(index) |
3294 | 257 |
3361 | 258 @DOCSTRING(rindex) |
3294 | 259 |
6502 | 260 @DOCSTRING(strfind) |
261 | |
262 @DOCSTRING(strmatch) | |
263 | |
264 @DOCSTRING(strtok) | |
265 | |
3361 | 266 @DOCSTRING(split) |
3294 | 267 |
3361 | 268 @DOCSTRING(strrep) |
3294 | 269 |
3361 | 270 @DOCSTRING(substr) |
3294 | 271 |
5582 | 272 @DOCSTRING(regexp) |
273 | |
274 @DOCSTRING(regexpi) | |
275 | |
6549 | 276 @DOCSTRING(regexprep) |
277 | |
7984 | 278 @DOCSTRING(regexptranslate) |
279 | |
4167 | 280 @node String Conversions |
3294 | 281 @section String Conversions |
282 | |
6623 | 283 Octave supports various kinds of conversions between strings and |
284 numbers. As an example, it is possible to convert a string containing | |
285 a hexadecimal number to a floating point number. | |
286 | |
287 @example | |
288 hex2dec ("FF") | |
289 @result{} ans = 255 | |
290 @end example | |
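
The conversions work in both directions. The following sketch pairs a few of the functions documented below; the variable names are illustrative only.

```octave
n = hex2dec ("FF");        # hexadecimal string to number
h = dec2hex (255);         # number to hexadecimal string
b = dec2bin (5);           # number to binary string
m = bin2dec ("101");       # binary string to number
x = str2double ("3.5");    # decimal string to double
t = num2str (42);          # number to decimal string

printf ("%d %s %s %d %g %s\n", n, h, b, m, x, t);
```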
291 | |
3361 | 292 @DOCSTRING(bin2dec) |
3294 | 293 |
3361 | 294 @DOCSTRING(dec2bin) |
3294 | 295 |
3361 | 296 @DOCSTRING(dec2hex) |
3294 | 297 |
3361 | 298 @DOCSTRING(hex2dec) |
3294 | 299 |
3920 | 300 @DOCSTRING(dec2base) |
301 | |
302 @DOCSTRING(base2dec) | |
303 | |
7639 | 304 @DOCSTRING(num2hex) |
305 | |
306 @DOCSTRING(hex2num) | |
307 | |
6623 | 308 @DOCSTRING(str2double) |
3920 | 309 |
6623 | 310 @DOCSTRING(strjust) |
6502 | 311 |
3361 | 312 @DOCSTRING(str2num) |
3294 | 313 |
3361 | 314 @DOCSTRING(toascii) |
3294 | 315 |
3361 | 316 @DOCSTRING(tolower) |
3294 | 317 |
3361 | 318 @DOCSTRING(toupper) |
3294 | 319 |
3428 | 320 @DOCSTRING(do_string_escapes) |
321 | |
3361 | 322 @DOCSTRING(undo_string_escapes) |
3294 | 323 |
4167 | 324 @node Character Class Functions |
3294 | 325 @section Character Class Functions |
326 | |
327 Octave also provides the following character class test functions | |
328 patterned after the functions in the standard C library. They all | |
329 operate on string arrays and return matrices of zeros and ones. | |
330 Elements that are nonzero indicate that the condition was true for the | |
331 corresponding character in the string array. For example, | |
332 | |
333 @example | |
334 @group | |
335 isalpha ("!Q@@WERT^Y&") | |
336 @result{} [ 0, 1, 0, 1, 1, 1, 1, 0, 1, 0 ] | |
337 @end group | |
338 @end example | |
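
Because the results have one element per character, they can be used directly as masks for indexing. A minimal sketch, with illustrative variable names:

```octave
s = "a1 B2";

is_digit = isdigit (s);     # true where the character is 0-9
is_space = isspace (s);     # true where the character is whitespace
is_upper = isupper (s);     # true where the character is an uppercase letter

only_digits = s(is_digit);  # keep just the characters that are digits

disp (is_digit)
disp (only_digits)
```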
339 | |
3361 | 340 @DOCSTRING(isalnum) |
3294 | 341 |
3361 | 342 @DOCSTRING(isalpha) |
343 | |
344 @DOCSTRING(isascii) | |
3294 | 345 |
3361 | 346 @DOCSTRING(iscntrl) |
3294 | 347 |
3361 | 348 @DOCSTRING(isdigit) |
3294 | 349 |
3361 | 350 @DOCSTRING(isgraph) |
3294 | 351 |
6549 | 352 @DOCSTRING(isletter) |
353 | |
3361 | 354 @DOCSTRING(islower) |
3294 | 355 |
3361 | 356 @DOCSTRING(isprint) |
3294 | 357 |
3361 | 358 @DOCSTRING(ispunct) |
3294 | 359 |
3361 | 360 @DOCSTRING(isspace) |
3294 | 361 |
3361 | 362 @DOCSTRING(isupper) |
3294 | 363 |
3361 | 364 @DOCSTRING(isxdigit) |
7984 | 365 |
366 @DOCSTRING(isstrprop) | |