annotate scripts/strings/strsplit.m @ 16411:5be43435bd5b

Improve speed and backward compatibility for strsplit()

* scripts/strings/strsplit.m: Improve speed and backward compatibility.
* NEWS: Modify entry for strsplit() for Octave 3.8.x.
author Ben Abbott <bpabbott@mac.com>
date Tue, 02 Apr 2013 19:36:52 -0400
parents 1de4ec2a856d
children 03a28487fa9d
rev   line source
14138
72c96de7a403 maint: update copyright notices for 2012
John W. Eaton <jwe@octave.org>
parents: 13929
diff changeset
1 ## Copyright (C) 2009-2012 Jaroslav Hajek
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
2 ##
11104
2c356a35d7f5 fix copyright notices
John W. Eaton <jwe@octave.org>
parents: 8884
diff changeset
3 ## This file is part of Octave.
2c356a35d7f5 fix copyright notices
John W. Eaton <jwe@octave.org>
parents: 8884
diff changeset
4 ##
2c356a35d7f5 fix copyright notices
John W. Eaton <jwe@octave.org>
parents: 8884
diff changeset
5 ## Octave is free software; you can redistribute it and/or modify it
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
6 ## under the terms of the GNU General Public License as published by
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
7 ## the Free Software Foundation; either version 3 of the License, or (at
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
8 ## your option) any later version.
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
9 ##
11104
2c356a35d7f5 fix copyright notices
John W. Eaton <jwe@octave.org>
parents: 8884
diff changeset
10 ## Octave is distributed in the hope that it will be useful, but
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
11 ## WITHOUT ANY WARRANTY; without even the implied warranty of
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
12 ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
13 ## General Public License for more details.
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
14 ##
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
15 ## You should have received a copy of the GNU General Public License
11104
2c356a35d7f5 fix copyright notices
John W. Eaton <jwe@octave.org>
parents: 8884
diff changeset
16 ## along with Octave; see the file COPYING. If not, see
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
17 ## <http://www.gnu.org/licenses/>.
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
18
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
19 ## -*- texinfo -*-
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
20 ## @deftypefn {Function File} {[@var{cstr}] =} strsplit (@var{s})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
21 ## @deftypefnx {Function File} {[@var{cstr}] =} strsplit (@var{s}, @var{del})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
22 ## @deftypefnx {Function File} {[@var{cstr}] =} strsplit (@var{s}, @var{del}, @var{collapsedelimiters})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
23 ## @deftypefnx {Function File} {[@var{cstr}] =} strsplit (@dots{}, @var{name}, @var{value})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
24 ## @deftypefnx {Function File} {[@var{cstr}, @var{matches}] =} strsplit (@dots{})
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
25 ## Split the string @var{s} using the delimiters specified by @var{del}
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
26 ## and return a cell array of strings. For a single delimiter, @var{del}
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
27 ## may be a string, or a scalar cell-string. For multiple delimiters,
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
28 ## @var{del} must be a cell-string array. Unless @var{collapsedelimiters} is
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
29 ## specified to be @var{false}, consecutive delimiters are collapsed into one.
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
30 ##
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
31 ## The second output, @var{matches}, returns the delimiters which were matched
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
32 ## in the original string. The matched delimiters are unaffected by
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
33 ## @var{collapsedelimiters}.
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
34 ##
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
35 ## Example:
13929
9cae456085c2 Grammarcheck of documentation before 3.6.0 release.
Rik <octave@nomad.inbox5.com>
parents: 13776
diff changeset
36 ##
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
37 ## @example
13929
9cae456085c2 Grammarcheck of documentation before 3.6.0 release.
Rik <octave@nomad.inbox5.com>
parents: 13776
diff changeset
38 ## @group
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
39 ## strsplit ("a b c")
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
40 ## @result{}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
41 ## @{
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
42 ## [1,1] = a
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
43 ## [1,2] = b
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
44 ## [1,3] = c
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
45 ## @}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
46 ##
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
47 ## strsplit ("a,b,c", ",")
14327
4d917a6a858b doc: Use Octave coding conventions in @example blocks of docstrings.
Rik <octave@nomad.inbox5.com>
parents: 14138
diff changeset
48 ## @result{}
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
49 ## @{
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
50 ## [1,1] = a
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
51 ## [1,2] = b
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
52 ## [1,3] = c
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
53 ## @}
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
54 ##
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
55 ## strsplit ("a foo b,bar c", @{"\s", "foo", "bar"@})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
56 ## @result{}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
57 ## @{
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
58 ## [1,1] = a
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
59 ## [1,2] = b
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
60 ## [1,3] = c
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
61 ## @}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
62 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
63 ## strsplit ("a,,b, c", @{",", " "@}, false)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
64 ## @result{}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
65 ## @{
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
66 ## [1,1] = a
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
67 ## [1,2] =
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
68 ## [1,3] = b
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
69 ## [1,4] =
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
70 ## [1,5] = c
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
71 ## @}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
72 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
73 ## @end group
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
74 ## @end example
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
75 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
76 ## Supported @var{name}/@var{value} pair arguments are:
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
77 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
78 ## @itemize
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
79 ## @item @var{collapsedelimiters} may take the value of @var{true} or @var{false}
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
80 ## with the default being @var{true}.
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
81 ## @item @var{delimitertype} may take the value of @code{legacy},
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
82 ## @code{simple} or @code{regularexpression}.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
83 ## If @var{delimitertype} is equal to @code{legacy}, each individual
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
84 ## character of @var{del} is used to split the input.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
85 ## If the specified delimiters are single characters, the default
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
86 ## @var{delimitertype} is @code{legacy}. Otherwise the default
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
87 ## @var{delimitertype} is @code{simple}.
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
88 ## @end itemize
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
89 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
90 ## Example:
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
91 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
92 ## @example
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
93 ## @group
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
94 ## strsplit ("a foo b,bar c", ",|\\s|foo|bar", "delimitertype", "regularexpression")
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
95 ## @result{}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
96 ## @{
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
97 ## [1,1] = a
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
98 ## [1,2] = b
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
99 ## [1,3] = c
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
100 ## @}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
101 ##
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
102 ## strsplit ("a,,b, c", "[, ]", false, "delimitertype", "regularexpression")
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
103 ## @result{}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
104 ## @{
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
105 ## [1,1] = a
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
106 ## [1,2] =
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
107 ## [1,3] = b
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
108 ## [1,4] =
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
109 ## [1,5] = c
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
110 ## @}
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
111 ##
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
112 ## strsplit ("a,,b, c", ", ", false, "delimitertype", "legacy")
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
113 ## @result{}
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
114 ## @{
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
115 ## [1,1] = a
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
116 ## [1,2] =
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
117 ## [1,3] = b
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
118 ## [1,4] =
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
119 ## [1,5] = c
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
120 ## @}
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
121 ##
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
122 ## strsplit ("a,\t,b, c", @{',', '\s'@}, "delimitertype", "regularexpression")
14327
4d917a6a858b doc: Use Octave coding conventions in @example blocks of docstrings.
Rik <octave@nomad.inbox5.com>
parents: 14138
diff changeset
123 ## @result{}
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
124 ## @{
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
125 ## [1,1] = a
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
126 ## [1,2] = b
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
127 ## [1,3] = c
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
128 ## @}
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
129 ## @end group
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
130 ## @end example
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
131 ##
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
132 ## @seealso{strjoin, strtok, regexp}
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
133 ## @end deftypefn
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
134
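The name/value form documented in the docstring above can be tried in a short session. This is a minimal sketch, assuming a strsplit() that behaves as the docstring describes (consecutive delimiters are collapsed unless "collapsedelimiters" is false, and the second output lists the matched delimiters). The variable names are illustrative only and do not appear in strsplit.m.

```octave
## Default behavior: the two adjacent commas are collapsed into one
## delimiter, so no empty field is produced.
collapsed = strsplit ("a,,b,c", ",");

## Disable collapsing through the name/value pair: the empty field
## between the two commas is kept.
kept = strsplit ("a,,b,c", ",", "collapsedelimiters", false);

## Request the second output to see which delimiters were matched.
[fields, matches] = strsplit ("a,b", ",");

disp (numel (collapsed));
disp (numel (kept));
disp (matches);
```

Passing the option by name rather than as a bare third argument keeps the call readable and does not depend on the positional shortcut handled by parseparams().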
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
135 function [result, matches] = strsplit (str, del, varargin)
8884
579de77acd90 strsplit.m: style fixes
John W. Eaton <jwe@octave.org>
parents: 8883
diff changeset
136
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
137 args.collapsedelimiters = true;
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
138 args.delimitertype = "default";
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
139
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
140 [reg, params] = parseparams (varargin);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
141
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
142 if (numel (reg) > 1)
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
143 print_usage ();
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
144 elseif (numel (reg) == 1)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
145 if (islogical (reg{1}) || isnumeric (reg{1}))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
146 args.collapsedelimiters = reg{1};
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
147 else
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
148 print_usage ();
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
149 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
150 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
151 fields = fieldnames (args);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
152 for n = 1:2:numel(params)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
153 if (any (strcmpi (params{n}, fields)))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
154 args.(lower(params{n})) = params{n+1};
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
155 elseif (ischar (params{n}))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
156 error ("strsplit:invalid_parameter_name",
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
157 sprintf ("strsplit: Invalid parameter name, `%s'", params{n}))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
158 else
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
159 print_usage ();
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
160 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
161 endfor
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
162
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
163 if (strcmpi (args.delimitertype, "default"))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
164 if (nargin == 1 || numel (del) == 1
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
165 || (nargin > 1 && (islogical (del) || isnumeric (del)))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
166 || iscell (del) && all (cellfun (@numel, del) < 2))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
167 ## For single character delimiters, default to "legacy"
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
168 args.delimitertype = "legacy";
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
169 else
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
170 ## For multi-character delimiters, default to "simple"
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
171 args.delimitertype = "simple";
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
172 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
173 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
174
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
175 ## Save the length of the "delimitertype" parameter
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
176 length_deltype = numel (args.delimitertype);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
177
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
178 if (nargin == 1 || (nargin > 1 && (islogical (del) || isnumeric (del))))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
179 if (nargin > 1)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
180 ## Second input is the "collapsedelimiters" parameter
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
181 args.collapsedelimiters = del;
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
182 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
183 ## Set proper default for the delimiter type
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
184 if (strncmpi (args.delimitertype, "simple", numel (args.delimitertype)))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
185 del = {" ","\f","\n","\r","\t","\v"};
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
186 elseif (strncmpi (args.delimitertype, "legacy", numel (args.delimitertype)))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
187 del = " \f\n\r\t\v";
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
188 else
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
189 del = "\\s";
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
190 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
191 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
192
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
193 if (nargin < 1)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
194 print_usage ();
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
195 elseif (! ischar (str) || (! ischar (del) && ! iscellstr (del)))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
196 error ("strsplit: S and DEL must be string values");
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
197 elseif (! isscalar (args.collapsedelimiters))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
198 error ("strsplit: COLLAPSEDELIMITERS must be a scalar value");
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
199 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
200
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
201 if (strncmpi (args.delimitertype, "simple", length_deltype))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
202 if (iscellstr (del))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
203 del = cellfun (@(x) regexp2simple (x, false), del, "uniformoutput",
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
204 false);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
205 else
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
206 del = regexp2simple (del, false);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
207 endif
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
208 endif
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
209
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
210 if (rows (str) > 1)
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
211 tmp = char (del(1));
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
212 str = [str, repmat(tmp,rows(str),1)];
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
213 str = reshape (str.', 1, numel (str));
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
214 str(end-numel(tmp)+1:end) = [];
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
215 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
216
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
217 if (isempty (str))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
218 result = {str};
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
219 elseif (strncmpi (args.delimitertype, "legacy", length_deltype))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
220 ## Conventional splitting is preserved for its speed.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
221 ##
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
222 if (! ischar (del))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
223 if (iscell (del) && all (cellfun (@numel, del) < 2))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
224 del = [del{:}];
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
225 else
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
226 error ("strsplit:legacy_delimiter_must_be_char",
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
227 "%s %s", "strsplit: for DELIMITERTYPE = ""legacy"" ",
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
228 "DEL must be a string, or a cell array of scalar character elements.")
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
229 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
230 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
231 ## Split str according to delimiter.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
232 if (isscalar (del))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
233 ## Single separator
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
234 idx = find (str == del);
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
235 else
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
236 ## Multiple separators
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
237 idx = strchr (str, del);
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
238 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
239
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
240 ## Get substring lengths.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
241 if (isempty (idx))
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
242 strlens = length (str);
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
243 else
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
244 strlens = [idx(1)-1, diff(idx)-1, numel(str)-idx(end)];
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
245 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
246 if (nargout > 1)
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
247 ## Grab the separators
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
248 matches = num2cell (str(idx)(:)).';
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
249 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
250 ## Remove separators.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
251 str(idx) = [];
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
252 if (args.collapsedelimiters)
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
253 ## Omit zero lengths.
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
254 strlens = strlens(strlens != 0);
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
255 endif
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
256
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
257 ## Convert!
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
258 result = mat2cell (str, 1, strlens);
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
259 elseif (strncmpi (args.delimitertype, "regularexpression", length_deltype)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
260 || strncmpi (args.delimitertype, "simple", length_deltype))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
261 if (iscellstr (del))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
262 del = sprintf ('%s|', del{:});
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
263 del(end) = [];
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
264 endif
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
265 [result, ~, ~, ~, matches] = regexp (str, del, "split");
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
266 if (args.collapsedelimiters)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
267 result(cellfun (@isempty, result)) = [];
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
268 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
269 if (strncmpi (args.delimitertype, "simple", length_deltype))
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
270 matches = cellfun (@(x) regexp2simple (x, true), matches,
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
271 "uniformoutput", false);
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
272 endif
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
273 else
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
274 error ("strsplit:invalid_delimitertype",
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
275 "strsplit: Invalid DELIMITERTYPE")
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
276 endif
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
277 endfunction
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
278
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
279 function str = regexp2simple (str, reverse = false)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
280 rep = {'\', '[', ']', '{', '}', '$', '^', '(', ')', '*', '+', '.', '?', '|'};
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
281 if (reverse)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
282 ## backslash must go last
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
283 for r = numel(rep):-1:1
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
284 str = strrep (str, [char(92), rep{r}], rep{r});
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
285 endfor
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
286 else
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
287 ## backslash must go first
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
288 for r = 1:numel(rep)
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
289 str = strrep (str, rep{r}, [char(92), rep{r}]);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
290 endfor
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
291 endif
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
292 endfunction
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
293
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
294 % Mimic the old strsplit()
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
295 %!assert (cellfun (@numel, strsplit (["a,b,c";"1,2 "], ",")), [1 1 2 1 4])
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
296
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
297 %!shared str
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
298 %! str = "The rain in Spain stays mainly in the plain.";
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
299 % Split on all whitespace.
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
300 %!assert (strsplit (str), {"The", "rain", "in", "Spain", "stays", ...
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
301 %! "mainly", "in", "the", "plain."})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
302 % Split on "ain".
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
303 %!assert (strsplit (str, "ain"), {"The r", " in Sp", " stays m", ...
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
304 %! "ly in the pl", "."})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
305 % Split on " " and "ain" (treating multiple delimiters as one).
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
306 %!test
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
307 %! s = strsplit (str, '\s|ain', true, "delimitertype", "r");
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
308 %! assert (s, {"The", "r", "in", "Sp", "stays", "m", "ly", "in", "the", "pl", "."})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
309 %!test
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
310 %! s = strsplit (str, "\\s|ain", true, "delimitertype", "r");
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
311 %! assert (s, {"The", "r", "in", "Sp", "stays", "m", "ly", "in", "the", "pl", "."})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
312 %!test
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
313 %! [s, m] = strsplit (str, {"\\s", "ain"}, true, "delimitertype", "r");
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
314 %! assert (s, {"The", "r", "in", "Sp", "stays", "m", "ly", "in", "the", "pl", "."})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
315 %! assert (m, {" ", "ain", " ", " ", "ain", " ", " ", "ain", " ", " ", " ", "ain"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
316 % Split on " " and "ain", and treat multiple delimiters separately.
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
317 %!test
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
318 %! [s, m] = strsplit (str, {" ", "ain"}, "collapsedelimiters", false);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
319 %! assert (s, {"The", "r", "", "in", "Sp", "", "stays", "m", "ly", "in", "the", "pl", "."})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
320 %! assert (m, {" ", "ain", " ", " ", "ain", " ", " ", "ain", " ", " ", " ", "ain"})
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
321
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
322 %!assert (strsplit ("road to hell"), {"road", "to", "hell"})
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
323 %!assert (strsplit ("road to hell", " "), {"road", "to", "hell"})
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
324 %!assert (strsplit ("road to^hell", {" ","^"}), {"road", "to", "hell"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
325 %!assert (strsplit ("road to--hell", {" ","-"}, true), {"road", "to", "hell"})
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
326 %!assert (strsplit (["a,bc,,de"], ",", false, "delimitertype", "s"), {"a", "bc", "", "de"})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
327 %!assert (strsplit (["a,bc,,de"], ",", false), {"a", "bc", char(ones(1,0)), "de"})
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
328 %!assert (strsplit (["a,bc,de"], ",", true), {"a", "bc", "de"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
329 %!assert (strsplit (["a,bc,de"], {","," "}, true), {"a", "bc", "de"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
330 %!test
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
331 %! [s, m] = strsplit ("hello \t world", 1);
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
332 %! assert (s, {"hello", "world"});
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
333 %! assert (m, {" ", "\t", " "});
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
334
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
335 %!assert (strsplit ("road to hell", " ", "delimitertype", "r"), {"road", "to", "hell"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
336 %!assert (strsplit ("road to^hell", '\^| ', "delimitertype", "r"), {"road", "to", "hell"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
337 %!assert (strsplit ("road to^hell", "[ ^]", "delimitertype", "r"), {"road", "to", "hell"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
338 %!assert (strsplit ("road to--hell", "[ -]", false, "delimitertype", "r"), {"road", "", "", "to", "", "hell"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
339 %!assert (strsplit (["a,bc,de"], ",", "delimitertype", "r"), {"a", "bc", "de"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
340 %!assert (strsplit (["a,bc,,de"], ",", false, "delimitertype", "r"), {"a", "bc", "", "de"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
341 %!assert (strsplit (["a,bc,de"], ",", true, "delimitertype", "r"), {"a", "bc", "de"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
342 %!assert (strsplit (["a,bc,de"], "[, ]", true, "delimitertype", "r"), {"a", "bc", "de"})
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
343 %!assert (strsplit ("hello \t world", 1, "delimitertype", "r"), {"hello", "world"});
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
344
16411
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
345 %!assert (strsplit ("road to hell", " ", false, "delimitertype", "l"), {"road", "to", "hell"})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
346 %!assert (strsplit ("road to^hell", " ^", false, "delimitertype", "l"), {"road", "to", "hell"})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
347 %!assert (strsplit ("road to--hell", " -", true, "delimitertype", "l"), {"road", "to", "hell"})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
348 %!assert (strsplit (["a,bc";",de"], ",", false, "delimitertype", "l"), {"a", "bc", char(ones(1,0)), "de "})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
349 %!assert (strsplit (["a,bc";",de"], ",", true, "delimitertype", "l"), {"a", "bc", "de "})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
350 %!assert (strsplit (["a,bc";",de"], ", ", true, "delimitertype", "l"), {"a", "bc", "de"})
5be43435bd5b Improve speed and backward compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 16403
diff changeset
351
13701
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
352 %% Test input validation
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
353 %!error strsplit ()
46e68badedb8 strsplit.m: Expand to accept 2-D character arrays. Improve input validation.
Rik <octave@nomad.inbox5.com>
parents: 12915
diff changeset
354 %!error strsplit ("abc", "b", true, 4)
16403
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
355 %!error <S and DEL must be string values> strsplit (123, "b")
1de4ec2a856d Matlab compatibility for strsplit()
Ben Abbott <bpabbott@mac.com>
parents: 15521
diff changeset
356 %!error <COLLAPSEDELIMITERS must be a scalar value> strsplit ("abc", "def", ones (3,3))
8877
2c8b2399247b implement strsplit; deprecate split
Jaroslav Hajek <highegg@gmail.com>
parents:
diff changeset
357