annotate liboctave/regexp.cc @ 14024:fc9f204faea0

refactor regexp (bug #34440)

* liboctave/regexp.h, liboctave/regexp.cc: New files. Provide classes and functions for regular expressions. Adapted from src/DLD-FUNCTIONS/regexp.cc.
* regex-match.h, regex-match.cc: Delete.
* liboctave/Makefile.am (INCS, LIBOCTAVE_CXX_SOURCES): Update.
* variables.cc (name_matches_any_pattern): Use new regexp class.
* symtab.h (symbol_table::regexp_global_variables, symbol_table::do_clear_variable_regexp, symbol_table::do_regexp): Likewise.
* DLD-FUNCTIONS/regexp.cc (parse_options): New function. (octregexp, octcellregexp, octregexprep): Extract matching code for use in new regexp class. Use new regexp class to provide required functionality.
author John W. Eaton <jwe@octave.org>
date Sun, 11 Dec 2011 22:19:57 -0500
parents liboctave/regex-match.cc@12df7854fa7c
children 9867be070ee1
/*

Copyright (C) 2011 John W. Eaton
Copyright (C) 2005-2011 David Bateman
Copyright (C) 2002-2005 Paul Kienzle

This file is part of Octave.

Octave is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3 of the License, or (at your
option) any later version.

Octave is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with Octave; see the file COPYING. If not, see
<http://www.gnu.org/licenses/>.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <list>
#include <sstream>
#include <string>
#include <vector>

#include <pcre.h>

#include "Matrix.h"
#include "base-list.h"
#include "lo-error.h"
#include "oct-locbuf.h"
#include "quit.h"
#include "regexp.h"
#include "str-vec.h"

// Define the maximum number of retries for a pattern that possibly
// results in an infinite recursion.
#define PCRE_MATCHLIMIT_MAX 10

// FIXME -- should this be configurable?
#define MAXLOOKBEHIND 10

static bool lookbehind_warned = false;

// FIXME -- don't bother collecting and composing return values the user
// doesn't want.

void
regexp::free (void)
{
  if (data)
    pcre_free (static_cast<pcre *> (data));
}

void
regexp::compile_internal (void)
{
  // If we had a previously compiled pattern, release it.
  free ();

  size_t max_length = MAXLOOKBEHIND;

  size_t pos = 0;
  size_t new_pos;
  int inames = 0;
  std::ostringstream buf;

  while ((new_pos = pattern.find ("(?", pos)) != std::string::npos)
    {
      if (pattern.at (new_pos + 2) == '<'
          && !(pattern.at (new_pos + 3) == '='
               || pattern.at (new_pos + 3) == '!'))
        {
          // The syntax of named tokens in pcre is "(?P<name>...)" while
          // we need a syntax "(?<name>...)", so fix that here. Also an
          // expression like
          // "(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)"
          // should be perfectly legal, while pcre does not allow the same
          // named token name on both sides of the alternative. Also fix
          // that here by replacing name tokens by dummy names, and dealing
          // with the dummy names later.

          size_t tmp_pos = pattern.find_first_of ('>', new_pos);

          if (tmp_pos == std::string::npos)
            {
              (*current_liboctave_error_handler)
                ("regexp: syntax error in pattern");
              return;
            }

          std::string tmp_name =
            pattern.substr (new_pos+3, tmp_pos-new_pos-3);

          bool found = false;

          for (int i = 0; i < nnames; i++)
            {
              if (named_pats(i) == tmp_name)
                {
                  named_idx.resize (dim_vector (inames+1, 1));
                  named_idx(inames) = i;
                  found = true;
                  break;
                }
            }

          if (! found)
            {
              named_idx.resize (dim_vector (inames+1, 1));
              named_idx(inames) = nnames;
              named_pats.append (tmp_name);
              nnames++;
            }

          if (new_pos - pos > 0)
            buf << pattern.substr (pos, new_pos-pos);
          if (inames < 10)
            buf << "(?P<n00" << inames++;
          else if (inames < 100)
            buf << "(?P<n0" << inames++;
          else
            buf << "(?P<n" << inames++;

          pos = tmp_pos;
        }
      else if (pattern.at (new_pos + 2) == '<')
        {
          // Find lookbehind operators of arbitrary length (i.e., like
          // "(?<=[a-z]*)") and replace with a maximum length operator
          // as PCRE can not yet handle arbitrary length lookbehind
          // operators. Use MAXLOOKBEHIND as the maximum length to
          // avoid issues.

          int brackets = 1;
          size_t tmp_pos1 = new_pos + 2;
          size_t tmp_pos2 = tmp_pos1;

          // Stop before the end of the pattern; at () throws
          // std::out_of_range for an index equal to length ().
          while (tmp_pos1 < pattern.length () && brackets > 0)
            {
              char ch = pattern.at (tmp_pos1);

              if (ch == '(')
                brackets++;
              else if (ch == ')')
                {
                  if (brackets > 1)
                    tmp_pos2 = tmp_pos1;

                  brackets--;
                }

              tmp_pos1++;
            }

          if (brackets != 0)
            {
              buf << pattern.substr (pos, new_pos - pos) << "(?";
              pos = new_pos + 2;
            }
          else
            {
              size_t tmp_pos3 = pattern.find_first_of ("*+", tmp_pos2);

              if (tmp_pos3 != std::string::npos && tmp_pos3 < tmp_pos1)
                {
                  if (!lookbehind_warned)
                    {
                      lookbehind_warned = true;
                      (*current_liboctave_warning_handler)
                        ("%s: arbitrary length lookbehind patterns are only supported up to length %d",
                         who.c_str (), MAXLOOKBEHIND);
                    }

                  buf << pattern.substr (pos, new_pos - pos) << "(";

                  size_t i;

                  if (pattern.at (tmp_pos3) == '*')
                    i = 0;
                  else
                    i = 1;

                  for (; i < max_length + 1; i++)
                    {
                      buf << pattern.substr (new_pos, tmp_pos3 - new_pos)
                          << "{" << i << "}";
                      buf << pattern.substr (tmp_pos3 + 1,
                                             tmp_pos1 - tmp_pos3 - 1);
                      if (i != max_length)
                        buf << "|";
                    }
                  buf << ")";
                }
              else
                buf << pattern.substr (pos, tmp_pos1 - pos);

              pos = tmp_pos1;
            }
        }
      else
        {
          buf << pattern.substr (pos, new_pos - pos) << "(?";
          pos = new_pos + 2;
        }

    }

  buf << pattern.substr (pos);

  const char *err;
  int erroffset;
  std::string buf_str = buf.str ();

  int pcre_options
    = ((options.case_insensitive () ? PCRE_CASELESS : 0)
       | (options.dotexceptnewline () ? 0 : PCRE_DOTALL)
       | (options.lineanchors () ? PCRE_MULTILINE : 0)
       | (options.freespacing () ? PCRE_EXTENDED : 0));

  data = pcre_compile (buf_str.c_str (), pcre_options, &err, &erroffset, 0);

  if (! data)
    (*current_liboctave_error_handler)
      ("%s: %s at position %d of expression", who.c_str (),
       err, erroffset);
}
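
The two pattern rewrites performed by compile_internal can be tried outside Octave with the standalone sketch below. It is an illustration under stated assumptions, not the liboctave code: rename_named_tokens and expand_lookbehind are hypothetical helper names, the name list keeps one entry per token occurrence instead of the named_pats/named_idx bookkeeping, and an unterminated "(?<name" is copied through unchanged where the real function reports a syntax error.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of liboctave): replace each "(?<name>"
// by "(?P<nNNN>" and append the original name to NAMES, so that dummy
// name nNNN corresponds to names[NNN].
std::string
rename_named_tokens (const std::string& pattern,
                     std::vector<std::string>& names)
{
  std::ostringstream buf;
  std::size_t pos = 0;
  std::size_t open;

  while ((open = pattern.find ("(?<", pos)) != std::string::npos)
    {
      // "(?<=" and "(?<!" are lookbehind assertions, not named tokens.
      bool lookbehind
        = (open + 3 < pattern.size ()
           && (pattern[open+3] == '=' || pattern[open+3] == '!'));

      std::size_t close
        = lookbehind ? std::string::npos : pattern.find ('>', open);

      if (close == std::string::npos)
        {
          // Simplification: copy "(?<" unchanged and keep scanning.
          buf << pattern.substr (pos, open + 3 - pos);
          pos = open + 3;
          continue;
        }

      buf << pattern.substr (pos, open - pos) << "(?P<n"
          << std::setw (3) << std::setfill ('0') << names.size () << '>';

      names.push_back (pattern.substr (open + 3, close - open - 3));
      pos = close + 1;
    }

  buf << pattern.substr (pos);

  return buf.str ();
}

// Hypothetical helper: expand a lookbehind whose body ends in '*' or '+'
// into an alternation of fixed-length lookbehinds.  PREFIX is the text
// up to the quantifier, for example "(?<=[a-z]".
std::string
expand_lookbehind (const std::string& prefix, char quantifier,
                   std::size_t max_length)
{
  std::ostringstream buf;

  buf << '(';

  for (std::size_t i = (quantifier == '*' ? 0 : 1); i <= max_length; i++)
    {
      buf << prefix << '{' << i << "})";

      if (i != max_length)
        buf << '|';
    }

  buf << ')';

  return buf.str ();
}
```

For example, rename_named_tokens applied to "(?<first>\w+)\s+(?<last>\w+)" yields "(?P<n000>\w+)\s+(?P<n001>\w+)" and records the names "first" and "last", while expand_lookbehind ("(?<=[a-z]", '+', 2) yields "((?<=[a-z]{1})|(?<=[a-z]{2}))". As in compile_internal, the alternation is wrapped in an ordinary group.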

regexp::match_data
regexp::match (const std::string& buffer)
{
  regexp::match_data retval;

  std::list<regexp::match_element> lst;

  int subpatterns;
  int namecount;
  int nameentrysize;
  char *nametable;
  size_t idx = 0;

  pcre *re = static_cast <pcre *> (data);

  pcre_fullinfo (re, 0, PCRE_INFO_CAPTURECOUNT, &subpatterns);
  pcre_fullinfo (re, 0, PCRE_INFO_NAMECOUNT, &namecount);
  pcre_fullinfo (re, 0, PCRE_INFO_NAMEENTRYSIZE, &nameentrysize);
255 pcre_fullinfo (re, 0, PCRE_INFO_NAMETABLE, &nametable);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
256
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
257 OCTAVE_LOCAL_BUFFER (int, ovector, (subpatterns+1)*3);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
258 OCTAVE_LOCAL_BUFFER (int, nidx, namecount);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
259
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
260 for (int i = 0; i < namecount; i++)
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
261 {
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
262 // Each name table entry stores its subpattern index in the first two bytes, MSB first.
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
263 // Extract index.
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
264 nidx[i] = (static_cast<int> (nametable[i*nameentrysize])) << 8
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
265 | static_cast<int> (nametable[i*nameentrysize+1]);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
266 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
267
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
268 while (true)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
269 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
270 OCTAVE_QUIT;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
271
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
272 int matches = pcre_exec (re, 0, buffer.c_str (),
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
273 buffer.length (), idx,
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
274 (idx ? PCRE_NOTBOL : 0),
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
275 ovector, (subpatterns+1)*3);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
276
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
277 if (matches == PCRE_ERROR_MATCHLIMIT)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
278 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
279 // Try harder; start with default value for MATCH_LIMIT
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
280 // and increase it.
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
281 (*current_liboctave_warning_handler)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
282 ("your pattern caused PCRE to hit its MATCH_LIMIT; trying harder now, but this will be slow");
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
283
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
284 pcre_extra pe;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
285
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
286 pcre_config (PCRE_CONFIG_MATCH_LIMIT,
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
287 static_cast <void *> (&pe.match_limit));
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
288
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
289 pe.flags = PCRE_EXTRA_MATCH_LIMIT;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
290
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
291 int i = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
292 while (matches == PCRE_ERROR_MATCHLIMIT
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
293 && i++ < PCRE_MATCHLIMIT_MAX)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
294 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
295 OCTAVE_QUIT;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
296
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
297 pe.match_limit *= 10;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
298 matches = pcre_exec (re, &pe, buffer.c_str (),
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
299 buffer.length (), idx,
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
300 (idx ? PCRE_NOTBOL : 0),
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
301 ovector, (subpatterns+1)*3);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
302 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
303 }
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
304
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
305 if (matches < 0 && matches != PCRE_ERROR_NOMATCH)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
306 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
307 (*current_liboctave_error_handler)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
308 ("%s: internal error calling pcre_exec; error code from pcre_exec is %i",
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
309 who.c_str (), matches);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
310 return retval;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
311 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
312 else if (matches == PCRE_ERROR_NOMATCH)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
313 break;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
314 else if (ovector[1] <= ovector[0])
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
315 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
316 // Zero-length match; skip to the next character.
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
317 idx = ovector[0] + 1;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
318 if (idx < buffer.length ())
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
319 continue;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
320 else
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
321 break;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
322 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
323 else
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
324 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
325 int pos_match = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
326 Matrix token_extents (matches-1, 2);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
327
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
328 for (int i = 1; i < matches; i++)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
329 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
330 if (ovector[2*i] >= 0 && ovector[2*i+1] > 0
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
331 && (i == 1 || ovector[2*i] != ovector[2*i-2]
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
332 || ovector[2*i-1] != ovector[2*i+1])
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
333 && ovector[2*i] >= 0 && ovector[2*i+1] > 0)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
334 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
335 token_extents(pos_match,0) = double (ovector[2*i]+1);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
336 token_extents(pos_match++,1) = double (ovector[2*i+1]);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
337 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
338 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
339
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
340 token_extents.resize (pos_match, 2);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
341
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
342 double start = double (ovector[0]+1);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
343 double end = double (ovector[1]);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
344
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
345 const char **listptr;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
346 int status = pcre_get_substring_list (buffer.c_str (), ovector,
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
347 matches, &listptr);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
348
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
349 if (status == PCRE_ERROR_NOMEMORY)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
350 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
351 (*current_liboctave_error_handler)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
352 ("%s: cannot allocate memory in pcre_get_substring_list",
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
353 who.c_str ());
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
354 return retval;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
355 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
356
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
357 string_vector tokens (pos_match);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
358 string_vector named_tokens (nnames);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
359 int pos_offset = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
360 pos_match = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
361
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
362 for (int i = 1; i < matches; i++)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
363 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
364 if (ovector[2*i] >= 0 && ovector[2*i+1] > 0)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
365 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
366 if (i == 1 || ovector[2*i] != ovector[2*i-2]
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
367 || ovector[2*i-1] != ovector[2*i+1])
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
368 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
369 if (namecount > 0)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
370 named_tokens(named_idx(i-pos_offset-1)) =
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
371 std::string (*(listptr+nidx[i-pos_offset-1]));
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
372
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
373 tokens(pos_match++) = std::string (*(listptr+i));
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
374 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
375 else
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
376 pos_offset++;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
377 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
378 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
379
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
380 std::string match_string = std::string (*listptr);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
381
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
382 pcre_free_substring_list (listptr);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
383
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
384 regexp::match_element new_elem (named_tokens, tokens, match_string,
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
385 token_extents, start, end);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
386 lst.push_back (new_elem);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
387 idx = ovector[1];
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
388
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
389 if (options.once () || idx >= buffer.length ())
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
390 break;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
391 }
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
392 }
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
393
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
394 retval = regexp::match_data (lst, named_pats);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
395
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
396 return retval;
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
397 }
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
398
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
399 bool
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
400 regexp::is_match (const std::string& buffer)
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
401 {
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
402 regexp::match_data rx_lst = match (buffer);
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
403
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
404 regexp::match_data::const_iterator p = rx_lst.begin ();
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
405
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
406 std::string match_string = (p != rx_lst.end ()) ? p->match_string () : std::string ();
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
407
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
408 return ! match_string.empty ();
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
409 }
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
410
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
411 Array<bool>
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
412 regexp::is_match (const string_vector& buffer)
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
413 {
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
414 octave_idx_type len = buffer.length ();
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
415
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
416 Array<bool> retval (len, 1);
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
417
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
418 for (octave_idx_type i = 0; i < len; i++)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
419 retval(i) = is_match (buffer(i));
7779
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
420
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
421 return retval;
791231dac333 Add regexp matching to Fwho and Fclear
David Bateman <dbateman@free.fr>
parents:
diff changeset
422 }
14024
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
423
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
424 std::string
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
425 regexp::replace (const std::string& buffer, const std::string& replacement)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
426 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
427 std::string retval;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
428
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
429 // Identify replacement tokens; build a vector of group numbers in
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
430 // the replacement string so that we can quickly calculate the size
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
431 // of the replacement.
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
432
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
433 int tokens = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
434 for (size_t i=1; i < replacement.size (); i++)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
435 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
436 if (replacement[i-1] == '$' && isdigit (static_cast<unsigned char> (replacement[i])))
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
437 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
438 tokens++;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
439 i++;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
440 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
441 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
442 std::vector<int> token (tokens);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
443
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
444 int kk = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
445 for (size_t i = 1; i < replacement.size (); i++)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
446 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
447 if (replacement[i-1] == '$' && isdigit (static_cast<unsigned char> (replacement[i])))
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
448 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
449 token[kk++] = replacement[i]-'0';
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
450 i++;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
451 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
452 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
453
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
454 regexp::match_data rx_lst = match (buffer);
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
455
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
456 size_t sz = rx_lst.size ();
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
457
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
458 if (sz == 0)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
459 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
460 retval = buffer;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
461 return retval;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
462 }
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
463
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
464 std::string rep;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
465
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
466 if (tokens > 0)
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
467 {
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
468 // Determine replacement length
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
469 const size_t replen = replacement.size () - 2*tokens;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
470 int delta = 0;
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
471 regexp::match_data::const_iterator p = rx_lst.begin ();
fc9f204faea0 refactor regexp (bug #34440)
John W. Eaton <jwe@octave.org>
parents: 11586
diff changeset
      for (size_t i = 0; i < sz; i++)
        {
          OCTAVE_QUIT;

          double start = p->start ();
          double end = p->end ();

          const Matrix pairs (p->token_extents ());
          size_t pairlen = 0;
          for (int j = 0; j < tokens; j++)
            {
              if (token[j] == 0)
                pairlen += static_cast<size_t> (end - start) + 1;
              else if (token[j] <= pairs.rows ())
                pairlen += static_cast<size_t> (pairs(token[j]-1,1)
                                                - pairs(token[j]-1,0)) + 1;
            }
          delta += (static_cast<int> (replen + pairlen)
                    - static_cast<int> (end - start + 1));
          p++;
        }

      // Build replacement string
      rep.reserve (buffer.size () + delta);
      size_t from = 0;
      p = rx_lst.begin ();
      for (size_t i = 0; i < sz; i++)
        {
          OCTAVE_QUIT;

          double start = p->start ();
          double end = p->end ();

          const Matrix pairs (p->token_extents ());
          rep.append (&buffer[from], static_cast<size_t> (start - 1) - from);
          from = static_cast<size_t> (end - 1) + 1;

          for (size_t j = 1; j < replacement.size (); j++)
            {
              if (replacement[j-1] == '$' && isdigit (replacement[j]))
                {
                  int k = replacement[j] - '0';
                  if (k == 0)
                    {
                      // replace with entire match, which begins at
                      // offset start - 1 in the buffer
                      rep.append (&buffer[static_cast<size_t> (start - 1)],
                                  static_cast<size_t> (end - start) + 1);
                    }
                  else if (k <= pairs.rows ())
                    {
                      // replace with group capture
                      rep.append (&buffer[static_cast<size_t> (pairs(k-1,0)-1)],
                                  static_cast<size_t> (pairs(k-1,1)
                                                       - pairs(k-1,0)) + 1);
                    }
                  else
                    {
                      // replace with nothing
                    }
                  j++;
                }
              else
                rep.append (1, replacement[j-1]);

              if (j+1 == replacement.size ())
                rep.append (1, replacement[j]);
            }
          p++;
        }
      rep.append (&buffer[from], buffer.size () - from);
    }
  else
    {
      // Determine replacement length
      const size_t replen = replacement.size ();
      int delta = 0;
      regexp::match_data::const_iterator p = rx_lst.begin ();
      for (size_t i = 0; i < sz; i++)
        {
          OCTAVE_QUIT;
          delta += static_cast<int> (replen)
            - static_cast<int> (p->end () - p->start () + 1);
          p++;
        }

      // Build replacement string
      rep.reserve (buffer.size () + delta);
      size_t from = 0;
      p = rx_lst.begin ();
      for (size_t i = 0; i < sz; i++)
        {
          OCTAVE_QUIT;
          rep.append (&buffer[from],
                      static_cast<size_t> (p->start () - 1) - from);
          from = static_cast<size_t> (p->end () - 1) + 1;
          rep.append (replacement);
          p++;
        }
      rep.append (&buffer[from], buffer.size () - from);
    }

  retval = rep;
  return retval;
}
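
The token-free branch above uses a two-pass scheme: first sum the net change in length over all matches, then reserve the output once and copy the unmatched text and the replacement in turn. The following is a minimal standalone sketch of that idea, not the Octave implementation. The names replace_literal and extent_list are hypothetical, and the match list is assumed to be a plain vector of 1-based, inclusive (start, end) pairs, sorted and non-overlapping, standing in for regexp::match_data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the match list: 1-based, inclusive
// (start, end) extents, assumed sorted and non-overlapping.
typedef std::vector<std::pair<std::size_t, std::size_t> > extent_list;

std::string
replace_literal (const std::string& buffer, const extent_list& matches,
                 const std::string& replacement)
{
  // Pass 1: net change in length over all matches.
  long delta = 0;
  for (std::size_t i = 0; i < matches.size (); i++)
    {
      const std::size_t start = matches[i].first;
      const std::size_t end = matches[i].second;
      assert (start >= 1 && start <= end + 1 && end <= buffer.size ());
      delta += static_cast<long> (replacement.size ())
               - static_cast<long> (end + 1 - start);
    }

  // Reserve once so that the appends below do not reallocate.
  std::string rep;
  rep.reserve (static_cast<std::size_t>
               (static_cast<long> (buffer.size ()) + delta));

  // Pass 2: copy the text before each match, then the replacement.
  std::size_t from = 0;  // 0-based offset of the next unmatched character
  for (std::size_t i = 0; i < matches.size (); i++)
    {
      const std::size_t start = matches[i].first;
      assert (start - 1 >= from);  // extents must be in increasing order
      rep.append (buffer, from, start - 1 - from);
      rep.append (replacement);
      from = matches[i].second;  // 1-based end == 0-based offset just past it
    }

  // Copy whatever follows the last match.
  rep.append (buffer, from, std::string::npos);
  return rep;
}
```

For example, calling replace_literal ("hello world", extent_list{{1, 5}}, "bye") replaces the first five characters and yields "bye world". Computing the length up front is an optimization only; std::string would grow on demand and produce the same result without the reserve call.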