aregexec
Approximate String Match Positions
Determine positions of approximate string matches.
aregexec(pattern, text, max.distance = 0.1, costs = NULL, ignore.case = FALSE, fixed = FALSE, useBytes = FALSE)
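pattern | a non-empty character string or a character string containing a regular expression (for fixed = FALSE) to be matched. Coerced by as.character to a string if possible.
text | character vector where matches are sought. Coerced by as.character to a character vector if possible.
max.distance | maximum distance allowed for a match. See agrep.
costs | cost of transformations. See agrep.
ignore.case | a logical. If TRUE, case is ignored for computing the distances.
fixed | If TRUE, the pattern is matched literally (as is). Otherwise (default), it is matched as a regular expression.
useBytes | a logical. If TRUE comparisons are byte-by-byte rather than character-by-character.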
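As a rough illustration of how max.distance and ignore.case change which elements of text match, the sketch below reuses the vector from the examples for agrep. The helper has_match is a hypothetical convenience function, not part of R; it only checks whether the first reported position differs from -1.

```r
x <- c("1 lazy", "1", "1 LAZY")

# Hypothetical helper (not part of R): TRUE where a match was found.
has_match <- function(m) vapply(m, function(pos) pos[1] != -1, logical(1))

# Default max.distance = 0.1: too strict here, since turning "laysy"
# into "lazy" takes two edits.
strict <- has_match(aregexec("laysy", x))

# Allow up to two edits.
loose <- has_match(aregexec("laysy", x, max.distance = 2))

# Two edits, ignoring case, so that "LAZY" can match as well.
nocase <- has_match(aregexec("laysy", x, max.distance = 2, ignore.case = TRUE))

print(rbind(strict, loose, nocase))
```

Only the first element should match in the case-sensitive call with max.distance = 2, and the third should join it once ignore.case = TRUE is set.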
aregexec provides a different interface to approximate string matching than agrep (along the lines of the interfaces to exact string matching provided by regexec and grep).
Note that by default, agrep performs literal matches, whereas aregexec performs regular expression matches.
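The difference in defaults can be seen with a pattern that contains regular expression syntax. This is a minimal sketch; the pattern "la[a-z]+y" is chosen purely for illustration.

```r
x <- "1 lazy"
pat <- "la[a-z]+y"  # illustrative pattern, readable as a regex or as literal text

# agrep() is literal by default: the brackets and "+" would have to
# appear (approximately) in the text, which they do not.
literal_agrep <- agrep(pat, x, max.distance = 2)

# aregexec() treats the pattern as a regular expression by default,
# so "[a-z]+" can match the "z" in "lazy".
regex_pos <- aregexec(pat, x, max.distance = 2)

# fixed = TRUE makes aregexec() match literally as well.
literal_pos <- aregexec(pat, x, max.distance = 2, fixed = TRUE)

print(literal_agrep)
print(regex_pos[[1]])
print(literal_pos[[1]])
```

Read literally, the nine-character pattern is more than two edits away from any substring of the six-character text, so both literal calls should report no match.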
See agrep and adist for more information about approximate string matching and distances.
Comparisons are byte-by-byte if pattern or any element of text is marked as "bytes".
A list of the same length as text, each element of which is either -1 if there is no match, or a sequence of integers with the starting positions of the match and all substrings corresponding to parenthesized subexpressions of pattern, with attribute "match.length" an integer vector giving the lengths of the matches (or -1 for no match).
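The shape of the returned list can be inspected directly. The following sketch uses the same data as the examples and extracts the matched pieces by hand with substring; the variable names are illustrative only.

```r
x <- c("1 lazy", "1", "1 LAZY")
m <- aregexec("(lay)(sy)", x, max.distance = 2)

# One list element per element of x.
print(length(m) == length(x))

# First element: the whole match followed by the two parenthesized groups.
lens <- attr(m[[1]], "match.length")
starts <- as.integer(m[[1]])  # as.integer() drops the attribute

# Manual extraction; regmatches(x, m) performs this step for the whole list.
pieces <- substring(x[1], starts, starts + lens - 1)
print(pieces)

# Elements without a match hold -1, with match.length -1 as well.
print(m[[2]])
```

In practice regmatches is the more convenient route; the manual version is shown only to make the role of the "match.length" attribute explicit.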
See regmatches for extracting the matched substrings.
## Cf. the examples for agrep.
x <- c("1 lazy", "1", "1 LAZY")
aregexec("laysy", x, max.distance = 2)
aregexec("(lay)(sy)", x, max.distance = 2)
aregexec("(lay)(sy)", x, max.distance = 2, ignore.case = TRUE)
m <- aregexec("(lay)(sy)", x, max.distance = 2)
regmatches(x, m)
Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.