A capturing group acts like the grouping operator in JavaScript expressions, allowing you to use a subpattern as a single atom.
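Capturing groups are numbered by the order of their opening parentheses. The first capturing group is numbered 1, the second 2, and so on. Named capturing groups are also capturing groups and are numbered together with other (unnamed) capturing groups. The information of the capturing group's match can be accessed through the array returned by RegExp.prototype.exec(), String.prototype.match(), and String.prototype.matchAll(); the replacement argument of String.prototype.replace() and String.prototype.replaceAll(); and backreferences within the same pattern.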
Note: Even in exec()'s result array, capturing groups are accessed by numbers 1, 2, etc., because the 0 element is the entire match. \0 is not a backreference, but a character escape for the NUL character.
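As a minimal sketch of the numbering rules above, the following uses a made-up date pattern that mixes unnamed and named groups. The variable names and sample strings are illustrative assumptions, not part of any API.

```javascript
// Hypothetical pattern: unnamed group, named group, unnamed group.
// All three are numbered together by the order of their opening parentheses.
const datePattern = /(\d{4})-(?<month>\d{2})-(\d{2})/;
const dateMatch = datePattern.exec("on 2024-05-17");

console.log(dateMatch[0]); // "2024-05-17" (index 0 is the entire match)
console.log(dateMatch[1]); // "2024" (group 1)
console.log(dateMatch[2]); // "05" (group 2, the named group)
console.log(dateMatch[3]); // "17" (group 3)
console.log(dateMatch.groups.month); // "05" (same capture, accessed by name)

// Inside the pattern, \1 is a backreference to group 1.
const doubled = /(\w)\1/.exec("hello");
console.log(doubled); // ["ll", "l"]

// In a replacement string, $1 and $2 refer to groups 1 and 2.
const swapped = "John Smith".replace(/(\w+) (\w+)/, "$2, $1");
console.log(swapped); // "Smith, John"

// \0 is not a backreference; it matches the NUL character.
const nulMatches = /\0/.test("a\0b");
console.log(nulMatches); // true
```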
Capturing groups in the regex source code correspond to their results one-to-one. If a capturing group is not matched (for example, it belongs to an unmatched alternative in a disjunction), the corresponding result is undefined.
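A small sketch of the one-to-one correspondence, using an illustrative two-alternative pattern: the result array always has one slot per capturing group, even when a group does not participate in the match.

```javascript
// Only the second alternative matches "b", so group 1 is undefined.
const altMatch = /(a)|(b)/.exec("b");

console.log(altMatch.length); // 3 (entire match plus two groups)
console.log(altMatch[0]); // "b"
console.log(altMatch[1]); // undefined
console.log(altMatch[2]); // "b"
```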
Capturing groups can be quantified. In this case, the match information corresponding to this group is the last match of the group.
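For illustration, the sketch below quantifies a single-digit group. Only the last iteration is kept in the result. The second half shows one possible way to collect every iteration, by matching the subpattern on its own with matchAll(); this is an assumption about what a caller might want, not the only approach.

```javascript
// The group matches "1", then "2", then "3"; only the last match is kept.
const digitMatch = /(\d)+/.exec("123");
console.log(digitMatch[0]); // "123"
console.log(digitMatch[1]); // "3"

// To keep every iteration, match the subpattern itself with the g flag.
const allDigits = Array.from("123".matchAll(/\d/g), (m) => m[0]);
console.log(allDigits); // ["1", "2", "3"]
```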
Capturing groups can be used in lookahead and lookbehind assertions. Because lookbehind assertions match their atoms backwards, the final match of a quantified group inside a lookbehind is the one closest to the left end of the string. However, the indices of the match groups still correspond to their relative locations in the regex source.
/c(?=(ab))/.exec("cab"); // ['c', 'ab']
/(?<=(a)(b))c/.exec("abc"); // ['c', 'a', 'b']
/(?<=([ab])+)c/.exec("abc"); // ['c', 'a']; the lookbehind sees "b" first, then "a"
Capturing groups can be nested, in which case the outer group is numbered first, then the inner group, because they are ordered by their opening parentheses. If a nested group is repeated by a quantifier, then each time the group matches, the subgroups' results are all overwritten, sometimes with undefined.
/((a+)?(b+)?(c))*/.exec("aacbbbcac"); // ['aacbbbcac', 'ac', 'a', undefined, 'c']
In the example above, the outer group is matched three times:
- Matches "aac", with subgroups "aa", undefined, and "c".
- Matches "bbbc", with subgroups undefined, "bbb", and "c".
- Matches "ac", with subgroups "a", undefined, and "c".
The "bbb" result from the second match is not preserved, because the third match overwrites it with undefined.
You can get the start and end indices of each capturing group in the input string by using the d flag. This creates an extra indices property on the array returned by exec().
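A brief sketch of the d flag, assuming a runtime that supports it. The pattern and input are illustrative. Each entry in indices is a [start, end] pair, with the end index exclusive, so it can be passed straight to slice().

```javascript
// Input positions: x=0, a=1, a=2, b=3, b=4, b=5
const indexedMatch = /(a+)(b+)/d.exec("xaabbb");

console.log(indexedMatch.indices[0]); // [1, 6] (entire match "aabbb")
console.log(indexedMatch.indices[1]); // [1, 3] (group 1, "aa")
console.log(indexedMatch.indices[2]); // [3, 6] (group 2, "bbb")

// The pair can be spread into slice() to recover the captured text.
const recovered = "xaabbb".slice(...indexedMatch.indices[2]);
console.log(recovered); // "bbb"
```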
You can optionally give a capturing group a name, which helps avoid pitfalls related to group positions and indexing. See Named capturing groups for more information.
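To illustrate the positional pitfall, the sketch below compares two hypothetical time patterns. Adding a group at the front shifts every numeric index, while access by name keeps working unchanged.

```javascript
// Original pattern: hours is group 1, minutes is group 2.
const timeMatch = /(?<hours>\d{2}):(?<minutes>\d{2})/.exec("at 09:30");
console.log(timeMatch[1]); // "09"
console.log(timeMatch.groups.hours); // "09"

// A new optional group at the front shifts the numbers by one.
const shiftedMatch = /(at )?(?<hours>\d{2}):(?<minutes>\d{2})/.exec("at 09:30");
console.log(shiftedMatch[1]); // "at " (no longer the hours)
console.log(shiftedMatch[2]); // "09"
console.log(shiftedMatch.groups.hours); // "09" (name-based access is unaffected)
console.log(shiftedMatch.groups.minutes); // "30"
```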
Parentheses have other purposes in different regex syntaxes. For example, they also enclose lookahead and lookbehind assertions. Because these syntaxes all start with ?, and ? is a quantifier which normally cannot occur directly after (, this does not lead to ambiguities.
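As a rough sketch of the distinction, the example below contrasts a lookahead with a plain capturing group, then checks that a bare ? directly after ( is rejected. The patterns are illustrative, and the exact error message varies by engine, so only the error type is inspected.

```javascript
// A lookahead uses parentheses but does not add a slot to the result.
const lookaheadMatch = /a(?=b)/.exec("ab");
console.log(lookaheadMatch.length); // 1
console.log(lookaheadMatch[0]); // "a" (the "b" is checked but not consumed)

// A plain capturing group adds a slot.
const captureMatch = /a(b)/.exec("ab");
console.log(captureMatch.length); // 2
console.log(captureMatch[1]); // "b"

// "(?" with nothing valid after it is a syntax error, not a quantified group.
let bareQuestionError = null;
try {
  new RegExp("(?)");
} catch (err) {
  bareQuestionError = err;
}
console.log(bareQuestionError instanceof SyntaxError); // true
```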