REGEXP_INSTR was introduced in MariaDB 10.0.5.
REGEXP_INSTR(subject, pattern)
Returns the position of the first occurrence of the regular expression pattern
in the string subject
, or 0 if pattern was not found.
The positions start with 1 and are measured in characters (i.e. not in bytes), which is important for multi-byte character sets. You can cast a multi-byte character set to BINARY to get offsets in bytes.
The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.
The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.
MariaDB 10.0.5 switched to the PCRE regular expression library for enhanced regular expression performance, and REGEXP_INSTR was introduced as part of this enhancement.
SELECT REGEXP_INSTR('abc','b'); -> 2 SELECT REGEXP_INSTR('abc','x'); -> 0 SELECT REGEXP_INSTR('BJÖRN','N'); -> 5
Casting a multi-byte character set as BINARY to get offsets in bytes:
SELECT REGEXP_INSTR(BINARY 'BJÖRN','N') AS cast_utf8_to_binary; -> 6
Case sensitivity:
SELECT REGEXP_INSTR('ABC','b'); -> 2 SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'b'); -> 0 SELECT REGEXP_INSTR(BINARY'ABC','b'); -> 0 SELECT REGEXP_INSTR('ABC','(?-i)b'); -> 0 SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'(?i)b'); -> 2
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/regexp_instr/