Prior to MariaDB 5.3, the index_merge
access method supported union
, sort-union
, and intersection
operations. Starting from MariaDB 5.3, the sort-intersection
operation is also supported. This allows the use of index_merge
in a broader number of cases.
This feature is disabled by default. To enable it, turn on the optimizer switch index_merge_sort_intersection
like so:
SET optimizer_switch='index_merge_sort_intersection=on'
Prior to MariaDB 5.3, the index_merge
access method had one intersection strategy called intersection
. That strategy can only be used when merged index scans produced rowid-ordered streams. In practice this means that an intersection
could only be constructed from equality (=) conditions.
For example, the following query will use intersection
:
MySQL [ontime]> EXPLAIN SELECT AVG(arrdelay) FROM ontime WHERE depdel15=1 AND OriginState ='CA'; +--+-----------+------+-----------+--------------------+--------------------+-------+----+-----+-------------------------------------------------+ |id|select_type|table |type |possible_keys |key |key_len|ref |rows |Extra | +--+-----------+------+-----------+--------------------+--------------------+-------+----+-----+-------------------------------------------------+ | 1|SIMPLE |ontime|index_merge|OriginState,DepDel15|OriginState,DepDel15|3,5 |NULL|76952|Using intersect(OriginState,DepDel15);Using where| +--+-----------+------+-----------+--------------------+--------------------+-------+----+-----+-------------------------------------------------+
but if you replace OriginState ='CA'
with OriginState IN ('CA', 'GB')
(which matches the same number of records), then intersection
is not usable anymore:
MySQL [ontime]> explain select avg(arrdelay) from ontime where depdel15=1 and OriginState IN ('CA', 'GB'); +--+-----------+------+----+--------------------+--------+-------+-----+-----+-----------+ |id|select_type|table |type|possible_keys |key |key_len|ref |rows |Extra | +--+-----------+------+----+--------------------+--------+-------+-----+-----+-----------+ | 1|SIMPLE |ontime|ref |OriginState,DepDel15|DepDel15|5 |const|36926|Using where| +--+-----------+------+----+--------------------+--------+-------+-----+-----+-----------+
The latter query would also run 5.x times slower (from 2.2 to 10.8 seconds) in our experiments.
In MariaDB 5.3, when index_merge_sort_intersection
is enabled, index_merge
intersection plans can be constructed from non-equality conditions:
MySQL [ontime]> explain select avg(arrdelay) from ontime where depdel15=1 and OriginState IN ('CA', 'GB'); +--+-----------+------+-----------+--------------------+--------------------+-------+----+-----+--------------------------------------------------------+ |id|select_type|table |type |possible_keys |key |key_len|ref |rows |Extra | +--+-----------+------+-----------+--------------------+--------------------+-------+----+-----+--------------------------------------------------------+ | 1|SIMPLE |ontime|index_merge|OriginState,DepDel15|DepDel15,OriginState|5,3 |NULL|60754|Using sort_intersect(DepDel15,OriginState); Using where | +--+-----------+------+-----------+--------------------+--------------------+-------+----+-----+--------------------------------------------------------+
In our tests, this query ran in 3.2 seconds, which is not as good as the case with two equalities, but still much better than 10.8 seconds we were getting without sort_intersect
.
The sort_intersect
strategy has higher overhead than intersect
but is able to handle a broader set of WHERE
conditions.
index_merge/sort_intersection
works best on tables with lots of records and where intersections are sufficiently large (but still small enough to make a full table scan overkill).
The benefit is expected to be bigger for io-bound loads.
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/index_merge_sort_intersection/