MariaDB 5.3 has an optimizer debugging patch. The patch is pushed into
lp:maria-captains/maria/5.3-optimizer-debugging
The patch is wrapped in #ifdef, but there is a #define straight in mysql_priv.h so simply compiling that tree should produce a binary with optimizer debugging enabled.
The patch adds two system variables:
@@debug_optimizer_prefer_join_prefix
@@debug_optimizer_dupsweedout_penalized
the variables are present as session/global variables, and are also settable via the server command line.
If this variable is non-NULL, it is assumed to specify a join prefix as a comma-separated list of table aliases:
set debug_optimizer_prefer_join_prefix='tbl1,tbl2,tbl3';
The optimizer will try its best to build a join plan which matches the specified join prefix. It does this by comparing join prefixes it is considering with @@debug_optimizer_prefer_join_prefix
, and multiplying cost by a million if the plan doesn't match the prefix.
As a result, you can more-or-less control the join order. For example, let's take this query:
MariaDB [test]> explain select * from ten A, ten B, ten C; +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | | 1 | SIMPLE | C | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ 3 rows in set (0.00 sec)
and request a join order of C,A,B:
MariaDB [test]> set debug_optimizer_prefer_join_prefix='C,A,B'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A, ten B, ten C; +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | 1 | SIMPLE | C | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ 3 rows in set (0.00 sec)
We got it.
Note that this is still a best-effort approach:
It is possible to force the join order of joins plus semi-joins. This may cause a different strategy to be used:
MariaDB [test]> set debug_optimizer_prefer_join_prefix=NULL; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A where a in (select B.a from ten B, ten C where C.a + A.a < 4); +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | PRIMARY | B | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | PRIMARY | C | ALL | NULL | NULL | NULL | NULL | 10 | Using where; FirstMatch(A) | +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ 3 rows in set (0.00 sec) MariaDB [test]> set debug_optimizer_prefer_join_prefix='C,A,B'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A where a in (select B.a from ten B, ten C where C.a + A.a < 4); +----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | 1 | PRIMARY | C | ALL | NULL | NULL | NULL | NULL | 10 | Start temporary | | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 10 | Using where; Using join buffer (flat, BNL join) | | 1 | PRIMARY | B | ALL | NULL | NULL | NULL | NULL | 10 | Using where; End temporary | +----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ 3 rows in set (0.00 sec)
Semi-join materialization is a somewhat special case, because "join prefix" is not exactly what you see in the EXPLAIN output. For semi-join materialization:
<subqueryN>
" into @@debug_optimizer_prefer_join_prefix
<subqueryN>
line. MariaDB [test]> set debug_optimizer_prefer_join_prefix='A,C,B,AA'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A, ten AA where A.a in (select B.a from ten B, ten C); +----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+ | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 5 | func | 1 | | | 1 | PRIMARY | AA | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | | 2 | SUBQUERY | B | ALL | NULL | NULL | NULL | NULL | 10 | | | 2 | SUBQUERY | C | ALL | NULL | NULL | NULL | NULL | 10 | | +----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+ 5 rows in set (0.00 sec)
but we get A-B-C-AA.
There are four semi-join execution strategies:
FirstMatch
Materialization
LooseScan
DuplicateWeedout
The first three strategies have flags in @@optimizer_switch that can be used to disable them. The DuplicateWeedout
strategy does not have a flag. This was done for a reason, as that strategy is the catch-all strategy and it can handle all kinds of subqueries, in all kinds of join orders. (We're slowly moving to the point where it will be possible to run with FirstMatch
enabled and everything else disabled but we are not there yet.)
Since DuplicateWeedout
cannot be disabled, there are cases where it "gets in the way" by being chosen over the strategy you need. This is what debug_optimizer_dupsweedout_penalized
is for. if you set:
MariaDB [test]> set debug_optimizer_dupsweedout_penalized=TRUE;
...the costs of query plans that use DuplicateWeedout
will be multiplied by a millon. This doesn't mean that you will get rid of DuplicateWeedout
— due to Bug #898747 it is still possible to have DuplicateWeedout
used even if a cheaper plan exits. A partial remedy to this is to run with
MariaDB [test]> set optimizer_prune_level=0;
It is possible to use both debug_optimizer_dupsweedout_penalized
and debug_optimizer_prefer_join_prefix
at the same time. This should give you the desired strategy and join order.
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/mariadb-53-optimizer-debugging/