The goal of the subquery cache is to optimize the evaluation of correlated subqueries by storing results together with correlation parameters in a cache and avoiding re-execution of the subquery in cases where the result is already in the cache.
Starting in MariaDB 5.3.2-beta the cache is on by default. In previous versions of MariaDB, the cache was off by default. One can switch it on or off using optimizer_switch subquery_cache
like so:
SET optimizer_switch='subquery_cache=on';
The efficiency of the subquery cache is visible in 2 statistical variables:
The session variables tmp_table_size and max_heap_table_size influence the size of in-memory temporary tables in the table used for caching. It cannot grow more than the minimum of the above variables values (see the Implementation section for details).
Your usage of the cache is visible in EXTENDED EXPLAIN
output (warnings) as "<expr_cache><//list of parameters//>(//cached expression//)"
. For example:
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t1 WHERE a IN (SELECT b FROM t2); +----+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where | +----+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec) MariaDB [test]> SHOW WARNINGS; +-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | SELECT `test`.`t1`.`a` AS `a` from `test`.`t1` WHERE <expr_cache><`test`.`t1`.`a`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(SELECT 1 FROM `test`.`t2` WHERE (<cache>(`test`.`t1`.`a`) = `test`.`t2`.`b`)))) | +-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
In the example above the presence of "<expr_cache><`test`.`t1`.`a`>(...)"
is how you know you are using the subquery cache.
Every subquery cache creates a temporary table where the results and all parameters are stored. It has a unique index over all parameters. First the cache is created in a MEMORY table (if doing this is impossible the cache becomes disabled for that expression). When the table grows up to the minimum of tmp_table_size
and max_heap_table_size
, the hit rate will be checked:
hit rate = hit / (hit + miss)
Here are some examples that show the performance impact of the subquery cache (these tests were made on a 2.53 GHz Intel Core 2 Duo MacBook Pro with dbt-3 scale 1 data set).
example | cache on | cache off | gain | hit | miss | hit rate |
---|---|---|---|---|---|---|
1 | 1.01sec | 1 hour 31 min 43.33sec | 5445x | 149975 | 25 | 99.98% |
2 | 0.21sec | 1.41sec | 6.71x | 6285 | 220 | 96.6% |
3 | 2.54sec | 2.55sec | 1.00044x | 151 | 461 | 24.67% |
4 | 1.87sec | 1.95sec | 0.96x | 0 | 23026 | 0% |
Dataset from DBT-3 benchmark, a query to find customers with balance near top in their nation:
select count(*) from customer where c_acctbal > 0.8 * (select max(c_acctbal) from customer C where C.c_nationkey=customer.c_nationkey group by c_nationkey);
DBT-3 benchmark, Query #17
select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#42' and p_container = 'JUMBO BAG' and l_quantity < (select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey);
DBT-3 benchmark, Query #2
select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 33 and p_type like '%STEEL' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and ps_supplycost = ( select min(ps_supplycost) from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' ) order by s_acctbal desc, n_name, s_name, p_partkey;
DBT-3 benchmark, Query #20
select s_name, s_address from supplier, nation where s_suppkey in ( select distinct (ps_suppkey) from partsupp, part where ps_partkey=p_partkey and p_name like 'indian%' and ps_availqty > ( select 0.5 * sum(l_quantity) from lineitem where l_partkey = ps_partkey and l_suppkey = ps_suppkey and l_shipdate >= '1995-01-01' and l_shipdate < date_ADD('1995-01-01',interval 1 year) ) ) and s_nationkey = n_nationkey and n_name = 'JAPAN' order by s_name;
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/subquery-cache/