Known things about DBT-3 benchmark and its queries
A simple, one-table query.
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date_sub('1998-12-01', interval 79 day)
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
Query plan:
+------+-------------+----------+------+---------------+------+---------+------+----------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+----------+------+---------------+------+---------+------+----------+----------------------------------------------+ | 1 | SIMPLE | lineitem | ALL | i_l_shipdate | NULL | NULL | NULL | 59711977 | Using where; Using temporary; Using filesort | +------+-------------+----------+------+---------------+------+---------+------+----------+----------------------------------------------+
(l_returnflag,l_linestatus). This means, sorting doesn't matter, and temporary table is a very small heap table. plans starting with table "part" ... - 8sec on cold cache. scale=10 (scale=30 ETA 1min)
select
l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue,
o_orderdate, o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = 'BUILDING' and c_custkey = o_custkey
and l_orderkey = o_orderkey and o_orderdate < date '1995-03-15'
and l_shipdate > date '1995-03-15'
group by l_orderkey, o_orderdate, o_shippriority
order by revenue desc, o_orderdate
limit 10;
There seems to be an improvement in mysql-5.6: http://jorgenloland.blogspot.ru/2013/02/dbt-3-q3-6-x-performance-in-mysql-5610.html
(speedup can be observed only when the query is in the form like the above (TODO: figure out where do different forms of queries come from?))
EXPLAINs (scale=10):
+------+-------------+----------+--------+---------------------------------------------------------+---------+---------+----------------------------+----------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+----------+--------+---------------------------------------------------------+---------+---------+----------------------------+----------+----------------------------------------------+ | 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 15115145 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | customer | eq_ref | PRIMARY | PRIMARY | 4 | dbt3sf10.orders.o_custkey | 1 | Using where | | 1 | SIMPLE | lineitem | ref | PRIMARY,i_l_shipdate,i_l_orderkey,i_l_orderkey_quantity | PRIMARY | 4 | dbt3sf10.orders.o_orderkey | 2 | Using where | +------+-------------+----------+--------+---------------------------------------------------------+---------+---------+----------------------------+----------+----------------------------------------------+
+------+-------------+----------+--------+---------------------------------------------------------+----------------+---------+----------------------------+---------+---------------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+----------+--------+---------------------------------------------------------+----------------+---------+----------------------------+---------+---------------------------------------------------------------------+ | 1 | SIMPLE | orders | range | PRIMARY,i_o_date_clerk | i_o_date_clerk | 4 | NULL | 7557572 | Using index condition; Using where; Using temporary; Using filesort | | 1 | SIMPLE | customer | eq_ref | PRIMARY | PRIMARY | 4 | dbt3sf10.orders.o_custkey | 1 | Using where | | 1 | SIMPLE | lineitem | ref | PRIMARY,i_l_shipdate,i_l_orderkey,i_l_orderkey_quantity | PRIMARY | 4 | dbt3sf10.orders.o_orderkey | 2 | Using where | +------+-------------+----------+--------+---------------------------------------------------------+----------------+---------+----------------------------+---------+---------------------------------------------------------------------+
With statistics on 'building': we get a plan of customer, orders, lineitem. It is 5% worse. There seems to be no other possibilities.
Nothing good so far.
watch the "c_nationkey = s_nationkey" condition. It is a "side" condition (ie it is not from the "natural" relationships between tables). It is not clear whether accounting for its selectivity will give anything)
Timour is analyzing this query.
Timour is analyzing this query.
+------+-------------+----------+--------+----------------------------------------------------------------------------------------+---------------------+---------+--------------------------------------------------+------+----------+---------------------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+----------+--------+----------------------------------------------------------------------------------------+---------------------+---------+--------------------------------------------------+------+----------+---------------------------------------------------------------------------+ | 1 | SIMPLE | nation | ALL | PRIMARY | NULL | NULL | NULL | 25 | 100.00 | Using temporary; Using filesort | | 1 | SIMPLE | supplier | ref | PRIMARY,i_s_nationkey | i_s_nationkey | 5 | dbt3.nation.n_nationkey | 4077 | 100.00 | Using index | | 1 | SIMPLE | partsupp | ref | PRIMARY,i_ps_partkey,i_ps_suppkey | i_ps_suppkey | 4 | dbt3.supplier.s_suppkey | 38 | 100.00 | Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan | | 1 | SIMPLE | part | eq_ref | PRIMARY | PRIMARY | 4 | dbt3.partsupp.ps_partkey | 1 | 100.00 | Using where; Using join buffer (incremental, BKA join); Key-ordered scan | | 1 | SIMPLE | lineitem | ref | PRIMARY,i_l_suppkey_partkey,i_l_partkey,i_l_suppkey,i_l_orderkey,i_l_orderkey_quantity | i_l_suppkey_partkey | 10 | dbt3.partsupp.ps_partkey,dbt3.supplier.s_suppkey | 3 | 100.00 | Using join buffer (incremental, BKA join); Key-ordered Rowid-ordered scan | | 1 | SIMPLE | orders | eq_ref | PRIMARY | PRIMARY | 4 | dbt3.lineitem.l_orderkey | 1 | 100.00 | Using join buffer (incremental, BKA join); Key-ordered scan | +------+-------------+----------+--------+----------------------------------------------------------------------------------------+---------------------+---------+--------------------------------------------------+------+----------+---------------------------------------------------------------------------+
Watch for "p_name like ..." condition. What if we force table part to be the 1st.
SergeyP is analyzing this query.
select c_count, count(*) as custdist from ( select c_custkey, count(o_orderkey) as c_count from customer left outer join orders on c_custkey = o_custkey and o_comment not like '%express%requests%' group by c_custkey ) as c_orders group by c_count order by custdist desc, c_count desc;
SergeyP is analyzing this query.
Timour is analyzing this query.
select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#43' and p_container = 'WRAP DRUM' and l_quantity < (select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey);
+------+--------------------+----------+------+---------------------------------+---------------------+---------+---------------------+---------+----------+------------------------------------------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+--------------------+----------+------+---------------------------------+---------------------+---------+---------------------+---------+----------+------------------------------------------------------------------------------------------------+ | 1 | PRIMARY | part | ALL | PRIMARY | NULL | NULL | NULL | 6000000 | 1.39 | Using where | | 1 | PRIMARY | lineitem | ref | i_l_suppkey_partkey,i_l_partkey | i_l_suppkey_partkey | 5 | dbt3.part.p_partkey | 29 | 100.00 | Using where; Subqueries: 2; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan | | 2 | DEPENDENT SUBQUERY | lineitem | ref | i_l_suppkey_partkey,i_l_partkey | i_l_suppkey_partkey | 5 | dbt3.part.p_partkey | 29 | 100.00 | | +------+--------------------+----------+------+---------------------------------+---------------------+---------+---------------------+---------+----------+------------------------------------------------------------------------------------------------+
explain extended
select sql_calc_found_rows
s_name, s_address
from
supplier, nation
where s_suppkey in (select ps_suppkey from partsupp
where ps_partkey in (select p_partkey from part where p_name like 'g%')
and ps_availqty >
(select 0.5 * sum(l_quantity)
from lineitem
where l_partkey = ps_partkey and l_suppkey = ps_suppkey
and l_shipdate >= date('1993-01-01') and l_shipdate < date('1993-01-01') + interval '1' year ))
and s_nationkey = n_nationkey
and n_name = 'UNITED STATES'
order by s_name
limit 10;
© 2023 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/dbt3-benchmark-queries/