Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-2940

LP:913030 - Optimizer chooses a suboptimal excution plan for Q18 from DBT-3

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      When working with DBT-3 database (scale factor 10) created for MyISAM the optimizer of 5.3.4 chooses a suboptimal execution plan if all optimizer flags are set by default:

      MariaDB [dbt3x10_myisam]> explain
      -> select sql_calc_found_rows
      -> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
      -> from customer, orders, lineitem
      -> where o_orderkey in (select l_orderkey from lineitem
      -> group by l_orderkey having sum(l_quantity) > 300)
      -> and c_custkey = o_custkey and o_orderkey = l_orderkey
      -> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
      -> order by o_totalprice desc, o_orderdate
      -> limit 10;
      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      id select_type table type possible_keys key key_len ref rows Extra

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      1 PRIMARY orders ALL PRIMARY,i_o_custkey NULL NULL NULL 15000000 Using where; Using temporary; Using filesort
      1 PRIMARY customer eq_ref PRIMARY PRIMARY 4 dbt3x10_myisam.orders.o_custkey 1  
      1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 4 dbt3x10_myisam.orders.o_orderkey 1  
      1 PRIMARY lineitem ref PRIMARY,i_l_orderkey,i_l_orderkey_quantity i_l_orderkey_quantity 4 dbt3x10_myisam.orders.o_orderkey 4 Using index
      2 MATERIALIZED lineitem index NULL i_l_orderkey_quantity 13 NULL 59986052 Using index

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      5 rows in set (0.00 sec)

      MariaDB [dbt3x10_myisam]> select sql_calc_found_rows
      -> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
      -> from customer, orders, lineitem
      -> where o_orderkey in (select l_orderkey from lineitem
      -> group by l_orderkey having sum(l_quantity) > 300)
      -> and c_custkey = o_custkey and o_orderkey = l_orderkey
      -> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
      -> order by o_totalprice desc, o_orderdate
      -> limit 10;
      -----------------------------------------------------------------------------+

      c_name c_custkey o_orderkey o_orderdate o_totalprice sum(l_quantity)

      -----------------------------------------------------------------------------+

      Customer#001287812 1287812 42290181 1997-11-26 558289.17 318
      Customer#000644812 644812 2745894 1996-07-04 557664.53 304
      Customer#001172513 1172513 36667107 1997-06-06 550142.18 322
      Customer#000399481 399481 43906817 1995-04-06 549431.65 312
      Customer#000571654 571654 21213895 1992-01-03 549380.08 327
      Customer#000667882 667882 2199712 1996-09-30 542154.01 327
      Customer#001492954 1492954 30332516 1996-03-10 541181.8 310
      Customer#001471966 1471966 1263015 1997-02-02 540476.8 320
      Customer#001082018 1082018 31018979 1995-12-06 537993.05 304
      Customer#001114039 1114039 30417318 1995-10-31 536420.39 305

      -----------------------------------------------------------------------------+
      10 rows in set (11 min 22.22 sec)

      If the 'semijoin' flag of the optimizer switch is set to 'off' the optimizer chooses a better execution plan:

      MariaDB [dbt3x10_myisam]> set optimizer_switch='semijoin=off';
      Query OK, 0 rows affected (0.00 sec)

      MariaDB [dbt3x10_myisam]> explain
      -> select sql_calc_found_rows
      -> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
      -> from customer, orders, lineitem
      -> where o_orderkey in (select l_orderkey from lineitem
      -> group by l_orderkey having sum(l_quantity) > 300)
      -> and c_custkey = o_custkey and o_orderkey = l_orderkey
      -> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
      -> order by o_totalprice desc, o_orderdate
      -> limit 10;
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      id select_type table type possible_keys key key_len ref rows Extra

      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

      1 PRIMARY orders ALL PRIMARY,i_o_custkey NULL NULL NULL 15000000 Using where; Using temporary; Using filesort
      1 PRIMARY customer eq_ref PRIMARY PRIMARY 4 dbt3x10_myisam.orders.o_custkey 1  
      1 PRIMARY lineitem ref PRIMARY,i_l_orderkey,i_l_orderkey_quantity i_l_orderkey_quantity 4 dbt3x10_myisam.orders.o_orderkey 4 Using index
      2 MATERIALIZED lineitem index NULL i_l_orderkey_quantity 13 NULL 59986052 Using index

      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      4 rows in set (0.00 sec)

      MariaDB [dbt3x10_myisam]> select sql_calc_found_rows
      -> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
      -> from customer, orders, lineitem
      -> where o_orderkey in (select l_orderkey from lineitem
      -> group by l_orderkey having sum(l_quantity) > 300)
      -> and c_custkey = o_custkey and o_orderkey = l_orderkey
      -> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
      -> order by o_totalprice desc, o_orderdate
      -> limit 10;
      -----------------------------------------------------------------------------+

      c_name c_custkey o_orderkey o_orderdate o_totalprice sum(l_quantity)

      -----------------------------------------------------------------------------+

      Customer#001287812 1287812 42290181 1997-11-26 558289.17 318
      Customer#000644812 644812 2745894 1996-07-04 557664.53 304
      Customer#001172513 1172513 36667107 1997-06-06 550142.18 322
      Customer#000399481 399481 43906817 1995-04-06 549431.65 312
      Customer#000571654 571654 21213895 1992-01-03 549380.08 327
      Customer#000667882 667882 2199712 1996-09-30 542154.01 327
      Customer#001492954 1492954 30332516 1996-03-10 541181.8 310
      Customer#001471966 1471966 1263015 1997-02-02 540476.8 320
      Customer#001082018 1082018 31018979 1995-12-06 537993.05 304
      Customer#001114039 1114039 30417318 1995-10-31 536420.39 305

      -----------------------------------------------------------------------------+
      10 rows in set (2 min 31.82 sec)

      The second plan is better because the materialized table is joined right after the first table orders.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            Re: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
            The first attempt was this patch:

            http://lists.askmonty.org/pipermail/commits/2012-January/002893.html

            BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3

            • Added SJ_MATERIALIZATION_LOOKUP_MATCH_RATIO=0.75, made the optimizer assume
              that this is the probability that a lookup in semi-join materialized table will
              produce a match.

            .. decided not to push it.

            Show
            psergey Sergei Petrunia added a comment - Re: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3 The first attempt was this patch: http://lists.askmonty.org/pipermail/commits/2012-January/002893.html BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3 Added SJ_MATERIALIZATION_LOOKUP_MATCH_RATIO=0.75, made the optimizer assume that this is the probability that a lookup in semi-join materialized table will produce a match. .. decided not to push it.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
            Commited a patch, requested testing.

            Show
            psergey Sergei Petrunia added a comment - Re: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3 Commited a patch, requested testing.
            Hide
            elenst Elena Stepanova added a comment -

            Re: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
            Fix released in 5.3.6

            Show
            elenst Elena Stepanova added a comment - Re: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3 Fix released in 5.3.6
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 913030

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 913030

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                igor Igor Babaev
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: