Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6402

Optimizer doesn't choose best execution plan when composite key is used.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.11
    • Fix Version/s: 10.1.1
    • Component/s: Optimizer
    • Environment:
      Linux matt001 2.6.18-308.el5 #1 SMP Tue Feb 21 20:06:06 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

      Description

      MariaDB optimizer doesn't choose best execution plan when they use composite key.

      There's composite key with two columns (pk1 + fd5).
      The the query which have (pk1=? and fd5>?) where condition and ORDER BY fd5 clause
      generate plan using only "pk1" column.
      It also happen in MariaDB 5.5.24.

      See below test case.
      I can't upload sample data of table, becaus it's too big and it's also real service data. I think you can generate test data with index cardinality and table status.

      Test case ------------------------------------------------------------------------------

      MariaDB [test]> select version();
      +---------------------+
      | version()           |
      +---------------------+
      | 10.0.11-MariaDB-log |
      +---------------------+
      1 row in set (0.00 sec)
      
      MariaDB [test]> show variables like 'optimizer_switch'\G
      *************************** 1. row ***************************
      Variable_name: optimizer_switch
              Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=off
      1 row in set (0.00 sec)
      
      MariaDB [test]> show create table tb_test\G
      *************************** 1. row ***************************
             Table: tb_test
      Create Table: CREATE TABLE `tb_test` (
        `pk1` int(11) NOT NULL,
        `pk2` int(11) NOT NULL,
        `fd1` int(11) NOT NULL,
        `fd2` bigint(20) NOT NULL DEFAULT '0',
        `fd3` bigint(20) NOT NULL DEFAULT '0',
        `fd4` datetime NOT NULL,
        `fd5` bigint(20) DEFAULT NULL,
        `fd6` varchar(64) DEFAULT NULL,
        `fd7` text,
        `fd8` varchar(64) DEFAULT NULL,
        `fd9` tinyint(1) NOT NULL DEFAULT '1',
        `fd10` bigint(20) NOT NULL DEFAULT '0',
        `fd11` tinyint(1) DEFAULT NULL,
        `fd12` tinyint(1) DEFAULT NULL,
        PRIMARY KEY (`pk1`,`pk2`),
        UNIQUE KEY `ux_pk1_fd5` (`pk1`,`fd5`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
      /*!50100 PARTITION BY KEY (pk1)
      PARTITIONS 5 */
      1 row in set (0.00 sec)
      
      MariaDB [test]> alter table tb_test engine=innodb;
      Query OK, 0 rows affected (7 min 3.85 sec)
      Records: 0  Duplicates: 0  Warnings: 0
      
      MariaDB [test]> show table status like 'tb_test'\G
      *************************** 1. row ***************************
                 Name: tb_test
               Engine: InnoDB
              Version: 10
           Row_format: Compact
                 Rows: 8366199
       Avg_row_length: 96
          Data_length: 806895616
      Max_data_length: 0
         Index_length: 195002368
            Data_free: 0
       Auto_increment: NULL
          Create_time: NULL
          Update_time: NULL
           Check_time: NULL
            Collation: utf8mb4_general_ci
             Checksum: NULL
       Create_options: partitioned
              Comment:
      1 row in set (0.02 sec)
      
      MariaDB [test]> show index from tb_test;
      +---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      | Table   | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
      +---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      | tb_test |          0 | PRIMARY    |            1 | pk1         | A         |     4183099 |     NULL | NULL   |      | BTREE      |         |               |
      | tb_test |          0 | PRIMARY    |            2 | pk2         | A         |     8366199 |     NULL | NULL   |      | BTREE      |         |               |
      | tb_test |          0 | ux_pk1_fd5 |            1 | pk1         | A         |     8366199 |     NULL | NULL   |      | BTREE      |         |               |
      | tb_test |          0 | ux_pk1_fd5 |            2 | fd5         | A         |     8366199 |     NULL | NULL   | YES  | BTREE      |         |               |
      +---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      4 rows in set (0.00 sec)
      
      MariaDB [test]> EXPLAIN SELECT * FROM tb_test USE INDEX(ux_pk1_fd5) WHERE pk1=8287001 AND fd5<91866952691281442 ORDER BY fd5 DESC LIMIT 201,1;
      +------+-------------+---------+-------+---------------+------------+---------+------+-------+-------------+
      | id   | select_type | table   | type  | possible_keys | key        | key_len | ref  | rows  | Extra       |
      +------+-------------+---------+-------+---------------+------------+---------+------+-------+-------------+
      |    1 | SIMPLE      | tb_test | range | ux_pk1_fd5    | ux_pk1_fd5 | 13      | NULL | 14590 | Using where |
      +------+-------------+---------+-------+---------------+------------+---------+------+-------+-------------+
      1 row in set (0.01 sec)
      
      MariaDB [test]> EXPLAIN SELECT * FROM tb_test WHERE pk1=8287001 AND fd5<91866952691281442 ORDER BY fd5 DESC LIMIT 201,1;
      +------+-------------+---------+------+--------------------+------------+---------+-------+-------+-------------+
      | id   | select_type | table   | type | possible_keys      | key        | key_len | ref   | rows  | Extra       |
      +------+-------------+---------+------+--------------------+------------+---------+-------+-------+-------------+
      |    1 | SIMPLE      | tb_test | ref  | PRIMARY,ux_pk1_fd5 | ux_pk1_fd5 | 4       | const | 65108 | Using where |
      +------+-------------+---------+------+--------------------+------------+---------+-------+-------+-------------+
      1 row in set (0.01 sec)
      
      MariaDB [test]> show status like 'Handler_read_prev';
      +-------------------+-------+
      | Variable_name     | Value |
      +-------------------+-------+
      | Handler_read_prev | 0     |
      +-------------------+-------+
      1 row in set (0.02 sec)
      
      MariaDB [test]> SELECT * FROM tb_test WHERE pk1=8287001 AND fd5<91866952691281442 ORDER BY fd5 DESC LIMIT 201,1;
      ...
      1 row in set (0.27 sec)
      
      MariaDB [test]> show status like 'Handler_read_prev';
      +-------------------+-------+
      | Variable_name     | Value |
      +-------------------+-------+
      | Handler_read_prev | 28201 |
      +-------------------+-------+
      1 row in set (0.00 sec)
      
      MariaDB [test]> SELECT * FROM tb_test USE INDEX(ux_pk1_fd5) WHERE pk1=8287001 AND fd5<91866952691281442 ORDER BY fd5 DESC LIMIT 201,1;
      ...
      1 row in set (0.01 sec)
      
      MariaDB [test]> show status like 'Handler_read_prev';
      +-------------------+-------+
      | Variable_name     | Value |
      +-------------------+-------+
      | Handler_read_prev | 28402 |
      +-------------------+-------+
      1 row in set (0.00 sec)
      

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            Test dataset fill script, mdev6402-dataset1.sql

            Show
            psergey Sergei Petrunia added a comment - Test dataset fill script, mdev6402-dataset1.sql
            Hide
            psergey Sergei Petrunia added a comment -

            Observations about cost calculations:

            #rows is very close for the PRIMARY key and for ux_pk1_fd5.
            But the cost of PK access is much smaller. The difference comes from
            ha_innobase::read_time() which uses different cost formulas for PK and
            non-PK (PK costs are much smaller).
            The difference in costs looks unrealistically high.

            Then, test_if_skip_sort_order() doesn't care about costs. It changes to using
            ux_pk1_fd5 based on the fact that ref access would scan the same range there.

            Show
            psergey Sergei Petrunia added a comment - Observations about cost calculations: #rows is very close for the PRIMARY key and for ux_pk1_fd5. But the cost of PK access is much smaller. The difference comes from ha_innobase::read_time() which uses different cost formulas for PK and non-PK (PK costs are much smaller). The difference in costs looks unrealistically high. Then, test_if_skip_sort_order() doesn't care about costs. It changes to using ux_pk1_fd5 based on the fact that ref access would scan the same range there.
            Hide
            psergey Sergei Petrunia added a comment - - edited

            Possible solutions

            == CHECK-RANGE-ALSO ==
            Having switched to ref(const) access, check if the used index has a range
            access also. We know for sure taht range will scan a smaller set of rows.
            (Q: does this mean that range will be cheaper? Is scanning fewer rows with
            index_next() faster than scanning a bit more rows with index_next_same()?)

            == REF-IS-THE-FIRST-STEP ==
            (This makes sense after MDEV-6384). So we've used "equal ranges are scanned"
            as justification to pick a ref(const) access. Now, continue to consider all other ways
            to read rows in the required ordering. Make a cost-based decision between the available options. (Q: how to account for condition selectivity and LIMIT? correlation between
            WHERE clause and the used access method is a big problem when both are highly selective)

            Show
            psergey Sergei Petrunia added a comment - - edited Possible solutions == CHECK-RANGE-ALSO == Having switched to ref(const) access, check if the used index has a range access also. We know for sure taht range will scan a smaller set of rows. (Q: does this mean that range will be cheaper? Is scanning fewer rows with index_next() faster than scanning a bit more rows with index_next_same()?) == REF-IS-THE-FIRST-STEP == (This makes sense after MDEV-6384 ). So we've used "equal ranges are scanned" as justification to pick a ref(const) access. Now, continue to consider all other ways to read rows in the required ordering. Make a cost-based decision between the available options. (Q: how to account for condition selectivity and LIMIT? correlation between WHERE clause and the used access method is a big problem when both are highly selective)
            Hide
            psergey Sergei Petrunia added a comment -

            Re-tried on 10.1 with fixes for MDEV-6384 and MDEV-6657 (https://github.com/MariaDB/server/tree/bb-10.1-mdev6657-r2). The issue is still repeatable.

            Show
            psergey Sergei Petrunia added a comment - Re-tried on 10.1 with fixes for MDEV-6384 and MDEV-6657 ( https://github.com/MariaDB/server/tree/bb-10.1-mdev6657-r2 ). The issue is still repeatable.
            Hide
            psergey Sergei Petrunia added a comment -

            Pushed a patch here: https://github.com/MariaDB/server/tree/bb-10.1-mdev6402

            Elena Stepanova, I need a testing pass for this tree.

            Show
            psergey Sergei Petrunia added a comment - Pushed a patch here: https://github.com/MariaDB/server/tree/bb-10.1-mdev6402 Elena Stepanova , I need a testing pass for this tree.

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                Matt74 Seunguck Lee
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: