Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-761

LP:1010395 - Optimizer pruning chooses very inefficient join plan when all joined tables are similar

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 5.1.67, 5.2.14, 5.3.12, 5.5.36, 10.0.9
    • Fix Version/s: 10.0, 5.5
    • Component/s: None

      Description

      The following test case shows that if join optimization pruning heuristics
      is ON (the default behavior in all MariaDB/MySQL) version, the optimizer
      misses the optimal query plan and chooses a very inefficient query plan.

      CREATE TABLE t1 (a INT, KEY(a));
      INSERT INTO t1 VALUES (7),(5),(1),(204),(224),(9),(5),(0),(3);
      INSERT INTO t1 SELECT al1.* FROM t1 al1, t1 al2;
      
      -- default prune level is 1
      set optimizer_prune_level=1;
      
      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al4   | index | a             | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al3   | ref   | a             | a    | 5       | md321.al4.a |   10 | Using where; Using index       |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';  
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 1531029.266998 |
      +-----------------+----------------+
      

      Above, if we order the tables in the FROM clause according to their optimal order, the optimizer
      finds the optimal plan.

      Below, if the tables are in reverse order, then the optimizer misses the optimal order due to pruning:

      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al3   | index | a             | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al4   | ref   | a             | a    | 5       | md321.al3.a |   10 | Using where; Using index       |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 2420964.599961 |
      +-----------------+----------------+
      

      As shown below, if we switch off the pruning heuristics, the join optimizer finds the optimal
      plan in all cases.

      
      -- switch off pruning heuristics
      set optimizer_prune_level=0;
      
      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al4   | index | a             | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al3   | ref   | a             | a    | 5       | md321.al4.a |   10 | Using where; Using index       |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';  
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 1531029.266998 |
      +-----------------+----------------+
      
      EXPLAIN SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      | id | select_type | table | type  | possible_keys | key  | key_len | ref         | rows | Extra                          |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      |  1 | SIMPLE      | al3   | index | a             | a    | 5       | NULL        |   90 | Using index                    |
      |  1 | SIMPLE      | al4   | ref   | a             | a    | 5       | md321.al3.a |   10 | Using where; Using index       |
      |  1 | SIMPLE      | al2   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      |  1 | SIMPLE      | al1   | index | NULL          | a    | 5       | NULL        |   90 | Using index; Using join buffer |
      +----+-------------+-------+-------+---------------+------+---------+-------------+------+--------------------------------+
      SHOW SESSION STATUS LIKE 'Last_query_cost';
      +-----------------+----------------+
      | Variable_name   | Value          |
      +-----------------+----------------+
      | Last_query_cost | 1531029.266998 |
      +-----------------+----------------+
      

      Notice that since al3 and al4 are exactly the same, the order <al3, al4> is the same as <al4, al3>.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            timour Timour Katchaounov added a comment - - edited

            Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar
            Test case:

            CREATE TABLE t1 (a INT, KEY(a));
            INSERT INTO t1 VALUES (7),(5),(1),(204),(224),(9),(5),(0),(3);
            
            INSERT INTO t1 SELECT al1.* FROM t1 al1, t1 al2;
            
            # default prune level is 1
            set optimizer_prune_level=1;
            
            EXPLAIN
            SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
            SHOW SESSION STATUS LIKE 'Last_query_cost';  
            
            EXPLAIN
            SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
            SHOW SESSION STATUS LIKE 'Last_query_cost';
            
            # switch off pruning heuristics
            set optimizer_prune_level=0;
            
            EXPLAIN
            SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a;
            SHOW SESSION STATUS LIKE 'Last_query_cost';  
            
            EXPLAIN
            SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a;
            SHOW SESSION STATUS LIKE 'Last_query_cost';
            
            Show
            timour Timour Katchaounov added a comment - - edited Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar Test case: CREATE TABLE t1 (a INT, KEY(a)); INSERT INTO t1 VALUES (7),(5),(1),(204),(224),(9),(5),(0),(3); INSERT INTO t1 SELECT al1.* FROM t1 al1, t1 al2; # default prune level is 1 set optimizer_prune_level=1; EXPLAIN SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost'; EXPLAIN SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost'; # switch off pruning heuristics set optimizer_prune_level=0; EXPLAIN SELECT MAX(al1.a) FROM t1 AS al4, t1 AS al3, t1 AS al2, t1 AS al1 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost'; EXPLAIN SELECT MAX(al1.a) FROM t1 AS al1, t1 AS al2, t1 AS al3, t1 AS al4 WHERE al4.a = al3.a; SHOW SESSION STATUS LIKE 'Last_query_cost';
            Hide
            timour Timour Katchaounov added a comment -

            Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar
            The bug is reproducible in any MariaDB/MySQL version since at least MariaDB 5.2.
            This is a legacy bug that has existed since at least 2005, therefore I set its importance
            to "medium".

            Show
            timour Timour Katchaounov added a comment - Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar The bug is reproducible in any MariaDB/MySQL version since at least MariaDB 5.2. This is a legacy bug that has existed since at least 2005, therefore I set its importance to "medium".
            Hide
            timour Timour Katchaounov added a comment -

            Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar
            According to Igor on IRC:
            1. when pruning we should take into account the cardinality of the partial join, not only its cost. this will require to change one condition in the code.
            2. jorgen changed initial sorting to improve plans when pruning set to default (in MySQL)

            Show
            timour Timour Katchaounov added a comment - Re: Optimizer pruning chooses very inefficient join plan when all joined tables are similar According to Igor on IRC: 1. when pruning we should take into account the cardinality of the partial join, not only its cost. this will require to change one condition in the code. 2. jorgen changed initial sorting to improve plans when pruning set to default (in MySQL)
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 1010395

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 1010395
            Hide
            elenst Elena Stepanova added a comment -

            Also reproducible on MySQL 5.1-5.7.

            Show
            elenst Elena Stepanova added a comment - Also reproducible on MySQL 5.1-5.7.

              People

              • Assignee:
                Unassigned
                Reporter:
                timour Timour Katchaounov
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated: