Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7220

Materialization strategy is not used for REPLACE ... SELECT

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.15
    • Fix Version/s: 10.0.17
    • Component/s: Optimizer
    • Labels:
      None

      Description

      Materialization strategy is not used for REPLACE ... SELECT. Filing this based on CSC#8345.

      create table t1 (a int, b int, c int);
      insert into t1 select a,a,a from test.one_k A;
      
      create table t2 (a int, b int, c int);
      
      insert into t2 select A.a, A.a, A.a from test.one_k A;
      insert into t2 select * from t2;
      insert into t2 select * from t2;
      insert into t2 select * from t2;
      insert into t2 select * from t2;
      

      Let's make a query with a subquery that is better executed with Materialization:

      explain
      select * from t1 where (a,b) in (select max(a),b from t2 group by b);
      +------+--------------+-------------+--------+---------------+--------------+---------+-------------------+-------+-----------------+
      | id   | select_type  | table       | type   | possible_keys | key          | key_len | ref               | rows  | Extra           |
      +------+--------------+-------------+--------+---------------+--------------+---------+-------------------+-------+-----------------+
      |    1 | PRIMARY      | t1          | ALL    | NULL          | NULL         | NULL    | NULL              |  1000 | Using where     |
      |    1 | PRIMARY      | <subquery2> | eq_ref | distinct_key  | distinct_key | 8       | j12.t1.a,j12.t1.b |     1 |                 |
      |    2 | MATERIALIZED | t2          | ALL    | NULL          | NULL         | NULL    | NULL              | 16000 | Using temporary |
      +------+--------------+-------------+--------+---------------+--------------+---------+-------------------+-------+-----------------+
      

      Ok.
      Now, let's run a REPLACE ... SELECT for it:

      create table t3 as select * from t1 limit 1;
      
      replace into t3
      select * from t1 where (a,b) in (select max(a),b from t2 group by b);
      
      # Query OK, 1000 rows affected (3 min 30.18 sec)
      # Records: 1000  Duplicates: 0  Warnings: 0
      

      3 min looks like it is too long for the above EXPLAIN plan. Indeed, SHOW EXPLAIN shows:

      MariaDB [(none)]> show explain for 2;
      +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
      | id   | select_type        | table | type | possible_keys | key  | key_len | ref  | rows  | Extra           |
      +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
      |    1 | PRIMARY            | t1    | ALL  | NULL          | NULL | NULL    | NULL |  1000 | Using where     |
      |    2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL | 15794 | Using temporary |
      +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
      

      It is using IN->EXISTS conversion, not materialization.

      CREATE TABLE ... SELECT is not affected:

      MariaDB [j12]> create table tmp1 as select * from t1 where (a,b) in (select max(a),b from t2 group by b);
      Query OK, 1000 rows affected (0.84 sec)
      Records: 1000  Duplicates: 0  Warnings: 0
      

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              psergey Sergei Petrunia added a comment -

              Elena Stepanova, could you run tests with the above commit (it is for current 10.0). The interesting test mixes would be those that use subquery materialization (need uncorrelated IN subqueries), and non-SELECT statements.

              Show
              psergey Sergei Petrunia added a comment - Elena Stepanova , could you run tests with the above commit (it is for current 10.0). The interesting test mixes would be those that use subquery materialization (need uncorrelated IN subqueries), and non-SELECT statements.
              Hide
              elenst Elena Stepanova added a comment -

              With the same test case as in the description, but instead of REPLACE, I do

              update t3 set c = 100  where (a,b) in (select max(a),b from t2 group by b);
              

              Isn't it supposed to use materialization now, too? It doesn't:

              MariaDB [test]> explain update t3 set c = 100  where (a,b) in (select max(a),b from t2 group by b);
              +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
              | id   | select_type        | table | type | possible_keys | key  | key_len | ref  | rows  | Extra           |
              +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
              |    1 | PRIMARY            | t3    | ALL  | NULL          | NULL | NULL    | NULL |  1001 | Using where     |
              |    2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL | 15794 | Using temporary |
              +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
              2 rows in set (0.00 sec)
              

              and

              MariaDB [test]> show processlist;
              +----+------+-----------------+------+---------+------+----------------------+----------------------------------------------------------------------------+----------+
              | Id | User | Host            | db   | Command | Time | State                | Info                                                                       | Progress |
              +----+------+-----------------+------+---------+------+----------------------+----------------------------------------------------------------------------+----------+
              |  6 | root | localhost:48852 | test | Query   |  120 | Copying to tmp table | update t3 set c = 100  where (a,b) in (select max(a),b from t2 group by b) |    0.000 |
              |  7 | root | localhost:48875 | test | Query   |    0 | init                 | show processlist                                                           |    0.000 |
              +----+------+-----------------+------+---------+------+----------------------+----------------------------------------------------------------------------+----------+
              2 rows in set (0.00 sec)
              
              MariaDB [test]> show explain for 6;
              +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
              | id   | select_type        | table | type | possible_keys | key  | key_len | ref  | rows  | Extra           |
              +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
              |    1 | PRIMARY            | t3    | ALL  | NULL          | NULL | NULL    | NULL |  1001 | Using where     |
              |    2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL | 15794 | Using temporary |
              +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+
              2 rows in set, 1 warning (0.00 sec)
              
              Show
              elenst Elena Stepanova added a comment - With the same test case as in the description, but instead of REPLACE, I do update t3 set c = 100 where (a,b) in ( select max(a),b from t2 group by b); Isn't it supposed to use materialization now, too? It doesn't: MariaDB [test]> explain update t3 set c = 100 where (a,b) in ( select max(a),b from t2 group by b); +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+ | 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 1001 | Using where | | 2 | DEPENDENT SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 15794 | Using temporary | +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+ 2 rows in set (0.00 sec) and MariaDB [test]> show processlist; +----+------+-----------------+------+---------+------+----------------------+----------------------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+------+-----------------+------+---------+------+----------------------+----------------------------------------------------------------------------+----------+ | 6 | root | localhost:48852 | test | Query | 120 | Copying to tmp table | update t3 set c = 100 where (a,b) in ( select max(a),b from t2 group by b) | 0.000 | | 7 | root | localhost:48875 | test | Query | 0 | init | show processlist | 0.000 | +----+------+-----------------+------+---------+------+----------------------+----------------------------------------------------------------------------+----------+ 2 rows in set (0.00 sec) MariaDB [test]> show explain for 6; +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+ | 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 1001 | Using where | | 2 | DEPENDENT SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 15794 | Using temporary | +------+--------------------+-------+------+---------------+------+---------+------+-------+-----------------+ 2 rows in set, 1 warning (0.00 sec)
              Hide
              elenst Elena Stepanova added a comment -

              We discussed earlier that it's supposed to be looked at; re-assigning back so that it doesn't get lost.

              I ran a number of test rounds, didn't get wrong results or crashes. I didn't run performance tests since when DML is involved, it's too volatile to be run in an environment not specifically tuned for benchmarking.

              The issue above is what I got from manual testing.
              If it's not a bug or is a different bug, please push your bugfix for MDEV-7220.

              Show
              elenst Elena Stepanova added a comment - We discussed earlier that it's supposed to be looked at; re-assigning back so that it doesn't get lost. I ran a number of test rounds, didn't get wrong results or crashes. I didn't run performance tests since when DML is involved, it's too volatile to be run in an environment not specifically tuned for benchmarking. The issue above is what I got from manual testing. If it's not a bug or is a different bug, please push your bugfix for MDEV-7220 .
              Hide
              psergey Sergei Petrunia added a comment -

              Elena Stepanova, the difference between the statements is quite big under the hood. REPLACE ... SELECT has a SELECT part (which has a JOIN structure, which allows for semi-join optimizations and so forth). The UPDATE you've found belongs to a class of single-table UPDATE/DELETE statements. We have a task for enabling semi-join optimization for those, MDEV-7487. I guess we should look at non-semijoin optimizations, too.

              Show
              psergey Sergei Petrunia added a comment - Elena Stepanova , the difference between the statements is quite big under the hood. REPLACE ... SELECT has a SELECT part (which has a JOIN structure, which allows for semi-join optimizations and so forth). The UPDATE you've found belongs to a class of single-table UPDATE/DELETE statements. We have a task for enabling semi-join optimization for those, MDEV-7487 . I guess we should look at non-semijoin optimizations, too.
              Hide
              psergey Sergei Petrunia added a comment -

              Ok I will push this fix

              Show
              psergey Sergei Petrunia added a comment - Ok I will push this fix

                People

                • Assignee:
                  psergey Sergei Petrunia
                  Reporter:
                  psergey Sergei Petrunia
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  4 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: