Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4201 LevelDB Storage Engine MS2
  3. MDEV-4341

LevelDB (actually optimizer): Wrong result (duplicate rows) with FROM subquery, range access

    Details

    • Type: Technical task
    • Status: Stalled
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      MySQL [test]> CREATE TABLE t1 (pk INT PRIMARY KEY, i INT, KEY(i)) ENGINE=LevelDB;
      Query OK, 0 rows affected (0.35 sec)
      
      MySQL [test]> INSERT INTO t1 VALUES (1,8),(2,8);
      Query OK, 2 rows affected (0.00 sec)
      Records: 2  Duplicates: 0  Warnings: 0
      
      MySQL [test]> 
      MySQL [test]> SELECT * FROM ( SELECT * FROM t1 ) AS alias
          -> WHERE alias.i IN ( SELECT i FROM t1 WHERE i <= 9 );
      +----+------+
      | pk | i    |
      +----+------+
      |  1 |    8 |
      |  2 |    8 |
      |  1 |    8 |
      |  2 |    8 |
      +----+------+
      4 rows in set (0.00 sec)
      
      MySQL [test]> EXPLAIN EXTENDED
          -> SELECT * FROM ( SELECT * FROM t1 ) AS alias
          -> WHERE alias.i IN ( SELECT i FROM t1 WHERE i <= 9 );
      +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+----------+-------------------------------------+
      | id | select_type | table      | type  | possible_keys | key         | key_len | ref       | rows | filtered | Extra                               |
      +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+----------+-------------------------------------+
      |  1 | PRIMARY     | t1         | range | i             | i           | 5       | NULL      |   10 |   100.00 | Using where; Using index; LooseScan |
      |  1 | PRIMARY     | <derived2> | ref   | <auto_key0>   | <auto_key0> | 5       | test.t1.i |   10 |   100.00 | NULL                                |
      |  2 | DERIVED     | t1         | index | NULL          | i           | 5       | NULL      | 1000 |   100.00 | Using index                         |
      +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+----------+-------------------------------------+
      3 rows in set, 1 warning (0.00 sec)
      
      MySQL [test]> 
      MySQL [test]> SHOW WARNINGS;
      +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Level | Code | Message                                                                                                                                                                                                                                                        |
      +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Note  | 1003 | /* select#1 */ select `alias`.`pk` AS `pk`,`alias`.`i` AS `i` from (/* select#2 */ select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`i` AS `i` from `test`.`t1`) `alias` semi join (`test`.`t1`) where ((`alias`.`i` = `test`.`t1`.`i`) and (`test`.`t1`.`i` <= 9)) |
      +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.00 sec)
      

      MyISAM produces the expected result:

      MySQL [test]> CREATE TABLE t1 (pk INT PRIMARY KEY, i INT, KEY(i)) ENGINE=MyISAM;
      Query OK, 0 rows affected (0.23 sec)
      
      MySQL [test]> INSERT INTO t1 VALUES (1,8),(2,8);
      Query OK, 2 rows affected (0.00 sec)
      Records: 2  Duplicates: 0  Warnings: 0
      
      MySQL [test]> 
      MySQL [test]> SELECT * FROM ( SELECT * FROM t1 ) AS alias
          -> WHERE alias.i IN ( SELECT i FROM t1 WHERE i <= 9 );
      +----+------+
      | pk | i    |
      +----+------+
      |  1 |    8 |
      |  2 |    8 |
      +----+------+
      2 rows in set (0.01 sec)
      
      MySQL [test]> EXPLAIN EXTENDED
          -> SELECT * FROM ( SELECT * FROM t1 ) AS alias
          -> WHERE alias.i IN ( SELECT i FROM t1 WHERE i <= 9 );
      +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+----------+-------------------------------------+
      | id | select_type | table      | type  | possible_keys | key         | key_len | ref       | rows | filtered | Extra                               |
      +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+----------+-------------------------------------+
      |  1 | PRIMARY     | t1         | index | i             | i           | 5       | NULL      |    2 |   100.00 | Using where; Using index; LooseScan |
      |  1 | PRIMARY     | <derived2> | ref   | <auto_key0>   | <auto_key0> | 5       | test.t1.i |    2 |   100.00 | NULL                                |
      |  2 | DERIVED     | t1         | ALL   | NULL          | NULL        | NULL    | NULL      |    2 |   100.00 | NULL                                |
      +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+----------+-------------------------------------+
      3 rows in set, 1 warning (0.00 sec)
      
      MySQL [test]> 
      MySQL [test]> SHOW WARNINGS;
      +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Level | Code | Message                                                                                                                                                                                                                                                        |
      +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Note  | 1003 | /* select#1 */ select `alias`.`pk` AS `pk`,`alias`.`i` AS `i` from (/* select#2 */ select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`i` AS `i` from `test`.`t1`) `alias` semi join (`test`.`t1`) where ((`alias`.`i` = `test`.`t1`.`i`) and (`test`.`t1`.`i` <= 9)) |
      +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.00 sec)
      
      revision-id: psergey@askmonty.org-20130327181232-q2i2big0pvr38uad
      revno: 4819
      branch-nick: mysql-5.6-leveldb
      

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            The problem is not repeatable on MyISAM or InnoDB, because they return different statistics. If I take cost data that is returned by ha_leveldb and make ha_innobase return it, the bug is repeatable with innodb.

            Show
            psergey Sergei Petrunia added a comment - The problem is not repeatable on MyISAM or InnoDB, because they return different statistics. If I take cost data that is returned by ha_leveldb and make ha_innobase return it, the bug is repeatable with innodb.
            Hide
            psergey Sergei Petrunia added a comment -

            The problem is in JOIN::set_access_methods(), in this piece of code:

            if (!keyuse)

            { tab->type= JT_ALL; if (tableno > const_tables) full_join= true; }

            else if (tab->position->sj_strategy == SJ_OPT_LOOSE_SCAN)

            { DBUG_ASSERT(tab->keys.is_set(tab->position->loosescan_key)); tab->type= JT_ALL; // @todo is this consistent for a LooseScan table ? tab->index= tab->position->loosescan_key; }

            When it is executed for the second time (i.e. for the main query, not for the
            subquery), on the first iteration (tableno==0) we have:

            tab->quick->index=1 // correct, index `i`.
            keyuse==NULL // correct, no ref access is used
            tab->index= 0 // incorrect. I guess this is a leftover from JOIN_TABs
            // being bzero'ed after they are allocated.

            When keyuse!=NULL, this function does the assigment:

            tab->index= tab->position->loosescan_key;

            and then the execution part (sub_select() in particular) uses that information
            to execute LooseScan.

            If I force the assignment of

            tab->index= tab->position->loosescan_key;

            then the wrong result disappears. (The only remaining question is whether this
            function should be obliged to do

            if (tab->position->sj_strategy == SJ_OPT_LOOSE_SCAN && tab->quick)
            tab->index= tab->quick->index

            or is it LooseScan code in sub_select() that should use tab->quick->index
            instead of tab->index when quick select is used.

            Show
            psergey Sergei Petrunia added a comment - The problem is in JOIN::set_access_methods(), in this piece of code: if (!keyuse) { tab->type= JT_ALL; if (tableno > const_tables) full_join= true; } else if (tab->position->sj_strategy == SJ_OPT_LOOSE_SCAN) { DBUG_ASSERT(tab->keys.is_set(tab->position->loosescan_key)); tab->type= JT_ALL; // @todo is this consistent for a LooseScan table ? tab->index= tab->position->loosescan_key; } When it is executed for the second time (i.e. for the main query, not for the subquery), on the first iteration (tableno==0) we have: tab->quick->index=1 // correct, index `i`. keyuse==NULL // correct, no ref access is used tab->index= 0 // incorrect. I guess this is a leftover from JOIN_TABs // being bzero'ed after they are allocated. When keyuse!=NULL, this function does the assigment: tab->index= tab->position->loosescan_key; and then the execution part (sub_select() in particular) uses that information to execute LooseScan. If I force the assignment of tab->index= tab->position->loosescan_key; then the wrong result disappears. (The only remaining question is whether this function should be obliged to do if (tab->position->sj_strategy == SJ_OPT_LOOSE_SCAN && tab->quick) tab->index= tab->quick->index or is it LooseScan code in sub_select() that should use tab->quick->index instead of tab->index when quick select is used.

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 3 hours
                  3h