Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-2722

LP:951937 - Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query

      SELECT * FROM v
      WHERE ( a, a ) IN (
      SELECT alias2.b, alias2.a
      FROM t1 AS alias1, t1 AS alias2
      WHERE alias2.b = alias1.a
      AND ( alias1.b >= alias1.a OR alias2.b = 'z' )

      on the test data returns 6 rows if it's executed with semijoin=on and materialization=on, and 19 rows otherwise. The latter is correct.
      On a variation of the test data, where column b is nullable, the query returns no rows at all. This variation is added to the test case as ALTER TABLE followed by the same query.

      bzr version-info
      revision-id: <email address hidden>
      date: 2012-03-05 22:33:46 -0800
      build-date: 2012-03-11 05:27:06 +0400
      revno: 3455

      Also reproducible on 5.5 (revno 3316).
      Not reproducible on MySQL 5.6 (revno 3706).

      EXPLAIN with semijoin=on, materialization=on (wrong result):

      id select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 19 100.00
      1 PRIMARY <derived3> ALL NULL NULL NULL NULL 19 100.00 Using where; Using join buffer (flat, BNL join)
      2 MATERIALIZED alias1 ALL a NULL NULL NULL 19 100.00 Using where
      2 MATERIALIZED alias2 ref a a 4 test.alias1.a 1 100.00 Using where
      3 DERIVED t1 ALL NULL NULL NULL NULL 19 100.00
      Warnings:
      Note 1003 select `v`.`a` AS `a`,`v`.`b` AS `b` from `test`.`v` semi join (`test`.`t1` `alias1` join `test`.`t1` `alias2`) where ((`test`.`alias2`.`a` = `test`.`alias1`.`a`) and (`test`.`alias2`.`b` = `test`.`alias1`.`a`) and (`v`.`a` = `test`.`alias1`.`a`) and ((`test`.`alias1`.`b` >= `test`.`alias1`.`a`) or ((`test`.`alias1`.`a` = 'z') and (`v`.`a` = 'z'))))

      Minimal optimizer_switch: materialization=on,semijoin=on
      Full optimizer_switch (default): index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      Test case (with the variation):

      --source include/have_innodb.inc
      SET SESSION optimizer_switch = 'materialization=on,semijoin=on';

      CREATE TABLE t1 (
      a VARCHAR(1),
      b VARCHAR(1) NOT NULL,
      KEY(a)
      ) ENGINE=InnoDB;
      INSERT INTO t1 VALUES
      ('j','j'),('v','v'),('c','c'),('m','m'),('d','d'),
      ('y','y'),('t','t'),('d','d'),('s','s'),('r','r'),
      ('m','m'),('b','b'),('x','x'),('g','g'),('p','p'),
      ('q','q'),('w','w'),('d','d'),('e','e');

      CREATE ALGORITHM=TEMPTABLE VIEW v AS SELECT * FROM t1;

      1. This query returns 6 rows instead of 19

      SELECT * FROM v
      WHERE ( a, a ) IN (
      SELECT alias2.b, alias2.a
      FROM t1 AS alias1, t1 AS alias2
      WHERE alias2.b = alias1.a
      AND ( alias1.b >= alias1.a OR alias2.b = 'z' )
      );

      1. End of the main part.
      2. The rest is the test case variation, where we make column b nullable

      ALTER TABLE t1 MODIFY COLUMN b VARCHAR(1);

      1. This query returns an empty set

      SELECT * FROM v
      WHERE ( a, a ) IN (
      SELECT alias2.b, alias2.a
      FROM t1 AS alias1, t1 AS alias2
      WHERE alias2.b = alias1.a
      AND ( alias1.b >= alias1.a OR alias2.b = 'z' )
      );

      1. End of test case

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off
            Before the equality substitutiuon, the WHERE clause has this form (both in maria-5.3 and mysql-5.6):

            "(((t1a.b >= t1a.a) or multiple equal('z', t1b.b, t1a.a, v.a, t1b.a)) and multiple equal(t1b.b, t1a.a, v.a, t1b.a))"

            the multiple-equal inside the OR clause is the same as the outer one, except that it also includes constant 'z'.

            After equality substitution for the inner multiple equal('z', t1b.b, t1a.a, v.a, t1b.a) , we get

            In MariaDB:
            $537 = 0x8b9f560 "((t1a.a = 'z') and (v.a = 'z'))"

            In MySQL:
            $117 = 0x8f7bea0 "((t1a.a = 'z') and (t1b.b = 'z') and (t1b.a = 'z'))"

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off Before the equality substitutiuon, the WHERE clause has this form (both in maria-5.3 and mysql-5.6): "(((t1a.b >= t1a.a) or multiple equal('z', t1b.b, t1a.a, v.a, t1b.a)) and multiple equal(t1b.b, t1a.a, v.a, t1b.a))" the multiple-equal inside the OR clause is the same as the outer one, except that it also includes constant 'z'. After equality substitution for the inner multiple equal('z', t1b.b, t1a.a, v.a, t1b.a) , we get In MariaDB: $537 = 0x8b9f560 "((t1a.a = 'z') and (v.a = 'z'))" In MySQL: $117 = 0x8f7bea0 "((t1a.a = 'z') and (t1b.b = 'z') and (t1b.a = 'z'))"
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off
            The problem is related to fix for BUG#928048.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off The problem is related to fix for BUG#928048.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off
            Consider an example:

            select * from ot
            where (ot.a, ot.b) in (select it.a, it.b
            from it
            where (it.a=111 and func1(it.c)) or (it.b=222 and func2(it.c))
            )
            and (ot.a=333 or func3(ot.c))

            convert_subq_to_sj(), followed by equality propagation, will produce this WHERE
            clause:

            01 multi-equal(ot.a, it.a) and
            02 multi-equal(ot.b, it.b) and
            03 (multi-equal(inherited(ot.a, it.a), 111) and func1(it.c) or
            04 multi-equal(inherited(ot.b, it.b), 222) and func2(it.c)
            05 ) and
            06 (multi-equal(inherited(ot.a, it.a), 333) or func3(ot.c))

            Suppose the chosen join order is
            ot, sj-materialization(it)
            or the other way around, but using materialization.

            Now, we do equality substitution:

            • in line 03, we must not generate equalities that include ot.a, because the
              other part of OR uses func1(it.c) can only be checked when doing
              materialization.
            • in line 06, we must not generate qualities that include it.a, because the
              other part of uses "func3(ot.c)", which cannot be checked when doing
              materialization.

            This shows the problem. Suppose we're doing equality substitution for an OR
            clause and its first child element is:

            multi-equal(inherited(ot.a, it.a), 111)

            is this a case like in #03, or like in #06? There is no way to tell.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off Consider an example: select * from ot where (ot.a, ot.b) in (select it.a, it.b from it where (it.a=111 and func1(it.c)) or (it.b=222 and func2(it.c)) ) and (ot.a=333 or func3(ot.c)) convert_subq_to_sj(), followed by equality propagation, will produce this WHERE clause: 01 multi-equal(ot.a, it.a) and 02 multi-equal(ot.b, it.b) and 03 (multi-equal(inherited(ot.a, it.a), 111) and func1(it.c) or 04 multi-equal(inherited(ot.b, it.b), 222) and func2(it.c) 05 ) and 06 (multi-equal(inherited(ot.a, it.a), 333) or func3(ot.c)) Suppose the chosen join order is ot, sj-materialization(it) or the other way around, but using materialization. Now, we do equality substitution: in line 03, we must not generate equalities that include ot.a, because the other part of OR uses func1(it.c) can only be checked when doing materialization. in line 06, we must not generate qualities that include it.a, because the other part of uses "func3(ot.c)", which cannot be checked when doing materialization. This shows the problem. Suppose we're doing equality substitution for an OR clause and its first child element is: multi-equal(inherited(ot.a, it.a), 111) is this a case like in #03, or like in #06? There is no way to tell.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off
            Patch committed, need review.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result (missing rows) with semijoin+materialization, IN subquery, InnoDB, TEMPTABLE view or derived_merge=off Patch committed, need review.
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 951937

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 951937

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: