Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3073

LP:912513 - Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query

      SELECT alias1.* FROM
      t1 AS alias1 INNER JOIN t1 AS alias2
      ON alias2.a = alias1.b
      WHERE alias1.b IN (
      SELECT a FROM t1, t2
      )

      on the test data returns a single row with join_cache_level>2, and 7 rows with join_cache_level<=2. The latter is the correct result.

      bzr version-info
      revision-id: <email address hidden>
      date: 2012-01-02 20:06:36 -0800
      build-date: 2012-01-05 23:33:41 +0400
      revno: 3376
      branch-nick: maria-5.3

      Reproducible on 5.3.2 and 5.3.3 releases, current 5.3, 5.5.

      EXPLAIN with join_cache_level=3 (wrong result)

      id select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY alias1 ALL NULL NULL NULL NULL 7 100.00
      1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 5 func 1 100.00
      1 PRIMARY alias2 hash_index a #hash#a:a 5:5 test.t1.a 7 28.57 Using where; Using join buffer (flat, BNLH join)
      2 MATERIALIZED t1 index a a 5 NULL 7 100.00 Using where; Using index
      2 MATERIALIZED t2 ALL NULL NULL NULL NULL 7 100.00
      Warnings:
      Note 1003 select `test`.`alias1`.`a` AS `a`,`test`.`alias1`.`b` AS `b` from `test`.`t1` `alias1` semi join (`test`.`t1` join `test`.`t2`) join `test`.`t1` `alias2` where (`test`.`alias2`.`a` = `test`.`alias1`.`b`)
      SELECT alias1.* FROM
      t1 AS alias1 INNER JOIN t1 AS alias2
      ON alias2.a = alias1.b
      WHERE alias1.b IN (
      SELECT a FROM t1, t2
      )

      EXPLAIN with join_cache_level=2 (correct result):

      id select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY alias1 ALL NULL NULL NULL NULL 7 100.00 Using where
      1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 5 func 1 100.00
      1 PRIMARY alias2 ref a a 5 test.alias1.b 2 100.00 Using index
      2 MATERIALIZED t1 index a a 5 NULL 7 100.00 Using index
      2 MATERIALIZED t2 ALL NULL NULL NULL NULL 7 100.00
      Warnings:
      Note 1003 select `test`.`alias1`.`a` AS `a`,`test`.`alias1`.`b` AS `b` from `test`.`t1` `alias1` semi join (`test`.`t1` join `test`.`t2`) join `test`.`t1` `alias2` where (`test`.`alias2`.`a` = `test`.`alias1`.`b`)
      SELECT alias1.* FROM
      t1 AS alias1 INNER JOIN t1 AS alias2
      ON alias2.a = alias1.b
      WHERE alias1.b IN (
      SELECT a FROM t1, t2
      )

      Minimal optimizer_switch: semijoin=on,materialization=on,join_cache_hashed=on
      Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      Test case:

      SET optimizer_switch = 'semijoin=on,materialization=on,join_cache_hashed=on';
      SET join_cache_level = 3;

      CREATE TABLE t1 ( a INT, b INT, KEY(a) );
      INSERT INTO t1 VALUES
      (1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7);

      CREATE TABLE t2 ( c INT );
      INSERT INTO t2 VALUES (1),(2),(3),(4),(5),(6),(7);

      SELECT alias1.* FROM
      t1 AS alias1 INNER JOIN t1 AS alias2
      ON alias2.a = alias1.b
      WHERE alias1.b IN (
      SELECT a FROM t1, t2
      );

      Expected result:
      a b
      1 1
      2 2
      3 3
      4 4
      5 5
      6 6
      7 7

      Result:
      a b
      7 7

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment -

            Re: Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery
            This bug has also been filed in JIRA as MDEV-71.

            Show
            elenst Elena Stepanova added a comment - Re: Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery This bug has also been filed in JIRA as MDEV-71 .
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery
            It's interesting that the reported query with INNER JOIN ... ON

            SELECT alias1.* FROM t1 AS alias1 INNER JOIN t1 AS alias2 ON alias2.a = alias1.b WHERE alias1.b IN (SELECT a FROM t1, t2);

            produces a correct result, while the query with WHERE:

            SELECT alias1.* FROM t1 AS alias1, t1 AS alias2 WHERE alias1.b=alias2.a AND alias1.b IN ( SELECT a FROM t1, t2);

            produces a wrong result.

            EXPLAINs are the same.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery It's interesting that the reported query with INNER JOIN ... ON SELECT alias1.* FROM t1 AS alias1 INNER JOIN t1 AS alias2 ON alias2.a = alias1.b WHERE alias1.b IN (SELECT a FROM t1, t2); produces a correct result, while the query with WHERE: SELECT alias1.* FROM t1 AS alias1, t1 AS alias2 WHERE alias1.b=alias2.a AND alias1.b IN ( SELECT a FROM t1, t2); produces a wrong result. EXPLAINs are the same.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery
            .. and EXPLAIN EXTENDED are the same, too
            : select j3.alias1.a AS a,j3.alias1.b AS b from j3.t1 alias1 semi join (j3.t1 join j3.t2) join j3.t1 alias2 where (j3.alias2.a = j3.alias1.b)
            : select j3.alias1.a AS a,j3.alias1.b AS b from j3.t1 alias1 semi join (j3.t1 join j3.t2) join j3.t1 alias2 where (j3.alias2.a = j3.alias1.b)

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result (missing rows) with join_cache_hashed+materialization+semijoin=on, join_cache_level > 2, JOIN, IN subquery .. and EXPLAIN EXTENDED are the same, too : select j3.alias1.a AS a,j3.alias1.b AS b from j3.t1 alias1 semi join (j3.t1 join j3.t2) join j3.t1 alias2 where (j3.alias2.a = j3.alias1.b) : select j3.alias1.a AS a,j3.alias1.b AS b from j3.t1 alias1 semi join (j3.t1 join j3.t2) join j3.t1 alias2 where (j3.alias2.a = j3.alias1.b)
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 912513

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 912513

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: