Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3016

LP:934348 - Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query

      SELECT a FROM t1
      WHERE a IN (
      SELECT b FROM t1, t2
      WHERE b = a
      )
      GROUP BY a
      HAVING a != 'z'

      on the test data returns only one row if it's run with materialization, semijoin, join_cache_hashed and join_cache_level>2, and two rows otherwise. The latter is correct.

      bzr version-info
      revision-id: <email address hidden>
      date: 2012-02-16 20:15:57 +0400
      build-date: 2012-02-17 21:19:42 +0400
      revno: 3424

      EXPLAIN with join_cache_level = 3 (wrong result):

      id select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort
      1 PRIMARY t1 hash_index a #hash#a:a 4:4 test.t1.a 3 66.67 Using where; Using join buffer (flat, BNLH join)
      2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 100.00 Using where
      2 MATERIALIZED t1 hash_index a #hash#a:a 4:4 test.t2.b 3 66.67 Using where; Using join buffer (flat, BNLH join)
      Warnings:
      Note 1003 select `test`.`t1`.`a` AS `a` from `test`.`t1` semi join (`test`.`t1` join `test`.`t2`) where ((`test`.`t1`.`a` = `test`.`t2`.`b`) and (`test`.`t1`.`a` = `test`.`t2`.`b`)) group by `test`.`t1`.`a` having (`test`.`t1`.`a` <> 'z')

      EXPLAIN with join_cache_level =2 (correct result):

      id select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort
      1 PRIMARY t1 ref a a 4 test.t2.b 2 100.00 Using index
      2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 100.00 Using where
      2 MATERIALIZED t1 ref a a 4 test.t2.b 2 100.00 Using index
      Warnings:
      Note 1003 select `test`.`t1`.`a` AS `a` from `test`.`t1` semi join (`test`.`t1` join `test`.`t2`) where ((`test`.`t1`.`a` = `test`.`t2`.`b`) and (`test`.`t1`.`a` = `test`.`t2`.`b`)) group by `test`.`t1`.`a` having (`test`.`t1`.`a` <> 'z')

      Minimal optimizer_switch: materialization=on,semijoin=on,join_cache_hashed=on
      (and join_cache_level>=3)
      Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      1. Test case:

      SET optimizer_switch = 'materialization=on,semijoin=on,join_cache_hashed=on';

      1. join_cache_level >= 3
        SET join_cache_level = 3;
      1. MyISAM or Aria
        CREATE TABLE t1 ( a VARCHAR(1), KEY(a) ) ENGINE=MyISAM;
        INSERT INTO t1 VALUES ('c'),('v'),('c');
        CREATE TABLE t2 ( b VARCHAR(1) ) ENGINE=MyISAM;
        INSERT INTO t2 VALUES ('v'),('c');

      SELECT a FROM t1
      WHERE a IN (
      SELECT b FROM t1, t2
      WHERE b = a
      )
      GROUP BY a
      HAVING a != 'z';

      1. End of test case
      1. Expected result:
      2. a
      3. c
      4. v
      1. Actual result:
      2. a
      3. c
      1. There were 6 other similar cases during the test run.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            igor Igor Babaev added a comment -

            Re: Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING
            The problem is seen already in the execution plan:

            MariaDB [test]> EXPLAIN SELECT a FROM t1 t WHERE t.a IN ( SELECT b FROM t1, t2 WHERE b = a ) GROUP BY t.a HAVING t.a != 'z';
            -----------------------------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            -----------------------------------------------------------------------------------------------------------

            1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort
            1 PRIMARY t ref a a 4 test.t1.a 2 100.00 Using where; Using index
            2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 100.00 Using where
            2 MATERIALIZED t1 ref a a 4 test.t2.b 2 100.00 Using where; Using index

            ---------------------------------------------------------------------------+------------------

            The key to access t is built oveer column a of the materialized table. Yet there is no such column there.

            Show
            igor Igor Babaev added a comment - Re: Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING The problem is seen already in the execution plan: MariaDB [test] > EXPLAIN SELECT a FROM t1 t WHERE t.a IN ( SELECT b FROM t1, t2 WHERE b = a ) GROUP BY t.a HAVING t.a != 'z'; --- ------------ ----------- ---- ------------- ---- ------- --------- ---- -------- -------------------------------- id select_type table type possible_keys key key_len ref rows filtered Extra --- ------------ ----------- ---- ------------- ---- ------- --------- ---- -------- -------------------------------- 1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort 1 PRIMARY t ref a a 4 test.t1.a 2 100.00 Using where; Using index 2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 100.00 Using where 2 MATERIALIZED t1 ref a a 4 test.t2.b 2 100.00 Using where; Using index --- ------------ ----------- ---- ------------- ---- ------- --------- ---- -------- + ------------------ The key to access t is built oveer column a of the materialized table. Yet there is no such column there.
            Hide
            igor Igor Babaev added a comment -

            Re: Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING
            The previous plan was obtained with the following settings:

            SET optimizer_switch = 'materialization=on,semijoin=on,join_cache_hashed=on';
            SET join_cache_level = 6;

            Show
            igor Igor Babaev added a comment - Re: Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING The previous plan was obtained with the following settings: SET optimizer_switch = 'materialization=on,semijoin=on,join_cache_hashed=on'; SET join_cache_level = 6;
            Hide
            elenst Elena Stepanova added a comment -

            Re: Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING
            Fix released with 5.3.5-ga.

            Show
            elenst Elena Stepanova added a comment - Re: Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING Fix released with 5.3.5-ga.
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 934348

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 934348

              People

              • Assignee:
                igor Igor Babaev
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: