Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-2165

LP:934342 - Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query

      SELECT *
      FROM t1 RIGHT JOIN t2 ON ( a = c )
      WHERE ( b, c ) IN
      ( SELECT b, c FROM t2 )

      on the test data returns wrong result (less rows and in one row a NULL instead of a value) with join_cache_level>2, and correct result otherwise.

      bzr version-info
      revision-id: <email address hidden>
      date: 2012-02-16 20:15:57 +0400
      build-date: 2012-02-17 21:10:34 +0400
      revno: 3424

      EXPLAIN with join_cache_level=3 (wrong result):

      id select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00
      1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00
      1 PRIMARY t1 hash_index a #hash#a:a 4:4 test.t2.c 4 50.00 Using where; Using join buffer (flat, BNLH join)
      2 MATERIALIZED t2 ALL b NULL NULL NULL 3 100.00
      Warnings:
      Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t2` semi join (`test`.`t2`) left join `test`.`t1` on(((`test`.`t1`.`a` = `test`.`t2`.`c`) and (`test`.`t2`.`c` is not null))) where 1

      EXPLAIN with join_cache_level=2 (correct result):

      d select_type table type possible_keys key key_len ref rows filtered Extra
      1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00
      1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00
      1 PRIMARY t1 ref a a 4 test.t2.c 2 100.00 Using where; Using index
      2 MATERIALIZED t2 ALL b NULL NULL NULL 3 100.00
      Warnings:
      Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t2` semi join (`test`.`t2`) left join `test`.`t1` on(((`test`.`t1`.`a` = `test`.`t2`.`c`) and (`test`.`t2`.`c` is not null))) where 1

      Minimal optimizer_switch: materialization=on,semijoin=on,join_cache_hashed=on
      (and join_cache_level>=3)
      Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      Test case:

      SET optimizer_switch = 'materialization=on,semijoin=on,join_cache_hashed=on';

      1. join_cache_level >= 3
        SET SESSION join_cache_level = 3;

      CREATE TABLE t1 ( a VARCHAR(1), KEY(a) );
      INSERT INTO t1 VALUES ('v'),('v'),('s'),('j');

      CREATE TABLE t2 ( b VARCHAR(1), c VARCHAR(1), KEY(b) );
      INSERT INTO t2 VALUES ('v','v'),('w','w'),('t','t');

      SELECT *
      FROM t1 RIGHT JOIN t2 ON ( a = c )
      WHERE ( b, c ) IN
      ( SELECT b, c FROM t2 );

      1. End of test case
      1. Expected result:
      2. a b c
      3. -----------------
      4. v v v
      5. v v v
      6. NULL w w
      7. NULL t t
      1. Actual result:
      2. a b c
      3. -----------------
      4. NULL v v
      5. NULL w w
      6. NULL t t

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            igor Igor Babaev added a comment -

            Re: Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery
            The bug is reproducible with LEFT JOIN as well.

            With Elena's settings and
            set join_cache_level = 6
            I get the following execution plan:

            MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t2 LEFT JOIN t1 ON ( a = c ) WHERE ( b, c ) IN ( SELECT b, c FROM t2 t );
            --------------------------------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            --------------------------------------------------------------------------------------------------------------

            1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00  
            1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00  
            1 PRIMARY t1 ref a a 4 test.t.c 2 100.00 Using where; Using index
            2 MATERIALIZED t ALL b NULL NULL NULL 3 100.00  

            --------------------------------------------------------------------------------------------------------------

            I get a wrong result set when executing by this plan.

            If I use
            set join_cache_level = 0;
            I get the execution plan:

            MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t2 LEFT JOIN t1 ON ( a = c ) WHERE ( b, c ) IN ( SELECT b, c FROM t2 t );
            --------------------------------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            --------------------------------------------------------------------------------------------------------------

            1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00  
            1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00  
            1 PRIMARY t1 ref a a 4 test.t2.c 2 100.00 Using where; Using index
            2 MATERIALIZED t ALL b NULL NULL NULL 3 100.00  

            --------------------------------------------------------------------------------------------------------------
            4 rows in set, 1 warning (0.00 sec)

            I get the right result set when executing by this plan.

            The plans differ only in line 3:
            the first plan uses test.t.c to build the key to access table t1,
            while the second plan uses test.t2.c for this purpose.

            Show
            igor Igor Babaev added a comment - Re: Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery The bug is reproducible with LEFT JOIN as well. With Elena's settings and set join_cache_level = 6 I get the following execution plan: MariaDB [test] > EXPLAIN EXTENDED SELECT * FROM t2 LEFT JOIN t1 ON ( a = c ) WHERE ( b, c ) IN ( SELECT b, c FROM t2 t ); --- ------------ ----------- ------ ------------- ------------ ------- --------- ---- -------- ------------------------- id select_type table type possible_keys key key_len ref rows filtered Extra --- ------------ ----------- ------ ------------- ------------ ------- --------- ---- -------- ------------------------- 1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00   1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00   1 PRIMARY t1 ref a a 4 test.t.c 2 100.00 Using where; Using index 2 MATERIALIZED t ALL b NULL NULL NULL 3 100.00   --- ------------ ----------- ------ ------------- ------------ ------- --------- ---- -------- ------------------------- I get a wrong result set when executing by this plan. If I use set join_cache_level = 0; I get the execution plan: MariaDB [test] > EXPLAIN EXTENDED SELECT * FROM t2 LEFT JOIN t1 ON ( a = c ) WHERE ( b, c ) IN ( SELECT b, c FROM t2 t ); --- ------------ ----------- ------ ------------- ------------ ------- --------- ---- -------- ------------------------- id select_type table type possible_keys key key_len ref rows filtered Extra --- ------------ ----------- ------ ------------- ------------ ------- --------- ---- -------- ------------------------- 1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00   1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00   1 PRIMARY t1 ref a a 4 test.t2.c 2 100.00 Using where; Using index 2 MATERIALIZED t ALL b NULL NULL NULL 3 100.00   --- ------------ ----------- ------ ------------- ------------ ------- --------- ---- -------- ------------------------- 4 rows in set, 1 warning (0.00 sec) I get the right result set when executing by this plan. The plans differ only in line 3: the first plan uses test.t.c to build the key to access table t1, while the second plan uses test.t2.c for this purpose.
            Hide
            igor Igor Babaev added a comment -

            Re: Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery
            The first plan shows that the key is copied from table t, rather then from the materialized table. The last record in table t is ('t','t'). So the key to access table t1 always is built over this record. This explains why we have a wrong result for the first plan.

            Show
            igor Igor Babaev added a comment - Re: Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery The first plan shows that the key is copied from table t, rather then from the materialized table. The last record in table t is ('t','t'). So the key to access table t1 always is built over this record. This explains why we have a wrong result for the first plan.
            Hide
            elenst Elena Stepanova added a comment -

            Re: Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery
            Fix released with 5.3.5-ga.

            Show
            elenst Elena Stepanova added a comment - Re: Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery Fix released with 5.3.5-ga.
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 934342

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 934342

              People

              • Assignee:
                igor Igor Babaev
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: