  1. MariaDB Server
  2. MDEV-3205

LP:1000269 - Wrong result (extra rows) with semijoin+materialization, IN subqueries, join_cache_level>0

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query
      SELECT * FROM A, B
      WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' )
      AND (
      b1 < 'l' OR
      a1 IN ( SELECT c1 FROM C )
      )

      on the test data produces 6 rows when executed with semijoin+materialization, and 3 rows otherwise. 3 rows is the correct result.

      bzr version-info
      revision-id: <email address hidden>
      date: 2012-05-15 08:31:07 +0300
      revno: 3523

      Also reproducible on maria/5.5 revno 3403.
      With the provided test case the problem is reproducible with MyISAM and InnoDB, but not with Aria (with Aria the plan is slightly different).

      Minimal optimizer_switch: materialization=on,semijoin=on
      Full optimizer_switch (default): index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      EXPLAIN (with the minimal optimizer_switch and join_cache_level=2, which is the current default):

      id  select_type   table        type  possible_keys  key   key_len  ref   rows  filtered  Extra
      1   PRIMARY       A            ALL   NULL           NULL  NULL     NULL  2     100.00
      1   PRIMARY       <subquery2>  ALL   distinct_key   NULL  NULL     NULL  4     100.00
      1   PRIMARY       B            ALL   b1             NULL  NULL     NULL  6     83.33     Using where; Using join buffer (flat, BNL join)
      2   MATERIALIZED  B            ALL   b1             NULL  NULL     NULL  6     66.67     Using where
      3   MATERIALIZED  C            ALL   NULL           NULL  NULL     NULL  2     100.00
      Warnings:
      Note 1003 select `test`.`A`.`a1` AS `a1`,`test`.`A`.`a2` AS `a2`,`test`.`B`.`b1` AS `b1`,`test`.`B`.`b2` AS `b2` from `test`.`A` semi join (`test`.`B`) join `test`.`B` where ((`test`.`B`.`b1` = `test`.`B`.`b2`) and ((`test`.`B`.`b2` < 'l') or <in_optimizer>(`test`.`A`.`a1`,`test`.`A`.`a1` in ( <materialize> (select `test`.`C`.`c1` from `test`.`C` ), <primary_index_lookup>(`test`.`A`.`a1` in <temporary table> on distinct_key where ((`test`.`A`.`a1` = `<subquery3>`.`c1`)))))) and (`test`.`B`.`b1` > 'o'))

      # Test case

      SET optimizer_switch = 'materialization=on,semijoin=on';

      CREATE TABLE A (a1 VARCHAR(1), a2 VARCHAR(1))
      ENGINE=MyISAM;
      INSERT INTO A VALUES ('b','b'),('e','e');

      CREATE TABLE B (b1 VARCHAR(1), b2 VARCHAR(1), KEY(b1))
      ENGINE=MyISAM;
      INSERT INTO B VALUES
      ('v','v'),('s','s'),('l','l'),
      ('y','y'),('c','c'),('i','i');

      CREATE TABLE C (c1 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO C VALUES ('b'),('c');

      SELECT * FROM A, B
      WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' )
      AND (
      b1 < 'l' OR
      a1 IN ( SELECT c1 FROM C )
      );

      # End of test case
      # Expected result:
      # a1 a2 b1 b2
      # ----------------------
      # b b v v
      # b b s s
      # b b y y
      # Actual result:
      # a1 a2 b1 b2
      # ----------------------
      # b b v v
      # e e v v
      # b b s s
      # e e s s
      # b b y y
      # e e y y
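
      As an independent cross-check of the expected result, the test case above can be loaded into a different SQL engine. The sketch below uses Python's built-in sqlite3 module, which is an assumption of this example and not part of the original report. SQLite has no ENGINE clause and does not share MariaDB's semijoin or join buffer code, so it only confirms which rows the query should return; it says nothing about the bug itself.

```python
import sqlite3

# In-memory SQLite database; the ENGINE=MyISAM clauses are dropped
# because SQLite does not support them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (a1 VARCHAR(1), a2 VARCHAR(1));
INSERT INTO A VALUES ('b','b'),('e','e');

CREATE TABLE B (b1 VARCHAR(1), b2 VARCHAR(1));
CREATE INDEX B_b1 ON B (b1);
INSERT INTO B VALUES
('v','v'),('s','s'),('l','l'),
('y','y'),('c','c'),('i','i');

CREATE TABLE C (c1 VARCHAR(1));
INSERT INTO C VALUES ('b'),('c');
""")

# Same query text as in the test case.
QUERY = """
SELECT * FROM A, B
WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' )
AND (
b1 < 'l' OR
a1 IN ( SELECT c1 FROM C )
)
"""

# Sort the rows so the comparison does not depend on scan order.
rows = sorted(conn.execute(QUERY).fetchall())
for row in rows:
    print(" ".join(row))
```

      Under these assumptions the query returns three rows, all with a1 = 'b', which matches the expected result listed above.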


            Activity

            psergey Sergei Petrunia added a comment -

            Re: Wrong result (extra rows) with semijoin+materialization, IN subqueries, join_cache_level>0
            The second subquery is redundant: one can replace it with a list of constants

            SELECT * FROM A, B WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' ) AND ( b1 < 'l' OR a1 IN ('b','c') );

            and the bug is still visible.
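
            To see why the rows with a1 = 'e' are extras, the WHERE clause can be evaluated by hand. The sketch below is plain Python over the test data, not MariaDB code; the names semijoin_values and matches are made up for this illustration. Since the data contains no NULLs, Python's boolean logic agrees with SQL's here.

```python
# Test data copied from the test case.
A = [("b", "b"), ("e", "e")]
B = [("v", "v"), ("s", "s"), ("l", "l"), ("y", "y"), ("c", "c"), ("i", "i")]
C = ["b", "c"]

# Values produced by: SELECT b2 FROM B WHERE b1 > 'o'
semijoin_values = {b2 for (b1, b2) in B if b1 > "o"}

def matches(a1, b1, allowed_a1):
    """Evaluate the WHERE clause for one (A row, B row) pair."""
    return b1 in semijoin_values and (b1 < "l" or a1 in allowed_a1)

def run(allowed_a1):
    """Nested-loop evaluation of SELECT * FROM A, B WHERE ..."""
    return [
        (a1, a2, b1, b2)
        for (a1, a2) in A
        for (b1, b2) in B
        if matches(a1, b1, allowed_a1)
    ]

with_subquery = run(set(C))        # a1 IN ( SELECT c1 FROM C )
with_constants = run({"b", "c"})   # a1 IN ('b','c')

print(sorted(semijoin_values))
print(with_subquery == with_constants)
for row in with_subquery:
    print(" ".join(row))
```

            Every b1 that passes the first IN is greater than 'o', so b1 < 'l' is never true for it and only the a1 condition decides. 'e' is not in ('b','c'), so no row with a1 = 'e' belongs in the result, with either form of the query.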

            elenst Elena Stepanova added a comment -

            Re: Wrong result (extra rows) with semijoin+materialization, IN subqueries, join_cache_level>0
            The fix was released in 5.5.24 and will be in 5.3.8 when it is out.

            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 1000269


              People

              • Assignee: psergey Sergei Petrunia
              • Reporter: elenst Elena Stepanova