Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3335

LP:849776 - Wrong result with semijoin + "Impossible where"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query:

      select * from t5 where (a) in (
      SELECT t1.a FROM t1
      LEFT JOIN t2 ON ( t1.a = t2.a ) WHERE t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ));

      returns no rows when executed with semijoin, even though the correct result must be "8", since table t5 contains the result from the subquery.

      mysql 5.5 also wrongly returns an empty result, though the plan there is different. Postgresql confirms that the correct result is "8".

      explain:

      1 PRIMARY NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables
      3 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables

      minimal optimizer_switch: semijoin=on
      full optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      bzr version-info:

      revision-id: <email address hidden>
      date: 2011-09-10 18:01:27 +0300
      build-date: 2011-09-14 11:28:19 +0300
      revno: 3183
      branch-nick: maria-5.3

      test case:

      CREATE TABLE t1 ( b varchar(1), a integer) ;
      INSERT INTO t1 VALUES ('z',8);

      CREATE TABLE t2 ( a integer, b varchar(1)) ;

      CREATE TABLE t4 ( a integer, b varchar(1)) ;

      CREATE TABLE t5 ( a integer) ;
      INSERT INTO t5 VALUES (8);

      set session optimizer_switch='semijoin=on';
      select * from t5 where (a) in (
      SELECT t1.a FROM t1
      LEFT JOIN t2 ON ( t1.a = t2.a ) WHERE t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ));

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with semijoin + "Impossible where"
            The wrong result is produced because outer join is converted into inner, which is wrong. The rewritten subquery looks like this:

            Message: select 8 AS `a` from `bug849776`.`t5` semi join (`bug849776`.`t1` join `bug849776`.`t2`) where 0

            the outer-to-inner conversion is made by simplify_joins(). I don't yet understand why this happens.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with semijoin + "Impossible where" The wrong result is produced because outer join is converted into inner, which is wrong. The rewritten subquery looks like this: Message: select 8 AS `a` from `bug849776`.`t5` semi join (`bug849776`.`t1` join `bug849776`.`t2`) where 0 the outer-to-inner conversion is made by simplify_joins(). I don't yet understand why this happens.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with semijoin + "Impossible where"
            (gdb) wher 4
            #0 simplify_joins (join=0xa679d00, join_list=0xa65f188, conds=0xa679000, top=true, in_sj=true) at sql_select.cc:11892
            #1 0x0835d5e8 in simplify_joins (join=0xa679d00, join_list=0xa678ce8, conds=0xa679000, top=true, in_sj=true) at sql_select.cc:11857
            #2 0x0835d5e8 in simplify_joins (join=0xa679d00, join_list=0xa5fdd2c, conds=0xa679000, top=true, in_sj=false) at sql_select.cc:11857
            #3 0x08371073 in JOIN::optimize (this=0xa679d00) at sql_select.cc:960
            (More stack frames follow...)

            (gdb) p (Item*)conds
            $63 = (Item_cond_and *) 0xa679000
            (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(0))
            $64 = 0x8b78000 "1"
            (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(1))
            $65 = 0x8b78000 "(not(<in_optimizer>(`bug849776`.`t2`.`b`,<exists>(select `bug849776`.`t4`.`b` from `bug849776`.`t4` where (`bug849776`.`t4`.`b` < `bug849776`.`t1`.`b`)))))"
            (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(2))
            $66 = 0x8b78000 "(`bug849776`.`t5`.`a` = `bug849776`.`t1`.`a`)"
            (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(3))
            $67 = 0x88d9fc4 "(Item*)NULL"
            (gdb) p/t conds->used_tables()
            $68 = 111

            That is, the condition conds has form:

            1 AND t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ) AND t5.a=t1.a

            not_null_tables() attribute seems to be wrong:

            (gdb) p/t conds->not_null_tables()
            $69 = 111

              1. All three tables are NOT-NULL, this is what causes the left join be
              2. converted into inner join

            (gdb) p ((Item*)conds)>list.nth_element(0)>not_null_tables()
            $71 = 0
            (gdb) p ((Item*)conds)>list.nth_element(1)>not_null_tables()
            $72 = 0
            (gdb) p ((Item*)conds)>list.nth_element(2)>not_null_tables()
            $73 = 3

            1. attributes of branches are correct. The problem is with Item_cond_and's
            2. attributes.
            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with semijoin + "Impossible where" (gdb) wher 4 #0 simplify_joins (join=0xa679d00, join_list=0xa65f188, conds=0xa679000, top=true, in_sj=true) at sql_select.cc:11892 #1 0x0835d5e8 in simplify_joins (join=0xa679d00, join_list=0xa678ce8, conds=0xa679000, top=true, in_sj=true) at sql_select.cc:11857 #2 0x0835d5e8 in simplify_joins (join=0xa679d00, join_list=0xa5fdd2c, conds=0xa679000, top=true, in_sj=false) at sql_select.cc:11857 #3 0x08371073 in JOIN::optimize (this=0xa679d00) at sql_select.cc:960 (More stack frames follow...) (gdb) p (Item*)conds $63 = (Item_cond_and *) 0xa679000 (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(0)) $64 = 0x8b78000 "1" (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(1)) $65 = 0x8b78000 "(not(<in_optimizer>(`bug849776`.`t2`.`b`,<exists>(select `bug849776`.`t4`.`b` from `bug849776`.`t4` where (`bug849776`.`t4`.`b` < `bug849776`.`t1`.`b`)))))" (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(2)) $66 = 0x8b78000 "(`bug849776`.`t5`.`a` = `bug849776`.`t1`.`a`)" (gdb) p dbug_print_item(((Item*)conds)->list.nth_element(3)) $67 = 0x88d9fc4 "(Item*)NULL" (gdb) p/t conds->used_tables() $68 = 111 That is, the condition conds has form: 1 AND t2.b NOT IN ( SELECT t4.b FROM t4 WHERE t4.b < t1.b ) AND t5.a=t1.a not_null_tables() attribute seems to be wrong: (gdb) p/t conds->not_null_tables() $69 = 111 All three tables are NOT-NULL, this is what causes the left join be converted into inner join (gdb) p ((Item*)conds) >list.nth_element(0) >not_null_tables() $71 = 0 (gdb) p ((Item*)conds) >list.nth_element(1) >not_null_tables() $72 = 0 (gdb) p ((Item*)conds) >list.nth_element(2) >not_null_tables() $73 = 3 attributes of branches are correct. The problem is with Item_cond_and's attributes.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with semijoin + "Impossible where"
            This is where the problem lies:

            (gdb) wher 1
            #0 Item_func::fix_after_pullout (this=0xa678838, new_parent=0xa5fdc68, ref=0xa65fc90) at item_func.cc:262
            (More stack frames follow...)
            (gdb) p this
            $188 = (Item_in_optimizer *) 0xa678838
            (gdb) list
            257 (*arg)->fix_after_pullout(new_parent, arg);
            258 Item *item= *arg;
            259
            260 used_tables_cache|= item->used_tables();
            261 not_null_tables_cache|= item->not_null_tables();
            262 const_item_cache&= item->const_item();
            263 }
            264 }
            265 }
            266
            (gdb) p *arg
            $189 = (Item_field *) 0xa65f200

            Item_in_optimizer inherits fix_after_pullout() from Item_func::fix_after_pullout. The latter function assumes that NOT-null attributes of its arguments should be spread to itself, i.e.

            argument->not_null_tables() & BIT ==> this->not_null_tables() & BIT

            which is not true, particularly for IN that's inside NOT.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with semijoin + "Impossible where" This is where the problem lies: (gdb) wher 1 #0 Item_func::fix_after_pullout (this=0xa678838, new_parent=0xa5fdc68, ref=0xa65fc90) at item_func.cc:262 (More stack frames follow...) (gdb) p this $188 = (Item_in_optimizer *) 0xa678838 (gdb) list 257 (*arg)->fix_after_pullout(new_parent, arg); 258 Item *item= *arg; 259 260 used_tables_cache|= item->used_tables(); 261 not_null_tables_cache|= item->not_null_tables(); 262 const_item_cache&= item->const_item(); 263 } 264 } 265 } 266 (gdb) p *arg $189 = (Item_field *) 0xa65f200 Item_in_optimizer inherits fix_after_pullout() from Item_func::fix_after_pullout. The latter function assumes that NOT-null attributes of its arguments should be spread to itself, i.e. argument->not_null_tables() & BIT ==> this->not_null_tables() & BIT which is not true, particularly for IN that's inside NOT.
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 849776

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 849776

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                philipstoev Philip Stoev
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: