Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-2886

LP:817384 - Wrong result with outer join + subquery in ON clause +unique key in maria-5.3

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Not repeatable in maria-5.2,maria-5.1,mysql-5.5

      Even after the fix for 813447, the following query:

      SELECT t2.b FROM t1
      LEFT JOIN t2
      ON t1.col_int_nokey = t2.a
      AND ( t2.b , t1.b ) IN ( SELECT 'c' , 'd' );

      returns "a" which is wrong since the second part of the ON predicate is FALSE for all rows.

      explain:

      1 PRIMARY t1 system NULL NULL NULL NULL 1  
      1 PRIMARY t2 const PRIMARY PRIMARY 4 const 1  
      2 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL No tables used

      optimizer_switch:

      index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=off,in_to_exists=on,semijoin=off,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

      bzr version-info
      revision-id: <email address hidden>
      date: 2011-07-25 21:52:15 -0700
      build-date: 2011-07-28 11:26:14 +0300
      revno: 3005
      branch-nick: maria-5.2

      test case:

      CREATE TABLE t1 ( c int NOT NULL , b varchar(32) NOT NULL ) ;
      INSERT INTO t1 VALUES (1,'b');

      CREATE TABLE t2 ( a int NOT NULL , b varchar(32) NOT NULL , PRIMARY KEY (a)) ;
      INSERT INTO t2 VALUES (1,'a');

      SELECT t2.b FROM t1
      LEFT JOIN t2
      ON t1.c = t2.a
      AND ( t2.b , t1.b ) IN (SELECT 'c' , 'd' );

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            timour Timour Katchaounov added a comment -

            Re: Wrong result with outer join + subquery in ON clause +unique key in maria-5.3
            Slightly more general example:

            CREATE TABLE t1 ( c int NOT NULL , b char(1) NOT NULL ) ;
            INSERT INTO t1 VALUES (1,'b');

            CREATE TABLE t2 ( a int NOT NULL , b char(1) NOT NULL , PRIMARY KEY (a)) ;
            INSERT INTO t2 VALUES (1,'a');

            create table t3 (c1 char(1), c2 char(2));
            insert into t3 values ('c','d');
            insert into t3 values ('c','d');

            SELECT t2.b
            FROM t1 LEFT JOIN t2 ON t1.c = t2.a AND ( t2.b , t1.b ) IN (SELECT * from t3);

            Show
            timour Timour Katchaounov added a comment - Re: Wrong result with outer join + subquery in ON clause +unique key in maria-5.3 Slightly more general example: CREATE TABLE t1 ( c int NOT NULL , b char(1) NOT NULL ) ; INSERT INTO t1 VALUES (1,'b'); CREATE TABLE t2 ( a int NOT NULL , b char(1) NOT NULL , PRIMARY KEY (a)) ; INSERT INTO t2 VALUES (1,'a'); create table t3 (c1 char(1), c2 char(2)); insert into t3 values ('c','d'); insert into t3 values ('c','d'); SELECT t2.b FROM t1 LEFT JOIN t2 ON t1.c = t2.a AND ( t2.b , t1.b ) IN (SELECT * from t3);
            Hide
            timour Timour Katchaounov added a comment -

            Re: Wrong result with outer join + subquery in ON clause +unique key in maria-5.3
            Analysis:

            The following simple queries demonstrates that the wrong result is a
            consequence of constant optimization:

            This query incorrectly produces a value:
            SELECT t2.b FROM t1 LEFT JOIN t2 ON (t2.b) IN (SELECT c2 from t3) AND t2.a = 1;
            ----------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            ----------------------------------------------------------------------------------------

            1 PRIMARY t1 system NULL NULL NULL NULL 1 100.00  
            1 PRIMARY t2 const PRIMARY PRIMARY 4 const 1 100.00  
            2 DEPENDENT SUBQUERY t3 ALL NULL NULL NULL NULL 2 100.00 Using where

            ----------------------------------------------------------------------------------------

            while this query works correctly:
            SELECT t2.b FROM t1 LEFT JOIN t2 ON (t2.b) IN (SELECT c2 from t3) AND t2.b = 'a';
            ------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            ------------------------------------------------------------------------------------

            1 PRIMARY t1 system NULL NULL NULL NULL 1 100.00  
            1 PRIMARY t2 ALL NULL NULL NULL NULL 1 100.00 Using where
            2 DEPENDENT SUBQUERY t3 ALL NULL NULL NULL NULL 2 100.00 Using where

            ------------------------------------------------------------------------------------

            In the wrong query constant optimization finds that the condition t2.a = 1 can be used
            to access the primary key of table 't2'. As a result both outer table t1,t2 are considered
            as constant when we reach the execution phase. At the same time, during constant
            optimization, the IN predicate is not evaluated because it is expensive.

            When execution of the outer query reaches do_select(), control flow enter the branch:
            if (join->table_count == join->const_tables)

            { ... }

            This branch checks only the WHERE and HAVING clauses, but doesn't check the ON
            clauses of the query. Since the IN predicate was not evaluated during optimization,
            it is not evaluated at all, thus execution doesn't detect that the ON clause is FALSE.

            Consider the second query that works correctly. In this query the second table is
            not constant. As a result do_select() executes the alternative IF branch, where
            it calls sub_select(). This execution path ends up calling evaluate_join_record(),
            which evaluates the ON conditions as well, where the IN predicate is evaluated
            and the query produces the correct result.

            Show
            timour Timour Katchaounov added a comment - Re: Wrong result with outer join + subquery in ON clause +unique key in maria-5.3 Analysis: The following simple queries demonstrates that the wrong result is a consequence of constant optimization: This query incorrectly produces a value: SELECT t2.b FROM t1 LEFT JOIN t2 ON (t2.b) IN (SELECT c2 from t3) AND t2.a = 1; --- ------------------ ----- ------ ------------- ------- ------- ----- ---- -------- ------------ id select_type table type possible_keys key key_len ref rows filtered Extra --- ------------------ ----- ------ ------------- ------- ------- ----- ---- -------- ------------ 1 PRIMARY t1 system NULL NULL NULL NULL 1 100.00   1 PRIMARY t2 const PRIMARY PRIMARY 4 const 1 100.00   2 DEPENDENT SUBQUERY t3 ALL NULL NULL NULL NULL 2 100.00 Using where --- ------------------ ----- ------ ------------- ------- ------- ----- ---- -------- ------------ while this query works correctly: SELECT t2.b FROM t1 LEFT JOIN t2 ON (t2.b) IN (SELECT c2 from t3) AND t2.b = 'a'; --- ------------------ ----- ------ ------------- ---- ------- ---- ---- -------- ------------ id select_type table type possible_keys key key_len ref rows filtered Extra --- ------------------ ----- ------ ------------- ---- ------- ---- ---- -------- ------------ 1 PRIMARY t1 system NULL NULL NULL NULL 1 100.00   1 PRIMARY t2 ALL NULL NULL NULL NULL 1 100.00 Using where 2 DEPENDENT SUBQUERY t3 ALL NULL NULL NULL NULL 2 100.00 Using where --- ------------------ ----- ------ ------------- ---- ------- ---- ---- -------- ------------ In the wrong query constant optimization finds that the condition t2.a = 1 can be used to access the primary key of table 't2'. As a result both outer table t1,t2 are considered as constant when we reach the execution phase. At the same time, during constant optimization, the IN predicate is not evaluated because it is expensive. When execution of the outer query reaches do_select(), control flow enter the branch: if (join->table_count == join->const_tables) { ... } This branch checks only the WHERE and HAVING clauses, but doesn't check the ON clauses of the query. Since the IN predicate was not evaluated during optimization, it is not evaluated at all, thus execution doesn't detect that the ON clause is FALSE. Consider the second query that works correctly. In this query the second table is not constant. As a result do_select() executes the alternative IF branch, where it calls sub_select(). This execution path ends up calling evaluate_join_record(), which evaluates the ON conditions as well, where the IN predicate is evaluated and the query produces the correct result.
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 817384

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 817384

              People

              • Assignee:
                timour Timour Katchaounov
                Reporter:
                philipstoev Philip Stoev
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: