Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6818

Possibly wrong result with a query comparing a date field to a char constant.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.1.1
    • Fix Version/s: 10.1
    • Component/s: Optimizer

      Description

      After the commit on 10.1 tree, the provided test case started producing a different result.

      commit 8bd4716272ef16a4bcd3196ba62f249aa3878998
      Merge: 26e048f f8f8a59
      Author: Sergei Petrunia <psergey@askmonty.org>
      Date:   Tue Sep 9 13:05:28 2014 +0400
      
          Merge ../10.1-orderby-fixes into 10.1
      

      I'm not quite sure which result is correct, but I want to make sure that the change was intentional.

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a INT, b INT, d DATE, KEY(d));
      INSERT INTO t1 VALUES (1,7,'1900-01-01'),(4,0,'0000-00-00');
      SELECT * FROM t1 WHERE d = 'c' OR b >= a;
      DROP TABLE t1;
      
      Old result
      a	b	d
      1	7	1900-01-01
      4	0	0000-00-00
      
      Old EXPLAIN
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	t1	ALL	d	NULL	NULL	NULL	2	100.00	Using where
      Warnings:
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`d` AS `d` from `test`.`t1` where ((`test`.`t1`.`d` = 'c') or (`test`.`t1`.`b` >= `test`.`t1`.`a`))
      
      New result
      a	b	d
      1	7	1900-01-01
      
      New EXPLAIN
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	t1	ALL	d	NULL	NULL	NULL	2	100.00	Using where
      Warnings:
      Note	1003	select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`d` AS `d` from `test`.`t1` where ((`test`.`t1`.`b` >= `test`.`t1`.`a`))
      DROP TABLE t1;
      

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment - - edited

            So, the missing row is:

            a	b	d
            4	0	0000-00-00
            
            • "b>=a" is false for this row
            • d='c' is interesting.

            d ids defined as DATE, and then invalid date is compared with string literal 'c'.

            Show
            psergey Sergei Petrunia added a comment - - edited So, the missing row is: a b d 4 0 0000-00-00 "b>=a" is false for this row d='c' is interesting. d ids defined as DATE, and then invalid date is compared with string literal 'c'.
            Hide
            psergey Sergei Petrunia added a comment -

            Let's check how they compare without indexes:

            select *,  d = 'c', b >= a from t1;
            +------+------+------------+---------+--------+
            | a    | b    | d          | d = 'c' | b >= a |
            +------+------+------------+---------+--------+
            |    1 |    7 | 1900-01-01 |       0 |      1 |
            |    4 |    0 | 0000-00-00 |       1 |      0 |
            +------+------+------------+---------+--------+
            

            So, '0000-00-00' = 'c'.

            Show
            psergey Sergei Petrunia added a comment - Let's check how they compare without indexes: select *, d = 'c', b >= a from t1; +------+------+------------+---------+--------+ | a | b | d | d = 'c' | b >= a | +------+------+------------+---------+--------+ | 1 | 7 | 1900-01-01 | 0 | 1 | | 4 | 0 | 0000-00-00 | 1 | 0 | +------+------+------------+---------+--------+ So, '0000-00-00' = 'c'.
            Hide
            psergey Sergei Petrunia added a comment - - edited

            However, range optimizer has a different idea. Even in the old code (before the 8bd4716272ef16a4bcd3196ba62f249aa3878998 commit), it thinkd that d='c' can never be true:

            MariaDB [test]> SELECT * FROM t1 WHERE d = 'c';
            Empty set (0.00 sec)
            
            MariaDB [test]> explain SELECT * FROM t1 WHERE d = 'c';
            +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
            | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                               |
            +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
            |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE noticed after reading const tables |
            +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
            

            and this does cause result mismatch:

            MariaDB [test]> SELECT * FROM t1 use index () WHERE d = 'c';
            +------+------+------------+
            | a    | b    | d          |
            +------+------+------------+
            |    4 |    0 | 0000-00-00 |
            +------+------+------------+
            
            Show
            psergey Sergei Petrunia added a comment - - edited However, range optimizer has a different idea. Even in the old code (before the 8bd4716272ef16a4bcd3196ba62f249aa3878998 commit), it thinkd that d='c' can never be true: MariaDB [test]> SELECT * FROM t1 WHERE d = 'c'; Empty set (0.00 sec) MariaDB [test]> explain SELECT * FROM t1 WHERE d = 'c'; +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ | 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables | +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ and this does cause result mismatch: MariaDB [test]> SELECT * FROM t1 use index () WHERE d = 'c'; +------+------+------------+ | a | b | d | +------+------+------------+ | 4 | 0 | 0000-00-00 | +------+------+------------+
            Hide
            psergey Sergei Petrunia added a comment -

            The query

            SELECT * FROM t1 WHERE d = 'c' OR b >= a;
            

            used to return the row with 0000-00-00, because "OR b>=a" prevented use of range optimizer.

            But then, we've got "MDEV-6480: Remove conditions for which range optimizer returned SEL_ARG::IMPOSSIBLE". The new code sees range optimizer's judgement that d='c' can never be true, and removes it from the WHERE. This causes the difference in query result (although it was possible to construct examples with inconsistent query results before, too).

            The solution is to reconcile range optimizer logic with item evaluation logic.

            Show
            psergey Sergei Petrunia added a comment - The query SELECT * FROM t1 WHERE d = 'c' OR b >= a; used to return the row with 0000-00-00, because "OR b>=a" prevented use of range optimizer. But then, we've got " MDEV-6480 : Remove conditions for which range optimizer returned SEL_ARG::IMPOSSIBLE". The new code sees range optimizer's judgement that d='c' can never be true, and removes it from the WHERE. This causes the difference in query result (although it was possible to construct examples with inconsistent query results before, too). The solution is to reconcile range optimizer logic with item evaluation logic.
            Hide
            psergey Sergei Petrunia added a comment -

            Test result from mysql-5.6.20 and mysql-5.6.21:

            MySQL [test]> SELECT * FROM t1 WHERE d = 'c' OR b >= a;
            +------+------+------------+
            | a    | b    | d          |
            +------+------+------------+
            |    1 |    7 | 1900-01-01 |
            +------+------+------------+
            

            It's the same as the new one that we've got. It seems, they have fixed the comparison to not return true:

            MySQL [test]> SELECT * FROM t1 use index () WHERE d = 'c';
            Empty set, 2 warnings (0.00 sec)
            

            Debugging the range optimizer, I can see that it produces SEL_TREE(NULL), unlike MariaDB which produces IMPOSSIBLE. This is actually weird - if item comparison was changed to evaluate to FALSE, why change range optimizer to not return IMPOSSIBLE?

            Show
            psergey Sergei Petrunia added a comment - Test result from mysql-5.6.20 and mysql-5.6.21: MySQL [test]> SELECT * FROM t1 WHERE d = 'c' OR b >= a; +------+------+------------+ | a | b | d | +------+------+------------+ | 1 | 7 | 1900-01-01 | +------+------+------------+ It's the same as the new one that we've got. It seems, they have fixed the comparison to not return true: MySQL [test]> SELECT * FROM t1 use index () WHERE d = 'c'; Empty set, 2 warnings (0.00 sec) Debugging the range optimizer, I can see that it produces SEL_TREE(NULL), unlike MariaDB which produces IMPOSSIBLE. This is actually weird - if item comparison was changed to evaluate to FALSE, why change range optimizer to not return IMPOSSIBLE?

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: