Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7149

Constant propagation erroneously applied for LIKE

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.1, 10.0, 5.5
    • Fix Version/s: 10.0.16
    • Component/s: Optimizer
    • Labels:
      None

      Description

      SET NAMES utf8 COLLATE utf8_german2_ci;
      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_german2_ci);
      INSERT INTO t1 VALUES ('ae'),('ä');
      SELECT * FROM t1 WHERE a='ae';
      SELECT * FROM t1 WHERE a LIKE 'ä';
      

      returns these results:

      +------+
      | a    |
      +------+
      | ae   |
      | ä    |
      +------+
      2 rows in set (0.01 sec)
      
      +------+
      | a    |
      +------+
      | ä    |
      +------+
      1 row in set (0.00 sec)
      

      This is correct. Equality works taking into account contractions and expansions and returns both rows, while LIKE is performed one-character-to-one-character, so only one record matches.

      Now if I join both conditions with AND in a single WHERE, I expect one row with 'ä' to be returned, as it matches both conditions:

      SELECT * FROM t1 WHERE a='ae' AND a LIKE 'ä';
      

      But in fact I get an empty set.

      EXPLAIN EXTENDED returns "Impossible where", which is not correct:

      mysql> EXPLAIN EXTENDED SELECT * FROM t1 WHERE a='ae' AND a LIKE 'ä';
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra            |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------+
      |    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Impossible WHERE |
      +------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------+
      

      After tracing the code, it seems that constant propagation was erroneously applied.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            bar Alexander Barkov added a comment -

            The same effect is observed using trailing spaces instead of a contraction:

            SET NAMES utf8 COLLATE utf8_unicode_ci;
            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci);
            INSERT INTO t1 VALUES ('a '),('a');
            SELECT * FROM t1 WHERE a='a';
            SELECT * FROM t1 WHERE a LIKE 'a ';
            SELECT * FROM t1 WHERE a='a' AND a LIKE 'a ';
            

            The first query correctly returns 2 rows.
            The second query correctly returns 1 row.
            The third query returns no rows, which is wrong.

            Show
            bar Alexander Barkov added a comment - The same effect is observed using trailing spaces instead of a contraction: SET NAMES utf8 COLLATE utf8_unicode_ci; DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a VARCHAR (10) CHARACTER SET utf8 COLLATE utf8_unicode_ci); INSERT INTO t1 VALUES ('a '),('a'); SELECT * FROM t1 WHERE a='a'; SELECT * FROM t1 WHERE a LIKE 'a '; SELECT * FROM t1 WHERE a='a' AND a LIKE 'a '; The first query correctly returns 2 rows. The second query correctly returns 1 row. The third query returns no rows, which is wrong.
            Hide
            bar Alexander Barkov added a comment -

            it's also reproducible even with simple collations:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET latin1);
            INSERT INTO t1 VALUES ('a'),('a ');
            SET NAMES latin1;
            SELECT * FROM t1 WHERE CONCAT(a)='a';
            SELECT * FROM t1 WHERE CONCAT(a) LIKE 'a ';
            SELECT * FROM t1 WHERE CONCAT(a)='a' AND CONCAT(a) LIKE 'a ';
            

            The first query correctly returns two rows.
            The second query correctly returns one rows.
            The third query returns no rows (while one row is expected).

            Show
            bar Alexander Barkov added a comment - it's also reproducible even with simple collations: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a VARCHAR (10) CHARACTER SET latin1); INSERT INTO t1 VALUES ('a'),('a '); SET NAMES latin1; SELECT * FROM t1 WHERE CONCAT(a)='a'; SELECT * FROM t1 WHERE CONCAT(a) LIKE 'a '; SELECT * FROM t1 WHERE CONCAT(a)='a' AND CONCAT(a) LIKE 'a '; The first query correctly returns two rows. The second query correctly returns one rows. The third query returns no rows (while one row is expected).
            Hide
            bar Alexander Barkov added a comment -

            It's also reproducible if the field is on the right side of LIKE:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET latin1);
            INSERT INTO t1 VALUES ('a'),('a ');
            SET NAMES latin1;
            SELECT * FROM t1 WHERE 'a'=CONCAT(a);
            SELECT * FROM t1 WHERE 'a ' LIKE CONCAT(a);
            SELECT * FROM t1 WHERE 'a'=CONCAT(a) AND 'a ' LIKE CONCAT(a);
            

            The first query correctly returns two rows.
            The second query correctly returns one rows.
            The third query returns no rows (while one row is expected).

            Show
            bar Alexander Barkov added a comment - It's also reproducible if the field is on the right side of LIKE: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a VARCHAR (10) CHARACTER SET latin1); INSERT INTO t1 VALUES ('a'),('a '); SET NAMES latin1; SELECT * FROM t1 WHERE 'a'=CONCAT(a); SELECT * FROM t1 WHERE 'a ' LIKE CONCAT(a); SELECT * FROM t1 WHERE 'a'=CONCAT(a) AND 'a ' LIKE CONCAT(a); The first query correctly returns two rows. The second query correctly returns one rows. The third query returns no rows (while one row is expected).
            Hide
            bar Alexander Barkov added a comment - - edited

            More examples:

            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET latin1);
            INSERT INTO t1 VALUES ('%'),('% ');
            SET NAMES latin1;
            SELECT * FROM t1 WHERE '% '=CONCAT(a);
            SELECT * FROM t1 WHERE 'a' LIKE CONCAT(a);
            SELECT * FROM t1 WHERE '% '=CONCAT(a) AND 'a' LIKE CONCAT(a);
            

            The first query correctly returns two rows.
            The second query correctly returns one rows.
            The third query returns no rows (while one row is expected).

            Show
            bar Alexander Barkov added a comment - - edited More examples: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a VARCHAR (10) CHARACTER SET latin1); INSERT INTO t1 VALUES ('%'),('% '); SET NAMES latin1; SELECT * FROM t1 WHERE '% '=CONCAT(a); SELECT * FROM t1 WHERE 'a' LIKE CONCAT(a); SELECT * FROM t1 WHERE '% '=CONCAT(a) AND 'a' LIKE CONCAT(a); The first query correctly returns two rows. The second query correctly returns one rows. The third query returns no rows (while one row is expected).
            Hide
            bar Alexander Barkov added a comment -

            This example also demonstrates a wrong behaviour, but different from the previous ones:
            The last query returns two rows:

            • unlike no rows in the previous examples
            • and instead of one row, which would be a correct result
            DROP TABLE IF EXISTS t1;
            CREATE TABLE t1 (a VARCHAR(10) CHARACTER SET latin1);
            INSERT INTO t1 VALUES ('%'),('% ');
            SET NAMES latin1;
            SELECT * FROM t1 WHERE '%'=CONCAT(a);
            SELECT * FROM t1 WHERE 'a' LIKE CONCAT(a);
            SELECT * FROM t1 WHERE '%'=CONCAT(a) AND 'a' LIKE CONCAT(a);
            

            The first query correctly returns two rows.
            The second query correctly returns one rows.
            The third query returns two rows (while one row is expected).

            Show
            bar Alexander Barkov added a comment - This example also demonstrates a wrong behaviour, but different from the previous ones: The last query returns two rows: unlike no rows in the previous examples and instead of one row, which would be a correct result DROP TABLE IF EXISTS t1; CREATE TABLE t1 (a VARCHAR (10) CHARACTER SET latin1); INSERT INTO t1 VALUES ('%'),('% '); SET NAMES latin1; SELECT * FROM t1 WHERE '%'=CONCAT(a); SELECT * FROM t1 WHERE 'a' LIKE CONCAT(a); SELECT * FROM t1 WHERE '%'=CONCAT(a) AND 'a' LIKE CONCAT(a); The first query correctly returns two rows. The second query correctly returns one rows. The third query returns two rows (while one row is expected).

              People

              • Assignee:
                bar Alexander Barkov
                Reporter:
                bar Alexander Barkov
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: