Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6634

Wrong estimates for ref(const) and key IS NULL predicate

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5.39, 10.0.13
    • Fix Version/s: 10.1.1
    • Component/s: None
    • Labels:

      Description

      Create the dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
      
      create table t1p (                                                                                                      
        pk int primary key,                                                                                                             
        removed enum('true') DEFAULT NULL,                                                                                              
        key(removed)                                                                                                                    
      ) engine=innodb pack_keys=1;                                                                                                      
      
      # This inserts 10M records:
      insert into t1p
       select 
         A.a + 1000*B.a+1000*1000*C.a, 
         IF(A.a+1000*B.a > 100, NULL, 'true')
       from one_k A, one_k B,ten C;
      

      Let's explore the dataset

      MariaDB [test2]> explain select count(*) from t1p force index(removed) where removed is null;
      +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+
      | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows    | Extra                    |
      +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+
      |    1 | SIMPLE      | t1p   | ref  | removed       | removed | 2       | const | 5003505 | Using where; Using index |
      +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+
      

      Here, it underestimates the amount of NULLs (by about two times). This is not a problem, yet.

      MariaDB [test2]> explain select count(*) from t1p force index(removed) where removed is not null;
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
      | id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
      |    1 | SIMPLE      | t1p   | range | removed       | removed | 2       | NULL | 1009 | Using where; Using index |
      +------+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
      

      This is close to reality.

      MariaDB [test2]> explain
          -> select * from ten left join t1p on ten.a=3 and t1p.removed is null;
      +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
      | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra                    |
      +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
      |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL    | NULL    | NULL  |   10 |                          |
      |    1 | SIMPLE      | t1p   | ref  | removed       | removed | 2       | const |   10 | Using where; Using index |
      +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
      

      Now, here the scan on "removed IS NULL" produces 10 rows, even if we saw above that the optimizer knows it will produce about 1K rows.

      This might be not a problem for this particular query, but it may cause the optimizer not to pick a good query plan.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              psergey Sergei Petrunia added a comment -

              On MariaDB 10.0:

              MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null;
              +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+
              | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows    | Extra                    |
              +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+
              |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL    | NULL    | NULL  |      10 |                          |
              |    1 | SIMPLE      | t1p   | ref  | removed       | removed | 2       | const | 1189654 | Using where; Using index |
              +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+
              

              That is, the problem I can observe on MariaDB 5.5 cannot be seen on MariaDB 10.0.

              Show
              psergey Sergei Petrunia added a comment - On MariaDB 10.0: MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null; +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | t1p | ref | removed | removed | 2 | const | 1189654 | Using where; Using index | +------+-------------+-------+------+---------------+---------+---------+-------+---------+--------------------------+ That is, the problem I can observe on MariaDB 5.5 cannot be seen on MariaDB 10.0.
              Hide
              psergey Sergei Petrunia added a comment -

              More details about what happens on 5.5:

              Range optimizer is actually unable to make use of condition "t1p.removed is null". This is because of this (incorrect) old code in opt_range.cc, get_mm_leaf():

                  if (field->table->maybe_null)		// Can't use a key on this
                    goto end;
              

              Then, best_access_path() find that
              1. this is ref(const) // i.e. found_ref==FALSE
              2. However, table->quick_keys.is_set(key)==FALSE ( no possible quick select).

              which causes this formula to be used:

                              /* quick_range couldn't use key! */
                              records= (double) s->records/rec;
              

              where rec has the value from the start of the function:

                  rec= s->records/MATCHING_ROWS_IN_OTHER_TABLE;  // Assumed records/key
              

              and we get records = s->records / (s->records / MATCHING_ROWS_IN_OTHER_TABLE)= MATCHING_ROWS_IN_OTHER_TABLE= 10.

              which is a really bad estimate (and note that it is always equal to 10. This is even worse than using an imprecise estimate from rec_per_key).

              Show
              psergey Sergei Petrunia added a comment - More details about what happens on 5.5: Range optimizer is actually unable to make use of condition "t1p.removed is null". This is because of this (incorrect) old code in opt_range.cc, get_mm_leaf(): if (field->table->maybe_null) // Can't use a key on this goto end; Then, best_access_path() find that 1. this is ref(const) // i.e. found_ref==FALSE 2. However, table->quick_keys.is_set(key)==FALSE ( no possible quick select). which causes this formula to be used: /* quick_range couldn't use key! */ records= (double) s->records/rec; where rec has the value from the start of the function: rec= s->records/MATCHING_ROWS_IN_OTHER_TABLE; // Assumed records/key and we get records = s->records / (s->records / MATCHING_ROWS_IN_OTHER_TABLE)= MATCHING_ROWS_IN_OTHER_TABLE= 10. which is a really bad estimate (and note that it is always equal to 10. This is even worse than using an imprecise estimate from rec_per_key).
              Hide
              psergey Sergei Petrunia added a comment -

              The problem is not seen on MariaDB 10.0, because 10.0 has extended_keys enabled by default. When extended keys is ON, execution in best_access_path() takes a different branch.

              If I disable extended_keys in 10.0.13, I get the same bad estimate:

              MariaDB [j100]> set optimizer_switch='extended_keys=off';
              Query OK, 0 rows affected (0.00 sec)
              
              MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null;
              +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
              | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra                    |
              +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
              |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL    | NULL    | NULL  |   10 |                          |
              |    1 | SIMPLE      | t1p   | ref  | removed       | removed | 2       | const |   10 | Using where; Using index |
              +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
              

              Debugging 10.0, I see that the problematic code in opt_range.cc, get_mm_leaf() is still there.

              Show
              psergey Sergei Petrunia added a comment - The problem is not seen on MariaDB 10.0, because 10.0 has extended_keys enabled by default. When extended keys is ON, execution in best_access_path() takes a different branch. If I disable extended_keys in 10.0.13, I get the same bad estimate: MariaDB [j100]> set optimizer_switch='extended_keys=off'; Query OK, 0 rows affected (0.00 sec) MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null; +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | t1p | ref | removed | removed | 2 | const | 10 | Using where; Using index | +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+ Debugging 10.0, I see that the problematic code in opt_range.cc, get_mm_leaf() is still there.
              Hide
              psergey Sergei Petrunia added a comment -

              In 5.5, it's other way around. if I turn extended_keys to ON, the problem goes away:

              MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null;
              +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
              | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra                    |
              +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
              |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL    | NULL    | NULL  |   10 |                          |
              |    1 | SIMPLE      | t1p   | ref  | removed       | removed | 2       | const |   10 | Using where; Using index |
              +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
              
              MariaDB [j100]> set optimizer_switch='extended_keys=on';
              
              MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null;
              +------+-------------+-------+------+---------------+---------+---------+-------+--------+--------------------------+
              | id   | select_type | table | type | possible_keys | key     | key_len | ref   | rows   | Extra                    |
              +------+-------------+-------+------+---------------+---------+---------+-------+--------+--------------------------+
              |    1 | SIMPLE      | ten   | ALL  | NULL          | NULL    | NULL    | NULL  |     10 |                          |
              |    1 | SIMPLE      | t1p   | ref  | removed       | removed | 2       | const | 555739 | Using where; Using index |
              +------+-------------+-------+------+---------------+---------+---------+-------+--------+--------------------------+
              
              Show
              psergey Sergei Petrunia added a comment - In 5.5, it's other way around. if I turn extended_keys to ON, the problem goes away: MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null; +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | t1p | ref | removed | removed | 2 | const | 10 | Using where; Using index | +------+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+ MariaDB [j100]> set optimizer_switch='extended_keys=on'; MariaDB [j100]> explain select * from ten left join t1p on ten.a=3 and t1p.removed is null; +------+-------------+-------+------+---------------+---------+---------+-------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+---------+---------+-------+--------+--------------------------+ | 1 | SIMPLE | ten | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | t1p | ref | removed | removed | 2 | const | 555739 | Using where; Using index | +------+-------------+-------+------+---------------+---------+---------+-------+--------+--------------------------+
              Hide
              psergey Sergei Petrunia added a comment -

              Pushed a patch to 10.1 tree. I think, we could discuss back-porting it to some earlier releases.

              Show
              psergey Sergei Petrunia added a comment - Pushed a patch to 10.1 tree. I think, we could discuss back-porting it to some earlier releases.

                People

                • Assignee:
                  psergey Sergei Petrunia
                  Reporter:
                  psergey Sergei Petrunia
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  2 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: