Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5917

EITS: different order of predicates in IN (...) causes different estimates

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.9
    • Fix Version/s: 10.0.10
    • Component/s: None
    • Labels:

      Description

      Create the dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
      create table t1 (col1 int);
      insert into t1 select a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      insert into t1 select 2000+a from one_k;
      set use_stat_tables = 'preferably';
      set optimizer_use_condition_selectivity=3;
      set optimizer_use_condition_selectivity=4;
      set histogram_size=100;
      analyze table t1 persistent for all; 
      

      Then run:

      mysql> explain extended select * from t1 where col1 in (2990, 10);
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 11000 |     0.03 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      

      And then run:

      mysql> explain extended select * from t1 where col1 in (10,2990);
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 11000 |     0.25 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+-------+----------+-------------+
      

      Note that IN predicate is essentailly the same, but "filtered" changes.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            If I set a breakpoint at Field::pos_in_interval_val_real (...)

            It hits here:

              #0  Field::pos_in_interval_val_real (...)
              #1  0x000000000083ac5d in Field_num::pos_in_interval (...)
              #2  0x00000000006f0dd0 in get_column_range_cardinality (...)
              #3  0x000000000095e40f in records_in_column_ranges (...)
              #4  0x000000000095e82d in calculate_cond_selectivity_for_table (...)
              #5  0x000000000069485c in make_join_statistics (...)
              #6  0x000000000068b57f in JOIN::optimize_inner (...)
              #7  0x000000000068a4ee in JOIN::optimize (...)
            

            When I'm running this query:
            explain extended select * from t1 where col1 in (10,2990);

            I observe

              Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
            (gdb) p val_real()
              $177 = 2990
            (gdb) c
              Continuing.
              
              Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
            (gdb) p val_real()
              $178 = 2990
            

            On the other hand, if I run the query
            explain extended select * from t1 where col1 in (2990, 10);
            I see

              Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
            (gdb) p val_real()
              $180 = 10
            (gdb) c
              Continuing.
              
              Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155
            (gdb) p val_real()
              $181 = 10
            
            Show
            psergey Sergei Petrunia added a comment - If I set a breakpoint at Field::pos_in_interval_val_real (...) It hits here: #0 Field::pos_in_interval_val_real (...) #1 0x000000000083ac5d in Field_num::pos_in_interval (...) #2 0x00000000006f0dd0 in get_column_range_cardinality (...) #3 0x000000000095e40f in records_in_column_ranges (...) #4 0x000000000095e82d in calculate_cond_selectivity_for_table (...) #5 0x000000000069485c in make_join_statistics (...) #6 0x000000000068b57f in JOIN::optimize_inner (...) #7 0x000000000068a4ee in JOIN::optimize (...) When I'm running this query: explain extended select * from t1 where col1 in (10,2990); I observe Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $177 = 2990 (gdb) c Continuing. Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $178 = 2990 On the other hand, if I run the query explain extended select * from t1 where col1 in (2990, 10); I see Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $180 = 10 (gdb) c Continuing. Breakpoint 6, Field::pos_in_interval_val_real (this=0x7fffd0890ca8, min=0x7fffd0891258, max=0x7fffd0891318) at /home/psergey/dev2/10.0-cp/sql/field.cc:1155 (gdb) p val_real() $181 = 10
            Hide
            psergey Sergei Petrunia added a comment -

            Apparently, records_in_column_ranges() walks over ranges but doesn't save their values into table->record[0]. We always calculate Field_num::pos_in_interval() for whatever was the last value considered by get_mm_tree().

            Show
            psergey Sergei Petrunia added a comment - Apparently, records_in_column_ranges() walks over ranges but doesn't save their values into table->record [0] . We always calculate Field_num::pos_in_interval() for whatever was the last value considered by get_mm_tree().

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                psergey Sergei Petrunia
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: