Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7982

EXPLAIN shows meaningless query plan with Distinct

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.0.17, 5.3.13, 10.1, 5.5
    • Fix Version/s: 10.1, 10.0
    • Component/s: Optimizer
    • Labels:

      Description

      Take the attached dataset and run this query:

      explain  select distinct count(*) from t2 where a between 10 and 20 group by a;
      +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                               |
      +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+
      |    1 | SIMPLE      | t2    | range | a             | a    | 5       | NULL |   10 | Using where; Using index; Distinct; Using temporary; Using filesort |
      +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+
      

      and see an apparently non-sensical output (how can a table use "Distinct" if it
      is the only table in the join?)

      This is repeatable on 10.0.17, didn't check 5.5

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            Debugging on 10.0, looking at this line of code in JOIN::save_explain_data_intern:

            	if (distinct & test_all_bits(used_tables,
                                                 join->select_list_used_tables))
            

            select_list_used_tables==0 when we come here... which is weird, JOIN::optimize_inner does call JOIN::eval_select_list_used_tables which should have put a correct value there.

            in eval_select_list_used_tables we have:

            (gdb) p item
              $11 = (Item *) 0x7fffa4006618
            (gdb) p dbug_print_item( item)
              $12 = 0x1842700 "count(0)"
            (gdb) p item->used_tables()
              $13 = 0
            

            One can debate what (count)->used_tables() should return, but this is not the point.

            The point is, if we run

              SELECT DISTINCT COUNT(*) 
            

            then it is obvious that Distinct optimization is not applicable.
            Short-circuiting logic applies when running

              SELECT DISTINCT func(table_column)
              

            but not when one runs

            SELECT DISTINCT aggregate_func.
            

            (unless maybe some cases with Loose Scan).

            Show
            psergey Sergei Petrunia added a comment - Debugging on 10.0, looking at this line of code in JOIN::save_explain_data_intern: if (distinct & test_all_bits(used_tables, join->select_list_used_tables)) select_list_used_tables==0 when we come here... which is weird, JOIN::optimize_inner does call JOIN::eval_select_list_used_tables which should have put a correct value there. in eval_select_list_used_tables we have: (gdb) p item $11 = (Item *) 0x7fffa4006618 (gdb) p dbug_print_item( item) $12 = 0x1842700 "count(0)" (gdb) p item->used_tables() $13 = 0 One can debate what (count )->used_tables() should return, but this is not the point. The point is, if we run SELECT DISTINCT COUNT(*) then it is obvious that Distinct optimization is not applicable. Short-circuiting logic applies when running SELECT DISTINCT func(table_column) but not when one runs SELECT DISTINCT aggregate_func. (unless maybe some cases with Loose Scan).
            Hide
            elenst Elena Stepanova added a comment -

            On 5.3 and 5.5, it looks like this:

            elenst@wheezy-64:~/git/5.5$ ml test -e "explain  select distinct count(*) from t2 where a between 10 and 20 group by a;"
            +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+
            | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                               |
            +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+
            |    1 | SIMPLE      | t2    | range | a             | a    | 5       | NULL |   10 | Using where; Using index; Using temporary; Using filesort; Distinct |
            +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+
            
            Show
            elenst Elena Stepanova added a comment - On 5.3 and 5.5, it looks like this: elenst@wheezy-64:~/git/5.5$ ml test -e "explain select distinct count(*) from t2 where a between 10 and 20 group by a;" +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+ | 1 | SIMPLE | t2 | range | a | a | 5 | NULL | 10 | Using where ; Using index; Using temporary; Using filesort; Distinct | +------+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------------------------------------+

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                psergey Sergei Petrunia
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated: