Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5149

using index for group-by slower than using index scan if many unique values in index

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 5.5.33
    • Fix Version/s: 5.5
    • Component/s: None
    • Labels:
    • Environment:
      CentOS 6.4

      Description

      MariaDB [_]> SHOW CREATE TABLE t1;
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table | Create Table                                                                                                                                                     |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | t1    | CREATE TABLE `t1` (
        `a` bigint(20) unsigned NOT NULL,
        `b` bigint(20) DEFAULT NULL,
        PRIMARY KEY (`a`),
        KEY `b` (`b`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.00 sec)
      
      MariaDB [_]> SELECT COUNT(*), COUNT(DISTINCT b) FROM t1;
      +----------+-------------------+
      | COUNT(*) | COUNT(DISTINCT b) |
      +----------+-------------------+
      |  2267612 |           1974979 |
      +----------+-------------------+
      1 row in set (1.64 sec)
      
      MariaDB [_]> SELECT COUNT(DISTINCT b) FROM t1;
      +-------------------+
      | COUNT(DISTINCT b) |
      +-------------------+
      |           1974979 |
      +-------------------+
      1 row in set (3.73 sec)
      
      MariaDB [_]> EXPLAIN SELECT COUNT(DISTINCT b) FROM t1;
      +------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                    |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
      |    1 | SIMPLE      | t1    | range | NULL          | b    | 9       | NULL | 2267951 | Using index for group-by |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
      1 row in set (0.00 sec)
      
      MariaDB [_]> SELECT COUNT(DISTINCT b+1) FROM t1;
      +---------------------+
      | COUNT(DISTINCT b+1) |
      +---------------------+
      |             1974979 |
      +---------------------+
      1 row in set (1.68 sec)
      
      MariaDB [_]> EXPLAIN SELECT COUNT(DISTINCT b+1) FROM t1;
      +------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
      |    1 | SIMPLE      | t1    | index | NULL          | b    | 9       | NULL | 2267950 | Using index |
      +------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
      1 row in set (0.00 sec)
      

      If the number of unique values is high, it seems a full index scan is faster than one using group-by.
      Maybe if the cardinality is high, using index for group-by shouldn't be used if the cpu cost is high to do the looser scan.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment -

            Assigned to Timour (for analysis and commenting) arbitrarily, please reassign if needed.

            Show
            elenst Elena Stepanova added a comment - Assigned to Timour (for analysis and commenting) arbitrarily, please reassign if needed.

              People

              • Assignee:
                Unassigned
                Reporter:
                flupps Tobias Asplund
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: