InnoDB uses non-covering index on bool column when estimate is 50% of the table

Description

When there is a tinyint column with 0/1 values and only handful of rows contain one of them, optimizer computes the rows extimate for using index on that column as 50% of the rows instead of 95%+ and moreover it actually uses the index instead of a table scan even when there are additional conditions on a non-indexed column (so PK ref access will be needed anyway).

Selecting the "rare" value gives much better estimate (and using index is the right choice in that case).

I tested it between 15k - 1M rows and did observe it in production on 3.5M table. It seemed to behave correctly (optimizer prefering tablescan) somewhere under 15k rows.

I tested it on 10.0.22 an 10.1.6 (debug build I had at hand). Not using histogram-based stats. set global innodb_stats_traditional=off; did not make a difference.

I reported this to MySQL too as I found 5.6 doing the same (did not test 5.7) - http://bugs.mysql.com/bug.php?id=79271

Testcase attached, my results here:

Environment

Ubuntu 12.04, Centos 6.6

Assignee

Sergei Petrunia

Reporter

Jiri Kavalik

Labels

Fix versions

Affects versions

Priority

Minor
Configure