Details
Description
Create a dataset:
create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table one_k(a int); insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C; create table one_k2 as select * from one_k; set histogram_size=100; set use_stat_tables='preferably'; set optimizer_use_condition_selectivity=4; analyze table one_k persistent for all; analyze table one_k2 persistent for all;
Let's see what histograms give us
explain extended select * from one_k A where A.a < 40; +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
explain extended select * from one_k2 B where B.a < 100; +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 9.90 | Using where | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
The numbers looks ok.
Now, let's try a join:
explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100; +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+ | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where | | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 9.90 | Using where; Using join buffer (flat, BNL join) | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
Looks ok. Now, let's add a condition.
explain extended select * from one_k A, one_k2 B where A.a < 40 and B.a < 100 and B.a=A.a; +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+ | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1000 | 4.95 | Using where | | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where; Using join buffer (flat, BNL join) | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------+
And filtered% becomes 100%. It used to be 9.90%, we have added another condition into WHERE and now the optimizer expects the condition to be less selective! This looks wrong.
Gliffy Diagrams
Attachments
Issue Links
- relates to
-
MDEV-6325 EITS: wrong selectivity for REF(t1.key=t0.col) AND COND(t0.col)
-
- Closed
-
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
Look at the code near this comment:
If the field f from the table is equal to a field from one the earlier joined tables then the selectivity of the range conditions over the field f must be discounted.Maybe, this is the cause of this bug?