Details
Description
create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (key1 int, col1 int, filler1 varchar(100), filler varchar(100), key(key1));
insert into t1
select A.a + B.a* 10 + C.a * 100,
A.a + B.a* 10 + C.a * 100,
'filler1-data', 'filler2-data'
from
ten A, ten B, ten C;
set histogram_size=100;
set use_stat_tables='preferably';
set optimizer_use_condition_selectivity=4;
analyze table t1 persistent for all;
Let's check the data distribution. First, we have a non-indexed condition with
50% selectivity.
explain extended select * from t1 where col1 < 500; +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | 50.50 | Using where | +------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
Second, "key1<10" produces a range access which selects 10/1000= 1/100th of the
table:
explain extended select * from t1 where key1<10 ; +------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | t1 | range | key1 | key1 | 5 | NULL | 9 | 100.00 | Using index condition | +------+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
Now, let's use both conditions:
explain extended select * from t1 where key1<10 and col1< 500; +------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+ | 1 | SIMPLE | t1 | range | key1 | key1 | 5 | NULL | 9 | 50.50 | Using index condition; Using where | +------+-------------+-------+-------+---------------+------+---------+------+------+----------+------------------------------------+
EXPLAIN looks ok. However, in debugger, one can see:
Breakpoint 31, matching_candidates_in_table (... (gdb) fini Run till exit from #0 matching_candidates_in_table (... Value returned is $279 = 4.5445544554455441
4.5 records is 9 records * 0.5 selectivity. Ok so far.
Breakpoint 32, table_cond_selectivity (... (gdb) fini 0x000000000069e43e in best_extension_by_limited_search (... Value returned is $281 = 0.50495049504950495 (gdb) list 7597 double partial_join_cardinality= current_record_count * 7598 pushdown_cond_selectivity; (gdb) print partial_join_cardinality $283 = 2.294775022056661
Ooops. Selectivity=0.5 has been applied for the second time, and now we've got
9 * 0.5 * 0.5 = 2.25 for estimate.
Let's check if it matters..
create table t2 as select * from t1; explain extended select * from t1,t2 where t1.key1<10 and t1.col1< 500;
For the last query, put a breakpoint in best_access_path().
Breakpoint 33, best_access_path (join=0x7fffca5a4088, s=0x7fffca5a5038, remaining_tables=3, idx=0, disable_jbuf=false, record_count=1, pos=0x7fffca5a58b0, loose_scan_pos=0x7ffff7e9fac0) at /home/psergey/dev2/10.0/sql/sql_select.cc:5642 (gdb) p s->table->alias.Ptr $288 = 0x7fffca4215f0 "t1" (gdb) c Continuing. Breakpoint 33, best_access_path (join=0x7fffca5a4088, s=0x7fffca5a5360, remaining_tables=2, idx=1, disable_jbuf=false, record_count=2.294775022056661, pos=0x7fffca5a59b8, loose_scan_pos=0x7ffff7e9f8b0) at /home/psergey/dev2/10.0/sql/sql_select.cc:5642 (gdb) p s->table->alias.Ptr $289 = 0x7fffca5a7270 "t2"
See: table t2 has got record_count=2.29. Half of what should be.
Gliffy Diagrams
Attachments
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
So, matching_candidates_in_table() takes into account that
selectivity(t1.col1<500)=0.5. This is correct.
The wrong action is that table_cond_selectivity() returns 1.
The problem is in table_cond_selectivity(). It should return 1.