Details
-
Type:
Bug
-
Status: Closed
-
Priority:
Major
-
Resolution: Fixed
-
Affects Version/s: None
-
Fix Version/s: 5.5.25
-
Component/s: None
-
Labels:None
Description
The following query
SELECT (
SELECT MIN(b) FROM t1, t2
WHERE b = a AND
( b = alias1.b OR
EXISTS ( SELECT * FROM t3 )
)
)
FROM t2 alias1, t1 alias2, t1 alias3;
on my machine takes 80 sec and more on the work tree vs 0.5 sec and less on the main 5.5 tree (tried revno 3413 and revno 3402).
bzr version-info
revision-id: timour@askmonty.org-20120518115201-s6byggvesqxcntkd date: 2012-05-18 14:52:01 +0300 revno: 3404
Reproducible with the default optimizer_switch as well as with all OFF values (except for in_to_exists which is required to execute the query).
Reproducible with MyISAM, Aria, InnoDB.
EXPLAIN on the work tree (with the default optimizer_switch):
id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY alias2 ALL NULL NULL NULL NULL 20 100.00 1 PRIMARY alias3 ALL NULL NULL NULL NULL 20 100.00 Using join buffer (flat, BNL join) 1 PRIMARY alias1 ALL NULL NULL NULL NULL 100 100.00 Using join buffer (incremental, BNL join) 2 DEPENDENT SUBQUERY t1 ALL NULL NULL NULL NULL 20 100.00 2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 100 100.00 Using where; Using join buffer (flat, BNL join) 3 SUBQUERY t3 ALL NULL NULL NULL NULL 2 100.00 Warnings: Note 1276 Field or reference 'test.alias1.b' of SELECT #2 was resolved in SELECT #1 Note 1003 select <expr_cache><>((select min(`test`.`t2`.`b`) from `test`.`t1` join `test`.`t2` where (`test`.`t2`.`b` = `test`.`t1`.`a`))) AS `( SELECT MIN(b) FROM t1, t2 WHERE b = a AND ( b = alias1.b OR EXISTS ( SELECT * FROM t3 ) ) )` from `test`.`t2` `alias1` join `test`.`t1` `alias2` join `test`.`t1` `alias3`
EXPLAIN on the main tree (with the default optimizer_switch):
id select_type table type possible_keys key key_len ref rows filtered Extra 1 PRIMARY alias2 ALL NULL NULL NULL NULL 20 100.00 1 PRIMARY alias3 ALL NULL NULL NULL NULL 20 100.00 Using join buffer (flat, BNL join) 1 PRIMARY alias1 ALL NULL NULL NULL NULL 100 100.00 Using join buffer (incremental, BNL join) 2 DEPENDENT SUBQUERY t1 ALL NULL NULL NULL NULL 20 100.00 Using where 2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 100 100.00 Using where; Using join buffer (flat, BNL join) 3 SUBQUERY t3 ALL NULL NULL NULL NULL 2 100.00 Warnings: Note 1276 Field or reference 'test.alias1.b' of SELECT #2 was resolved in SELECT #1 Note 1003 select <expr_cache><`test`.`alias1`.`b`>((select min(`test`.`t2`.`b`) from `test`.`t1` join `test`.`t2` where ((`test`.`t2`.`b` = `test`.`t1`.`a`) and ((`test`.`t1`.`a` = `test`.`alias1`.`b`) or exists(select 1 from `test`.`t3`))))) AS `( SELECT MIN(b) FROM t1, t2 WHERE b = a AND ( b = alias1.b OR EXISTS ( SELECT * FROM t3 ) ) )` from `test`.`t2` `alias1` join `test`.`t1` `alias2` join `test`.`t1` `alias3`
Test case:
CREATE TABLE t1 (a INT);
INSERT INTO t1 VALUES
(1),(7),(4),(7),(0),(2),(9),(4),(0),(9),
(1),(3),(8),(8),(18),(84),(6),(3),(6),(6);
CREATE TABLE t2 (b INT);
INSERT INTO t2 VALUES
(4),(5),(2),(5),(1),(1),(2),(2),(2),(197),
(4),(5),(3),(1),(2),(7),(6),(1),(156),(8),
(7),(2),(6),(2),(1),(0),(7),(5),(7),(2),(1),
(80),(3),(8),(5),(0),(9),(9),(7),(0),(5),
(6),(9),(3),(91),(6),(7),(3),(161),(7),(7),
(6),(5),(8),(7),(2),(1),(3),(6),(6),(5),(0),
(7),(7),(6),(0),(0),(8),(0),(4),(0),(213),
(248),(1),(6),(6),(3),(140),(0),(7),(6),(6),
(8),(5),(8),(7),(3),(7),(3),(8),(0),(1),(3),
(6),(8),(1),(1),(9),(0),(6);
CREATE TABLE t3 (c INT);
INSERT INTO t3 VALUES (8),(3);
SELECT (
SELECT MIN(b) FROM t1, t2
WHERE b = a AND
( b = alias1.b OR
EXISTS ( SELECT * FROM t3 )
)
)
FROM t2 alias1, t1 alias2, t1 alias3;
Gliffy Diagrams
Attachments
Issue Links
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
Analysis:
The much slower execution in the case when constant subqueries are evaluated is a result of the subquery cache not working in this case. This is what happens in more detail. The subquery in the SELECT list contains a subquery. This subquery in turn contains a constant subquery.
The ultimate result is that once the subquery becomes non-correlated, it is not cached any more, and is executed for each result row combination (40K times in the example). I consider the above behavior a deficiency of the subquery cache. This is being investigated ATM.