Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4622

Union of uncorrelated subqueries becomes dependant

    Details

    • Type: Bug
    • Status: Stalled
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 5.5.31
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Environment:
      Probably all

      Description

      This query:

      select * from a where id in (select id from a)
      

      correctly materializes the subquery before running the outer query. However, this query:

      select * from a where id in (select id from a union all select id from a)
      

      Does not. EXPLAIN shows the UNION and the subqueries as DEPENDANT, which is wrong as they are obviously uncorrelated.

      The fix for this is to use derived tables:

      select * from a where id in (select * from (select id from a union all select id from a) dq)
      

      We have 2 problems here:
      1) Obviously the union is causing the subqueries to become dependant and it in itself is dependant
      2) Another problem is that without the union the select is using the index, with the union it is using where instead - a serious performance killer as well.

      Question is, is the derived tables fix the only workaround? And is this a bug or a limitation in the optimizer?

      Note: the a table is a simple table with only id primary key column. I have a real-life case where this problem is causing some really bad performance, and I created this case just to illustrate the problem.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment -

            Sergei,

            Could you please consult on this or reassign to somebody who can?

            -----------------

            Quick test (on ~50K rows in the table, the initial query ~6 times slower than the modified one):

            MariaDB [test]> explain extended select * from a where id in (select id from a union all select id from a);
            ------------------------------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            ------------------------------------------------------------------------------------------------------------

            1 PRIMARY a index NULL PRIMARY 4 NULL 49152 100.00 Using where; Using index
            2 DEPENDENT SUBQUERY a eq_ref PRIMARY PRIMARY 4 func 1 100.00 Using index
            3 DEPENDENT UNION a eq_ref PRIMARY PRIMARY 4 func 1 100.00 Using index
            NULL UNION RESULT <union2,3> ALL NULL NULL NULL NULL NULL NULL  

            ------------------------------------------------------------------------------------------------------------
            4 rows in set, 1 warning (0.00 sec)

            MariaDB [test]> show warnings;
            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

            Level Code Message

            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

            Note 1003 select `test`.`a`.`id` AS `id` from `test`.`a` where <expr_cache><`test`.`a`.`id`>(<in_optimizer>(`test`.`a`.`id`,<exists>(select `test`.`a`.`id` from `test`.`a` where (<cache>(`test`.`a`.`id`) = `test`.`a`.`id`) union all select `test`.`a`.`id` from `test`.`a` where (<cache>(`test`.`a`.`id`) = `test`.`a`.`id`))))

            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            1 row in set (0.00 sec)

            MariaDB [test]> explain extended select * from a where id in (select * from (select id from a union all select id from a) dq);
            -----------------------------------------------------------------------------------------------

            id select_type table type possible_keys key key_len ref rows filtered Extra

            -----------------------------------------------------------------------------------------------

            1 PRIMARY a index PRIMARY PRIMARY 4 NULL 49152 100.00 Using index
            1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 4 func 1 100.00  
            2 MATERIALIZED <derived3> ALL NULL NULL NULL NULL 98304 100.00  
            3 DERIVED a index NULL PRIMARY 4 NULL 49152 100.00 Using index
            4 UNION a index NULL PRIMARY 4 NULL 49152 100.00 Using index
            NULL UNION RESULT <union3,4> ALL NULL NULL NULL NULL NULL NULL  

            -----------------------------------------------------------------------------------------------
            6 rows in set, 1 warning (0.01 sec)

            MariaDB [test]> show warnings;
            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

            Level Code Message

            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

            Note 1003 select `test`.`a`.`id` AS `id` from `test`.`a` semi join ((select `test`.`a`.`id` AS `id` from `test`.`a` union all select `test`.`a`.`id` AS `id` from `test`.`a`) `dq`) where 1

            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            1 row in set (0.00 sec)

            MariaDB [test]> pager cat > /dev/null;
            PAGER set to 'cat > /dev/null'
            MariaDB [test]> select * from a where id in (select id from a union all select id from a);
            49152 rows in set (3.20 sec)

            MariaDB [test]> select * from a where id in (select * from (select id from a union all select id from a) dq);
            49152 rows in set (0.64 sec)

            MariaDB [test]> select * from a where id in (select id from a union all select id from a);
            49152 rows in set (3.23 sec)

            MariaDB [test]> select * from a where id in (select * from (select id from a union all select id from a) dq);
            49152 rows in set (0.65 sec)

            MariaDB [test]> pager;
            Default pager wasn't set, using stdout.
            MariaDB [test]> show create table a;
            --------------------------------------------------------------------------------------------------------------------------------------------------+

            Table Create Table

            --------------------------------------------------------------------------------------------------------------------------------------------------+

            a CREATE TABLE `a` (
            `id` int(11) NOT NULL AUTO_INCREMENT,
            PRIMARY KEY (`id`)
            ) ENGINE=MyISAM AUTO_INCREMENT=49153 DEFAULT CHARSET=latin1

            --------------------------------------------------------------------------------------------------------------------------------------------------+
            1 row in set (0.01 sec)

            MariaDB [test]> select count from a;
            ----------

            count

            ----------

            49152

            ----------
            1 row in set (0.00 sec)

            MariaDB [test]> select @@version;
            ----------------------

            @@version

            ----------------------

            5.5.31-MariaDB-debug

            ----------------------
            1 row in set (0.00 sec)

            Show
            elenst Elena Stepanova added a comment - Sergei, Could you please consult on this or reassign to somebody who can? ----------------- Quick test (on ~50K rows in the table, the initial query ~6 times slower than the modified one): MariaDB [test] > explain extended select * from a where id in (select id from a union all select id from a); ----- ------------------ ---------- ------ ------------- ------- ------- ---- ----- -------- ------------------------- id select_type table type possible_keys key key_len ref rows filtered Extra ----- ------------------ ---------- ------ ------------- ------- ------- ---- ----- -------- ------------------------- 1 PRIMARY a index NULL PRIMARY 4 NULL 49152 100.00 Using where; Using index 2 DEPENDENT SUBQUERY a eq_ref PRIMARY PRIMARY 4 func 1 100.00 Using index 3 DEPENDENT UNION a eq_ref PRIMARY PRIMARY 4 func 1 100.00 Using index NULL UNION RESULT <union2,3> ALL NULL NULL NULL NULL NULL NULL   ----- ------------------ ---------- ------ ------------- ------- ------- ---- ----- -------- ------------------------- 4 rows in set, 1 warning (0.00 sec) MariaDB [test] > show warnings; ------ ---- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Level Code Message ------ ---- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Note 1003 select `test`.`a`.`id` AS `id` from `test`.`a` where <expr_cache><`test`.`a`.`id`>(<in_optimizer>(`test`.`a`.`id`,<exists>(select `test`.`a`.`id` from `test`.`a` where (<cache>(`test`.`a`.`id`) = `test`.`a`.`id`) union all select `test`.`a`.`id` from `test`.`a` where (<cache>(`test`.`a`.`id`) = `test`.`a`.`id`)))) ------ ---- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1 row in set (0.00 sec) MariaDB [test] > explain extended select * from a where id in (select * from (select id from a union all select id from a) dq); ----- ------------ ----------- ------ ------------- ------------ ------- ---- ----- -------- ------------ id select_type table type possible_keys key key_len ref rows filtered Extra ----- ------------ ----------- ------ ------------- ------------ ------- ---- ----- -------- ------------ 1 PRIMARY a index PRIMARY PRIMARY 4 NULL 49152 100.00 Using index 1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 4 func 1 100.00   2 MATERIALIZED <derived3> ALL NULL NULL NULL NULL 98304 100.00   3 DERIVED a index NULL PRIMARY 4 NULL 49152 100.00 Using index 4 UNION a index NULL PRIMARY 4 NULL 49152 100.00 Using index NULL UNION RESULT <union3,4> ALL NULL NULL NULL NULL NULL NULL   ----- ------------ ----------- ------ ------------- ------------ ------- ---- ----- -------- ------------ 6 rows in set, 1 warning (0.01 sec) MariaDB [test] > show warnings; ------ ---- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Level Code Message ------ ---- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Note 1003 select `test`.`a`.`id` AS `id` from `test`.`a` semi join ((select `test`.`a`.`id` AS `id` from `test`.`a` union all select `test`.`a`.`id` AS `id` from `test`.`a`) `dq`) where 1 ------ ---- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1 row in set (0.00 sec) MariaDB [test] > pager cat > /dev/null; PAGER set to 'cat > /dev/null' MariaDB [test] > select * from a where id in (select id from a union all select id from a); 49152 rows in set (3.20 sec) MariaDB [test] > select * from a where id in (select * from (select id from a union all select id from a) dq); 49152 rows in set (0.64 sec) MariaDB [test] > select * from a where id in (select id from a union all select id from a); 49152 rows in set (3.23 sec) MariaDB [test] > select * from a where id in (select * from (select id from a union all select id from a) dq); 49152 rows in set (0.65 sec) MariaDB [test] > pager; Default pager wasn't set, using stdout. MariaDB [test] > show create table a; ------ --------------------------------------------------------------------------------------------------------------------------------------------+ Table Create Table ------ --------------------------------------------------------------------------------------------------------------------------------------------+ a CREATE TABLE `a` ( `id` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`) ) ENGINE=MyISAM AUTO_INCREMENT=49153 DEFAULT CHARSET=latin1 ------ --------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.01 sec) MariaDB [test] > select count from a; ---------- count ---------- 49152 ---------- 1 row in set (0.00 sec) MariaDB [test] > select @@version; ---------------------- @@version ---------------------- 5.5.31-MariaDB-debug ---------------------- 1 row in set (0.00 sec)
            Hide
            qiui Piotr Blasiak added a comment -

            After further testing I realized I could make the outer query use indexes by wrapping the subqueries themselves in the "select * from (...) alias" derived table fix:

            explain select * from a where id in (select * from (select * from (select id from a) dq1 union all select * from (select id from a) dq2) dq3)

            Show
            qiui Piotr Blasiak added a comment - After further testing I realized I could make the outer query use indexes by wrapping the subqueries themselves in the "select * from (...) alias" derived table fix: explain select * from a where id in (select * from (select * from (select id from a) dq1 union all select * from (select id from a) dq2) dq3)
            Hide
            psergey Sergei Petrunia added a comment -

            Looked at the code.

            About Materialization:
            The code considers Materialization (and/or SJ-Materialization) strategy only
            when the subquery is a single SELECT (i.e. not a UNION). This is a limitation
            in the optimizer.

            When Materialialization is not applicable, the only remaining strategy for an
            IN subquery is the IN->EXISTS transformation. IN->EXISTS transformation pushes
            IN-equality down into the subquery which makes the subquery "DEPENDENT".

            Pushing down IN-equality allows parts of subquery to use index

            MariaDB [j6]> explain select * from a where id in (select id from a union select id from a);
            ----------------------------------------------------------------------------------------------------+

            id select_type table type possible_keys key key_len ref rows Extra

            ----------------------------------------------------------------------------------------------------+

            1 PRIMARY a index NULL PRIMARY 4 NULL 51200 Using where; Using index
            2 DEPENDENT SUBQUERY a eq_ref PRIMARY PRIMARY 4 func 1 Using index
            3 DEPENDENT UNION a eq_ref PRIMARY PRIMARY 4 func 1 Using index
            NULL UNION RESULT <union2,3> ALL NULL NULL NULL NULL NULL  

            ----------------------------------------------------------------------------------------------------+
            4 rows in set (0.00 sec)

            Note the "eq_ref" in line #2 and line #3. It means index is used inside the UNION.

            Show
            psergey Sergei Petrunia added a comment - Looked at the code. About Materialization: The code considers Materialization (and/or SJ-Materialization) strategy only when the subquery is a single SELECT (i.e. not a UNION). This is a limitation in the optimizer. When Materialialization is not applicable, the only remaining strategy for an IN subquery is the IN->EXISTS transformation. IN->EXISTS transformation pushes IN-equality down into the subquery which makes the subquery "DEPENDENT". Pushing down IN-equality allows parts of subquery to use index MariaDB [j6] > explain select * from a where id in (select id from a union select id from a); ----- ------------------ ---------- ------ ------------- ------- ------- ---- ----- -------------------------+ id select_type table type possible_keys key key_len ref rows Extra ----- ------------------ ---------- ------ ------------- ------- ------- ---- ----- -------------------------+ 1 PRIMARY a index NULL PRIMARY 4 NULL 51200 Using where; Using index 2 DEPENDENT SUBQUERY a eq_ref PRIMARY PRIMARY 4 func 1 Using index 3 DEPENDENT UNION a eq_ref PRIMARY PRIMARY 4 func 1 Using index NULL UNION RESULT <union2,3> ALL NULL NULL NULL NULL NULL   ----- ------------------ ---------- ------ ------------- ------- ------- ---- ----- -------------------------+ 4 rows in set (0.00 sec) Note the "eq_ref" in line #2 and line #3. It means index is used inside the UNION.
            Hide
            psergey Sergei Petrunia added a comment -

            " 2) Another problem is that without the union the select is using the index, with the union it is using where instead - a serious performance killer as well. "

            I'm unable to observe this. Piotr, can you post a query where that happens (both the variant with UNION and without), and EXPLAIN outputs?

            Show
            psergey Sergei Petrunia added a comment - " 2) Another problem is that without the union the select is using the index, with the union it is using where instead - a serious performance killer as well. " I'm unable to observe this. Piotr, can you post a query where that happens (both the variant with UNION and without), and EXPLAIN outputs?

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                qiui Piotr Blasiak
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: