Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-2608

LP:795530 - Wrong result with subquery semijoin materialization and outer join

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.3.12
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      The following query over the 'world' database incorrectly produces wrong result
      of 1 row instead of an empty result:

      create database world;
      use world;
      source mysql-test/include/world_schema.inc
      source mysql-test/include/world.inc

      set @@optimizer_switch='materialization=on,semijoin=on';

      MariaDB [world]> select * from CountryLanguage where (Language, Country) IN (SELECT City.Name, Country.Code FROM City LEFT JOIN Country ON (Country = Code and Code > 'LLL'));
      ---------------------------

      Country Language Percentage

      ---------------------------

      KEN Meru 5.5

      ---------------------------
      1 row in set (0.22 sec)

      set @@optimizer_switch='materialization=on,semijoin=off';
      => empty result;

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            timour Timour Katchaounov added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            In addition, running the above query with semijoin materialization is twice slower
            than with non-semijoin materialization.

            set @@optimizer_switch='materialization=on,semijoin=off';

            MariaDB [world]> select * from CountryLanguage where (Language, Country) IN (SELECT City.Name, Country.Code FROM City LEFT JOIN Country ON (Country = Code and Code > 'LLL'));

            Empty set (0.10 sec)

            Notice 0.1 sec vs 0.22 sec, averaged over several executions.

            Show
            timour Timour Katchaounov added a comment - Re: Wrong result with subquery semijoin materialization and outer join In addition, running the above query with semijoin materialization is twice slower than with non-semijoin materialization. set @@optimizer_switch='materialization=on,semijoin=off'; MariaDB [world] > select * from CountryLanguage where (Language, Country) IN (SELECT City.Name, Country.Code FROM City LEFT JOIN Country ON (Country = Code and Code > 'LLL')); Empty set (0.10 sec) Notice 0.1 sec vs 0.22 sec, averaged over several executions.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            A smaller testcase:

            create table t1 (a int);
            create table t2 (a int, b char(10));

            insert into t1 values (1),(2);
            insert into t2 values (1, 'one'), (3, 'three');

            create table t3 (b char(10));
            insert into t3 values('three'),( 'four');
            insert into t3 values('three'),( 'four');
            insert into t3 values('three'),( 'four');
            insert into t3 values('three'),( 'four');

            MariaDB [j48]> set optimizer_switch='semijoin=off';
            Query OK, 0 rows affected (0.00 sec)

            MariaDB [j48]> select * from t3 where t3.b in (select t2.b from t1 left join t2 on t1.a=t2.a);
            Empty set (0.01 sec)

            MariaDB [j48]> set optimizer_switch='semijoin=on';
            Query OK, 0 rows affected (0.00 sec)

            MariaDB [j48]> select * from t3 where t3.b in (select t2.b from t1 left join t2 on t1.a=t2.a);
            -------

            b

            -------

            three
            three
            three
            three

            -------
            4 rows in set (0.00 sec)

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with subquery semijoin materialization and outer join A smaller testcase: create table t1 (a int); create table t2 (a int, b char(10)); insert into t1 values (1),(2); insert into t2 values (1, 'one'), (3, 'three'); create table t3 (b char(10)); insert into t3 values('three'),( 'four'); insert into t3 values('three'),( 'four'); insert into t3 values('three'),( 'four'); insert into t3 values('three'),( 'four'); MariaDB [j48] > set optimizer_switch='semijoin=off'; Query OK, 0 rows affected (0.00 sec) MariaDB [j48] > select * from t3 where t3.b in (select t2.b from t1 left join t2 on t1.a=t2.a); Empty set (0.01 sec) MariaDB [j48] > set optimizer_switch='semijoin=on'; Query OK, 0 rows affected (0.00 sec) MariaDB [j48] > select * from t3 where t3.b in (select t2.b from t1 left join t2 on t1.a=t2.a); ------- b ------- three three three three ------- 4 rows in set (0.00 sec)
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            The problem seems to be in SJ-Merged materialization. For some reason, end_sj_materialize() fails to see that it is writing columns of NULL-comlemented rows, and writes some (previous?) non-NULL data instead.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with subquery semijoin materialization and outer join The problem seems to be in SJ-Merged materialization. For some reason, end_sj_materialize() fails to see that it is writing columns of NULL-comlemented rows, and writes some (previous?) non-NULL data instead.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            The wrong query result problem has been fixed, and the fix has been pushed.

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with subquery semijoin materialization and outer join The wrong query result problem has been fixed, and the fix has been pushed.
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            As for performance:

            The difference between performance of semi-join and non-semi-join materialization plans is because
            non-semijoin materialization does materialization+lookup (its only strategy):

            ----------------------------------------------------------------------------------------------------------+

            id select_type table type possible_keys key key_len ref rows Extra

            ----------------------------------------------------------------------------------------------------------+

            1 PRIMARY CountryLanguage ALL NULL NULL NULL NULL 984 Using where
            2 SUBQUERY City ALL NULL NULL NULL NULL 4079  
            2 SUBQUERY Country eq_ref PRIMARY PRIMARY 3 j4.City.Country 1 Using where; Using index

            ----------------------------------------------------------------------------------------------------------+

            while semi-join materialization does materialization+scan

            -----------------------------------------------------------------------------------------------------------------------+

            id select_type table type possible_keys key key_len ref rows Extra

            -----------------------------------------------------------------------------------------------------------------------+

            1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 4079  
            1 PRIMARY CountryLanguage eq_ref PRIMARY PRIMARY 33 j4.Country.Code,j4.City.Name 1 Using index condition
            2 SUBQUERY City ALL NULL NULL NULL NULL 4079  
            2 SUBQUERY Country eq_ref PRIMARY PRIMARY 3 j4.City.Country 1 Using where; Using index

            -----------------------------------------------------------------------------------------------------------------------+

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with subquery semijoin materialization and outer join As for performance: The difference between performance of semi-join and non-semi-join materialization plans is because non-semijoin materialization does materialization+lookup (its only strategy): --- ----------- --------------- ------ ------------- ------- ------- --------------- ---- -------------------------+ id select_type table type possible_keys key key_len ref rows Extra --- ----------- --------------- ------ ------------- ------- ------- --------------- ---- -------------------------+ 1 PRIMARY CountryLanguage ALL NULL NULL NULL NULL 984 Using where 2 SUBQUERY City ALL NULL NULL NULL NULL 4079   2 SUBQUERY Country eq_ref PRIMARY PRIMARY 3 j4.City.Country 1 Using where; Using index --- ----------- --------------- ------ ------------- ------- ------- --------------- ---- -------------------------+ while semi-join materialization does materialization+scan --- ----------- --------------- ------ ------------- ------- ------- ---------------------------- ---- -------------------------+ id select_type table type possible_keys key key_len ref rows Extra --- ----------- --------------- ------ ------------- ------- ------- ---------------------------- ---- -------------------------+ 1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 4079   1 PRIMARY CountryLanguage eq_ref PRIMARY PRIMARY 33 j4.Country.Code,j4.City.Name 1 Using index condition 2 SUBQUERY City ALL NULL NULL NULL NULL 4079   2 SUBQUERY Country eq_ref PRIMARY PRIMARY 3 j4.City.Country 1 Using where; Using index --- ----------- --------------- ------ ------------- ------- ------- ---------------------------- ---- -------------------------+
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            I'm getting the following average query times (in seconds):
            semi-join materialization, mean: 0.1042105
            non-semi-join materialzation, mean: 0.1852632

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with subquery semijoin materialization and outer join I'm getting the following average query times (in seconds): semi-join materialization, mean: 0.1042105 non-semi-join materialzation, mean: 0.1852632
            Hide
            psergey Sergei Petrunia added a comment -

            Re: Wrong result with subquery semijoin materialization and outer join
            The slowdown will be addressed separately here: https://bugs.launchpad.net/maria/+bug/806894

            Show
            psergey Sergei Petrunia added a comment - Re: Wrong result with subquery semijoin materialization and outer join The slowdown will be addressed separately here: https://bugs.launchpad.net/maria/+bug/806894
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 795530

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 795530

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                timour Timour Katchaounov
              • Votes:
                0 Vote for this issue
                Watchers:
                0 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: