Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5161

Wrong result (missing rows) with semijoin, LEFT JOIN, ORDER BY, constant table

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.4, 5.3.12, 5.5.33a
    • Fix Version/s: 5.5.35, 10.0.7
    • Component/s: None
    • Labels:
      None

      Description

      The test case is nearly identical to the one provided in the comment to MDEV-5059 (https://mariadb.atlassian.net/browse/MDEV-5059?focusedCommentId=35418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-35418); the only difference is that table t3 lost the index, and the wrong result now affects the default join_cache_level=2.

      Test case:

      SET optimizer_switch = 'semijoin=on';
      
      CREATE TABLE t1 (pk INT PRIMARY KEY, c1 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO t1 VALUES (1,'v'),(2,'v'),(3,'c'),(4,NULL),(5,'x');
      
      CREATE TABLE t2 (c2 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO t2 VALUES ('x');
      
      CREATE TABLE t3 (c3 VARCHAR(1)) ENGINE=MyISAM;
      INSERT INTO t3 VALUES ('x'),('d');
      
      SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
      

      Actual result:

      pk	c1	c2
      4	NULL	x
      

      Expected result:

      pk	c1	c2
      4	NULL	x
      3	c	x
      1	v	x
      2	v	x
      5	x	x
      
      revision-id: bar@mnogosearch.org-20131016141313-63jt6geakz1e6dg2
      revno: 3708
      branch-nick: 5.3
      
      EXPLAIN EXTENDED
      SELECT * FROM t1, t2 WHERE pk IN ( SELECT pk FROM t1 LEFT JOIN t3 ON (c1 = c3 ) ) ORDER BY c2, c1;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	PRIMARY	t2	system	NULL	NULL	NULL	NULL	1	100.00	Using filesort
      1	PRIMARY	t1	ALL	PRIMARY	NULL	NULL	NULL	5	100.00	
      1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t1.pk	1	100.00	
      1	PRIMARY	t3	ALL	NULL	NULL	NULL	NULL	2	100.00	Using where; Start temporary; End temporary
      Warnings:
      Note	1003	select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`c1` AS `c1`,'x' AS `c2` from `test`.`t1` semi join (`test`.`t3`) join `test`.`t1` join `test`.`t2` where (`test`.`t1`.`pk` = `test`.`t1`.`pk`) order by 'x',`test`.`t1`.`c1`
      

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              psergey Sergei Petrunia added a comment -

              Now I have a patch for MDEV-5293, continuing to work on this bug.

              Show
              psergey Sergei Petrunia added a comment - Now I have a patch for MDEV-5293 , continuing to work on this bug.
              Hide
              psergey Sergei Petrunia added a comment -

              Interesting.. there is code in setup_semijoin_dups_elimination() which disables "Using filesort":

              Fixed LP bug #901478.
              If the duplicate elimination strategy is used for a semi-join and potentially
              one of the block-based join algorithms can be employed to join the inner
              tables of the semi-join then sorting of the head (first non-constant) table
              for a query with ORDER BY / GROUP BY cannot be used.

              It seems, a problem similar to this one was encountered before?

              Show
              psergey Sergei Petrunia added a comment - Interesting.. there is code in setup_semijoin_dups_elimination() which disables "Using filesort": Fixed LP bug #901478. If the duplicate elimination strategy is used for a semi-join and potentially one of the block-based join algorithms can be employed to join the inner tables of the semi-join then sorting of the head (first non-constant) table for a query with ORDER BY / GROUP BY cannot be used. It seems, a problem similar to this one was encountered before?
              Hide
              psergey Sergei Petrunia added a comment -

              The fix looks like this:

              Make sure that possible sorting of rows from the head table
              is not to be employed.
              */
              if (join->get_sort_by_join_tab())

              { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); }

              and it doesn't work for our testcase, because join->get_sort_by_join_tab()=NULL, which is caused by JOIN::sort_by_table being NULL.

              Show
              psergey Sergei Petrunia added a comment - The fix looks like this: Make sure that possible sorting of rows from the head table is not to be employed. */ if (join->get_sort_by_join_tab()) { join->simple_order= 0; join->simple_group= 0; join->need_tmp= join->test_if_need_tmp_table(); } and it doesn't work for our testcase, because join->get_sort_by_join_tab()=NULL, which is caused by JOIN::sort_by_table being NULL.
              Hide
              psergey Sergei Petrunia added a comment -

              I've made a fix that runs get_sort_by_table() after const. tables have been detected and read. This caused get_sort_by_join_tab() to return the first JOIN_TAB in this example.

              Tests pass, including valgrind.

              Show
              psergey Sergei Petrunia added a comment - I've made a fix that runs get_sort_by_table() after const. tables have been detected and read. This caused get_sort_by_join_tab() to return the first JOIN_TAB in this example. Tests pass, including valgrind.
              Show
              dbart Daniel Bartholomew added a comment - http://bazaar.launchpad.net/~maria-captains/maria/5.5/revision/3963.1.2

                People

                • Assignee:
                  psergey Sergei Petrunia
                  Reporter:
                  elenst Elena Stepanova
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  4 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: