Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7240

Cannot use "group by" with "left join"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 10.0.13
    • Fix Version/s: 10.0.16
    • Component/s: Optimizer
    • Labels:
      None
    • Environment:
      CentOS release 6.5 (Final)

      Description

      I get the wrong result when I join 2 tables and use group by at the same time.
      How I can fix this problem?

      MariaDB [workspace]> DROP TABLE IF EXISTS t1;
      Query OK, 0 rows affected (0.00 sec)
      
      MariaDB [workspace]> DROP TABLE IF EXISTS t2;
      Query OK, 0 rows affected (0.00 sec)
      
      MariaDB [workspace]> CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id));
      Query OK, 0 rows affected (0.02 sec)
      
      MariaDB [workspace]> INSERT INTO t1 VALUES ('78622');
      Query OK, 1 row affected (0.00 sec)
      
      MariaDB [workspace]> INSERT INTO t1 VALUES ('786220');
      Query OK, 1 row affected (0.00 sec)
      
      MariaDB [workspace]> CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id));
      Query OK, 0 rows affected (0.04 sec)
      
      MariaDB [workspace]> INSERT INTO t2 VALUES ('78622',1);
      Query OK, 1 row affected (0.00 sec)
      
      MariaDB [workspace]> SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
      +--------+------+
      | id     | VAR  |
      +--------+------+
      | 78622  | NULL |
      | 786220 | NULL |
      +--------+------+
      2 rows in set (0.00 sec)
      
      MariaDB [workspace]> SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id);
      +--------+------+
      | id     | VAR  |
      +--------+------+
      | 78622  |    1 |
      | 786220 | NULL |
      +--------+------+
      2 rows in set (0.00 sec)
      

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              elenst Elena Stepanova added a comment - - edited

              Hi,

              Thanks for the report. It's likely to be a duplicate of MDEV-5719, But I'll assign it to Sergei Petrunia so he could check your test case as well after the bugfix (never hurts to double-check).

              Regarding your question how to fix the problem, it somewhat depends on what your real data and structures are (I presume you submtted a simplified test case, and it's great, thanks for that!)
              In general, a query which uses partial GROUP BY is nonndeterministic. If there were more than one matching row in t2, the returned value of VAR would have been unpredictable, and then the advice would have depended on what your expectations are. But if it's a PK on t2 like in your test case, you'll always get no more than one matching row; so it should be safe just to add VAR to GROUP BY, thus making it more reliable and getting rid of the wrong result, all at once:

              > SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id, VAR;
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  |    1 |
              | 786220 | NULL |
              +--------+------+
              2 rows in set (0.00 sec)
              

              Although, since it's a PK on t1 too, I'm not sure if you need a GROUP BY at all.

              Show
              elenst Elena Stepanova added a comment - - edited Hi, Thanks for the report. It's likely to be a duplicate of MDEV-5719 , But I'll assign it to Sergei Petrunia so he could check your test case as well after the bugfix (never hurts to double-check). Regarding your question how to fix the problem, it somewhat depends on what your real data and structures are (I presume you submtted a simplified test case, and it's great, thanks for that!) In general, a query which uses partial GROUP BY is nonndeterministic. If there were more than one matching row in t2, the returned value of VAR would have been unpredictable, and then the advice would have depended on what your expectations are. But if it's a PK on t2 like in your test case, you'll always get no more than one matching row; so it should be safe just to add VAR to GROUP BY, thus making it more reliable and getting rid of the wrong result, all at once: > SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id, VAR; +--------+------+ | id | VAR | +--------+------+ | 78622 | 1 | | 786220 | NULL | +--------+------+ 2 rows in set (0.00 sec) Although, since it's a PK on t1 too, I'm not sure if you need a GROUP BY at all.
              Hide
              takuya Takuya Aoki added a comment -

              Thank you for your comment,
              I tested the below cases and it seems their are condition when this bug occurs.

              RESULT:
              t1 t2 NULL_ERROR
              p p YES
              i p YES
              p i NO
              i i NO
              none none NO
              none p NO
              p none NO
              p p (GROUP BY id DESC) NO
              p p (INSERT INTO t2 VALUES ('786220',1); instead) NO
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id));
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  | NULL |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), INDEX(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id));
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  | NULL |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, INDEX(id));
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  |    1 |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), INDEX(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, INDEX(id));
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  |    1 |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL);
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  |    1 |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id));
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  |    1 |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL);
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  |    1 |
              | 786220 | NULL |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id));
              INSERT INTO t2 VALUES ('78622',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id DESC;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 786220 | NULL |
              | 78622  |    1 |
              +--------+------+
              
              DROP TABLE IF EXISTS t1;
              DROP TABLE IF EXISTS t2;
              CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id));
              INSERT INTO t1 VALUES ('78622');
              INSERT INTO t1 VALUES ('786220');
              CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id));
              INSERT INTO t2 VALUES ('786220',1);
              SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id;
              
              +--------+------+
              | id     | VAR  |
              +--------+------+
              | 78622  | NULL |
              | 786220 |    1 |
              +--------+------+
              
              Show
              takuya Takuya Aoki added a comment - Thank you for your comment, I tested the below cases and it seems their are condition when this bug occurs. RESULT: t1 t2 NULL_ERROR p p YES i p YES p i NO i i NO none none NO none p NO p none NO p p (GROUP BY id DESC) NO p p (INSERT INTO t2 VALUES ('786220',1); instead) NO DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id)); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | NULL | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), INDEX(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id)); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | NULL | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, INDEX(id)); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | 1 | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), INDEX(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, INDEX(id)); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | 1 | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | 1 | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id)); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | 1 | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | 1 | | 786220 | NULL | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id)); INSERT INTO t2 VALUES ('78622',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id DESC; +--------+------+ | id | VAR | +--------+------+ | 786220 | NULL | | 78622 | 1 | +--------+------+ DROP TABLE IF EXISTS t1; DROP TABLE IF EXISTS t2; CREATE TABLE t1(id CHAR(16), PRIMARY KEY(id)); INSERT INTO t1 VALUES ('78622'); INSERT INTO t1 VALUES ('786220'); CREATE TABLE t2(id CHAR(16), VAR INT DEFAULT NULL, PRIMARY KEY(id)); INSERT INTO t2 VALUES ('786220',1); SELECT id, VAR FROM t1 LEFT JOIN t2 USING(id) GROUP BY id; +--------+------+ | id | VAR | +--------+------+ | 78622 | NULL | | 786220 | 1 | +--------+------+
              Hide
              takuya Takuya Aoki added a comment -

              Hello, when will this bug be fixed?
              Will it be fixed in the 10.0.16 release?

              Show
              takuya Takuya Aoki added a comment - Hello, when will this bug be fixed? Will it be fixed in the 10.0.16 release?
              Hide
              serg Sergei Golubchik added a comment -

              I am afraid that it's unlikely. 10.0.16 release is almost ready. Anything that didn't get into it by now most probably won't be in it.

              Show
              serg Sergei Golubchik added a comment - I am afraid that it's unlikely. 10.0.16 release is almost ready. Anything that didn't get into it by now most probably won't be in it.
              Hide
              takuya Takuya Aoki added a comment - - edited

              This bug was fixed in the 10.0.16 release (Revision #4588).

              Show
              takuya Takuya Aoki added a comment - - edited This bug was fixed in the 10.0.16 release (Revision #4588).
              Hide
              psergey Sergei Petrunia added a comment -

              Right, fixed by patch for MDEV-5719. Thanks for noticing this.

              Show
              psergey Sergei Petrunia added a comment - Right, fixed by patch for MDEV-5719 . Thanks for noticing this.
              Hide
              psergey Sergei Petrunia added a comment -

              Closing as Duplicate of MDEV-5719.

              Show
              psergey Sergei Petrunia added a comment - Closing as Duplicate of MDEV-5719 .

                People

                • Assignee:
                  psergey Sergei Petrunia
                  Reporter:
                  takuya Takuya Aoki
                • Votes:
                  2 Vote for this issue
                  Watchers:
                  6 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: