Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4840

Wrong result (missing rows) on LEFT JOIN with InnoDB tables

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.3, 5.5.32, 5.1.67, 5.2.14, 5.3.12
    • Fix Version/s: 10.0.5, 5.5.33
    • Component/s: None
    • Labels:
      None

      Description

      With the test case below, the first query produces 3 rows, which I believe to be the correct result:

      SELECT * FROM iso_code LEFT JOIN ( city LEFT JOIN country ON city.country_code  = country.code ) ON iso_code.alpha3 = country.code ;
      alpha3	country_code	name	code	name
      USA	USA	Austin	USA	United States
      USA	USA	Boston	USA	United States
      CAN	NULL	NULL	NULL	NULL
      

      But the second query, which only differs from the previous one by the select list, produces two rows:

      SELECT iso_code.alpha3 FROM iso_code LEFT JOIN ( city LEFT JOIN country ON city.country_code  = country.code ) ON iso_code.alpha3 = country.code ;
      alpha3
      USA
      CAN
      

      Reproducible on MariaDB 5.1 from the beginning of time (tried 5.1.42), 5.2, 5.3, 5.5, 10.0.
      Not reproducible on MySQL 5.1, 5.5, 5.6.

      
      --source include/have_innodb.inc
      
      CREATE TABLE iso_code (alpha3 VARCHAR(3)) ENGINE=InnoDB;
      INSERT INTO iso_code VALUES ('USA'),('CAN');
      
      CREATE TABLE city ( country_code VARCHAR(3), name VARCHAR(64)) ENGINE=InnoDB;
      INSERT INTO city VALUES ('USA','Austin'),('USA','Boston');
      
      CREATE TABLE country ( code VARCHAR(3),  name VARCHAR(64),  PRIMARY KEY (code),  UNIQUE KEY (name)) ENGINE=InnoDB;
      INSERT INTO country VALUES ('CAN','Canada'),('USA','United States');
      
      SELECT * FROM iso_code LEFT JOIN ( city LEFT JOIN country ON city.country_code  = country.code ) ON iso_code.alpha3 = country.code ;
      SELECT iso_code.alpha3 FROM iso_code LEFT JOIN ( city LEFT JOIN country ON city.country_code  = country.code ) ON iso_code.alpha3 = country.code ;
      
      
      DROP TABLE iso_code, city, country;
      

      EXPLAIN:

      EXPLAIN EXTENDED
      SELECT iso_code.alpha3 FROM iso_code LEFT JOIN ( city LEFT JOIN country ON city.country_code  = country.code ) ON iso_code.alpha3 = country.code ;
      id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
      1	SIMPLE	iso_code	ALL	NULL	NULL	NULL	NULL	2	100.00	
      Warnings:
      Note	1003	select `test`.`iso_code`.`alpha3` AS `alpha3` from `test`.`iso_code` where 1
      

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            It seems, the problem is here:
            (gdb) wher
            #0 Dep_module::touch (this=0xa1704c0) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:358
            #1 0x083850e2 in Dep_analysis_context::run_wave (this=0x991e5654, new_bound_modules=0x991e5780) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:914
            #2 0x0838673e in check_func_dependency (join=0xa16edf8, dep_tables=6, it=0x991e5800, oj_tbl=0x0, cond=0xa16f1c8) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:848
            #3 0x08386a3c in eliminate_tables_for_list (join=0xa16edf8, join_list=0xa152c90, list_tables=6, on_expr=0xa16f1c8, tables_used_elsewhere=1) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:756
            #4 0x08386857 in eliminate_tables_for_list (join=0xa16edf8, join_list=0xa13cd7c, list_tables=7, on_expr=0x0, tables_used_elsewhere=1) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:716
            #5 0x08386ddc in eliminate_tables (join=0xa16edf8) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:655
            #6 0x0828f25c in make_join_statistics (join=0xa16edf8, tables_list=..., conds=0x0, keyuse_array=0xa16eff8) at /home/psergey/dev2/5.5/sql/sql_select.cc:3360
            #7 0x08291f83 in JOIN::optimize (this=0xa16edf8) at /home/psergey/dev2/5.5/sql/sql_select.cc:1209

            (gdb) p this
            $134 = (Dep_module_goal *) 0xa1704c0

            358 void touch()

            { unbound_args--; }

            The code in Dep_module::touch() assumes that it is called from different sources.
            In our case:

            1. table "country" has two unique keys.
            2. table "city" has no indexes at all.

            #2 should prevent Dep_module_goal from ever being marked as bound. However, #1 causes dep_module_goa->touch() be called twice, and the optimizer incorrectly assumes Dep_module_goal is satisfied.

            Show
            psergey Sergei Petrunia added a comment - It seems, the problem is here: (gdb) wher #0 Dep_module::touch (this=0xa1704c0) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:358 #1 0x083850e2 in Dep_analysis_context::run_wave (this=0x991e5654, new_bound_modules=0x991e5780) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:914 #2 0x0838673e in check_func_dependency (join=0xa16edf8, dep_tables=6, it=0x991e5800, oj_tbl=0x0, cond=0xa16f1c8) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:848 #3 0x08386a3c in eliminate_tables_for_list (join=0xa16edf8, join_list=0xa152c90, list_tables=6, on_expr=0xa16f1c8, tables_used_elsewhere=1) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:756 #4 0x08386857 in eliminate_tables_for_list (join=0xa16edf8, join_list=0xa13cd7c, list_tables=7, on_expr=0x0, tables_used_elsewhere=1) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:716 #5 0x08386ddc in eliminate_tables (join=0xa16edf8) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:655 #6 0x0828f25c in make_join_statistics (join=0xa16edf8, tables_list=..., conds=0x0, keyuse_array=0xa16eff8) at /home/psergey/dev2/5.5/sql/sql_select.cc:3360 #7 0x08291f83 in JOIN::optimize (this=0xa16edf8) at /home/psergey/dev2/5.5/sql/sql_select.cc:1209 (gdb) p this $134 = (Dep_module_goal *) 0xa1704c0 358 void touch() { unbound_args--; } The code in Dep_module::touch() assumes that it is called from different sources. In our case: 1. table "country" has two unique keys. 2. table "city" has no indexes at all. #2 should prevent Dep_module_goal from ever being marked as bound. However, #1 causes dep_module_goa->touch() be called twice, and the optimizer incorrectly assumes Dep_module_goal is satisfied.
            Hide
            psergey Sergei Petrunia added a comment -

            This patch fixes the testcase for this bug:

            === modified file 'sql/opt_table_elimination.cc'
            — sql/opt_table_elimination.cc 2012-02-17 11:19:38 +0000
            +++ sql/opt_table_elimination.cc 2013-08-21 18:02:45 +0000
            @@ -892,8 +892,11 @@ bool Dep_analysis_context::run_wave(List
            iter= module->init_unbound_values_iter(iter_buf);
            while ((value= module->get_next_unbound_value(this, iter)))
            {

            • value->make_bound();
            • new_bound_values.push_back(value);
              + if (!value->is_bound())
              + { + value->make_bound(); + new_bound_values.push_back(value); + }

              }
              }
              new_bound_modules->empty();

            what I don't understand is why the testcase fails only with InnoDB. The problem that is fixed by this patch is orthogonal to the storage engine being used.

            Show
            psergey Sergei Petrunia added a comment - This patch fixes the testcase for this bug: === modified file 'sql/opt_table_elimination.cc' — sql/opt_table_elimination.cc 2012-02-17 11:19:38 +0000 +++ sql/opt_table_elimination.cc 2013-08-21 18:02:45 +0000 @@ -892,8 +892,11 @@ bool Dep_analysis_context::run_wave(List iter= module->init_unbound_values_iter(iter_buf); while ((value= module->get_next_unbound_value(this, iter))) { value->make_bound(); new_bound_values.push_back(value); + if (!value->is_bound()) + { + value->make_bound(); + new_bound_values.push_back(value); + } } } new_bound_modules->empty(); what I don't understand is why the testcase fails only with InnoDB. The problem that is fixed by this patch is orthogonal to the storage engine being used.
            Hide
            psergey Sergei Petrunia added a comment -

            CREATE TABLE country ( code VARCHAR(3), name VARCHAR(64), PRIMARY KEY (code), UNIQUE KEY (name)) ENGINE=InnoDB;
            ...

            Breakpoint 14, Dep_value_field::get_next_unbound_module (this=0x7fff9400af08, dac=0x7fffc87c5e00, iter=0x7fffc87c5d00 "`\311\002\224\377\177") at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:1742
            (gdb) p field->field_name
            $188 = 0x7fff9401ee59 "code"
            (gdb) p *field->table_name
            $189 = 0x7fff9400da90 "country"
            (gdb) p key_dep->keyno
            $190 = 1

            That is, country.code is considered to be covered by key#1 in table `country`.
            Key #1 is UNIQUE KEY(name). This is extended keys feature at work.

            However, the constructor for this unique key assumes that the key as one key
            part:

            Breakpoint 13, Dep_module_key::Dep_module_key (this=0x7fff9402c960, table_arg=0x7fff9402c910, keyno_arg=1, n_parts_arg=1) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:422

            note above: keyno_arg=1, n_parts_arg=1

            Somehow, extended-keys property is taken into account in one place but not in the other.

            Show
            psergey Sergei Petrunia added a comment - CREATE TABLE country ( code VARCHAR(3), name VARCHAR(64), PRIMARY KEY (code), UNIQUE KEY (name)) ENGINE=InnoDB; ... Breakpoint 14, Dep_value_field::get_next_unbound_module (this=0x7fff9400af08, dac=0x7fffc87c5e00, iter=0x7fffc87c5d00 "`\311\002\224\377\177") at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:1742 (gdb) p field->field_name $188 = 0x7fff9401ee59 "code" (gdb) p *field->table_name $189 = 0x7fff9400da90 "country" (gdb) p key_dep->keyno $190 = 1 That is, country.code is considered to be covered by key#1 in table `country`. Key #1 is UNIQUE KEY(name). This is extended keys feature at work. However, the constructor for this unique key assumes that the key as one key part: Breakpoint 13, Dep_module_key::Dep_module_key (this=0x7fff9402c960, table_arg=0x7fff9402c910, keyno_arg=1, n_parts_arg=1) at /home/psergey/dev2/5.5/sql/opt_table_elimination.cc:422 note above: keyno_arg=1, n_parts_arg=1 Somehow, extended-keys property is taken into account in one place but not in the other.
            Hide
            psergey Sergei Petrunia added a comment -

            Dep_value_table *Dep_analysis_context::create_table_value(TABLE *table)

            has this code

            if (key->flags & HA_NOSAME)
            {
            Dep_module_key *key_dep;
            if (!(key_dep= new Dep_module_key(tbl_dep, i, key->key_parts)))

            As for key->key_parts:

            (gdb) p *key
            $200 = {key_length = 69, flags = 105, key_parts = 1, usable_key_parts = 2,
            ext_key_parts = 2 ...

            It seems, usable_key_parts should be used instead?

            Show
            psergey Sergei Petrunia added a comment - Dep_value_table *Dep_analysis_context::create_table_value(TABLE *table) has this code if (key->flags & HA_NOSAME) { Dep_module_key *key_dep; if (!(key_dep= new Dep_module_key(tbl_dep, i, key->key_parts))) As for key->key_parts: (gdb) p *key $200 = {key_length = 69, flags = 105, key_parts = 1, usable_key_parts = 2, ext_key_parts = 2 ... It seems, usable_key_parts should be used instead?
            Hide
            psergey Sergei Petrunia added a comment -

            Handling extended keys in table elimination.

            Table elimination uses primary/unique key definitions to know which set of
            columns uniquely defines the table record.

            Extending the binding column set makes things worse for table elimination.
            Without extended keys, table elimination sees

            UNIQUE KEY(col1),
            PRIMARY KEY (pk_col)

            and is able to infer that "col1=...." makes the table bound.

            With extended keys, table elimination sees:

            UNIQUE KEY(col1, pk_col)
            PRIMARY KEY (pk_col)

            and this doesn't allow to infer that "col1=..." makes the table bound.

            Show
            psergey Sergei Petrunia added a comment - Handling extended keys in table elimination. Table elimination uses primary/unique key definitions to know which set of columns uniquely defines the table record. Extending the binding column set makes things worse for table elimination. Without extended keys, table elimination sees UNIQUE KEY(col1), PRIMARY KEY (pk_col) and is able to infer that "col1=...." makes the table bound. With extended keys, table elimination sees: UNIQUE KEY(col1, pk_col) PRIMARY KEY (pk_col) and this doesn't allow to infer that "col1=..." makes the table bound.

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                elenst Elena Stepanova
              • Votes:
                0 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: