Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6879

Dereference of NULL primary_file->table in DsMrr_impl::get_disk_sweep_mrr_cost()

    Details

      Description

      When joining against a derived table where MRR is chosen by the optimizer, in DsMrr_impl::get_disk_sweep_mrr_cost, primary_file->table (accessed as table) is NULL but is dereferenced.

      #3  0x0000000001000a9c in DsMrr_impl::get_disk_sweep_mrr_cost (this=this@entry=0x267c7278, keynr=keynr@entry=0, rows=rows@entry=20, flags=<optimized out>, buffer_size=buffer_size@entry=0x7f223bd5cc8c, cost=cost@entry=0x7f223bd5cb70) at sql/multi_range_read.cc:1722
      
      1711|   /* Adjust buffer size if we expect to use only part of the buffer */
      1712|   if (n_full_steps)
      1713|   {
      1714|     get_sort_and_sweep_cost(table, rows_in_full_step, cost);
      1715|     cost->multiply(n_full_steps);
      1716|   }
      1717|   else
      1718|   {
      1719|     cost->reset();
      1720|     *buffer_size= MY_MAX(*buffer_size,
      1721|                       (size_t)(1.2*rows_in_last_step) * elem_size +
      1722+>                      primary_file->ref_length + table->key_info[keynr].key_length);
      1723|   }
      
      (gdb) p primary_file
      $1 = (handler *) 0x267c6e20
      (gdb) p table
      $2 = (TABLE *) 0x0
      

      Unfortunately I don't have a non-sensitive reproducible test case to provide, but the following patch fixes the problem for us by disabling MRR for joins against derived tables. It's unclear if this is the right solution or if it's a "big hammer" approach – alternate approaches are welcome.

      Patch follows:

      --- sql/sql_select.cc    2014-07-21 17:18:26.000000000 -0700
      +++ sql/sql_select.cc    2014-09-18 22:46:52.000000000 -0700
      @@ -10595,6 +10595,9 @@ uint check_join_cache_usage(JOIN_TAB *ta
           if (tab->ref.is_access_triggered())
             goto no_join_cache;
             
      +    if (tab->table->pos_in_table_list->is_materialized_derived())
      +      goto no_join_cache;
      +
           if (!tab->is_ref_for_hash_join())
           {
             flags= HA_MRR_NO_NULL_ENDPOINTS | HA_MRR_SINGLE_POINT;
      

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            psergey Sergei Petrunia added a comment -

            Testcase:

            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            
            create table one_k(a int);
            insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
            create table t10 (a int);
            insert into t10 select * from ten;
            
            create table t12 (a int, b int, c text);
            insert into t12 select a,a,'blob-data' from one_k;
            set join_cache_level=6;
            set @@optimizer_switch='derived_merge=on,derived_with_keys=on,mrr=on';
            explain 
            select * from 
              t10 join
              (select * from t12 order by a limit 1000) as D1 
            where 
              D1.a= t10.a;
            
            Show
            psergey Sergei Petrunia added a comment - Testcase: create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table one_k(a int); insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C; create table t10 (a int); insert into t10 select * from ten; create table t12 (a int, b int, c text); insert into t12 select a,a,'blob-data' from one_k; set join_cache_level=6; set @@optimizer_switch='derived_merge=on,derived_with_keys=on,mrr=on'; explain select * from t10 join (select * from t12 order by a limit 1000) as D1 where D1.a= t10.a;
            Hide
            psergey Sergei Petrunia added a comment -

            The TABLE object is present of course:

            (gdb) p tab->table->alias.Ptr
              $111 = 0x7fff98008418 "D1"
            (gdb) p tab->table
              $113 = (TABLE *) 0x7fff980a1838
            (gdb) p tab->table->file
              $115 = (ha_maria *) 0x7fff980a3038
            (gdb) p tab->table->file->table
              $117 = (TABLE *) 0x0
            

            The problem is that temporary tables do not have handler->table set. I think, the fix is to just set handler->table in create_tmp_table. I'll check with others.

            Show
            psergey Sergei Petrunia added a comment - The TABLE object is present of course: (gdb) p tab->table->alias.Ptr $111 = 0x7fff98008418 "D1" (gdb) p tab->table $113 = (TABLE *) 0x7fff980a1838 (gdb) p tab->table->file $115 = (ha_maria *) 0x7fff980a3038 (gdb) p tab->table->file->table $117 = (TABLE *) 0x0 The problem is that temporary tables do not have handler->table set. I think, the fix is to just set handler->table in create_tmp_table. I'll check with others.
            Hide
            psergey Sergei Petrunia added a comment - - edited

            For normal tables, handler::table is set in handler::ha_open().

            For temporary tables, the following happens:

            • create_tmp_table(... bool do_not_open=true) is called.
            • then, join optimization happens
            • handler->ha_open() is called late, at execution:
            (gdb) wher
              #0  handler::ha_open (this=0x7fffa4059678, table_arg=0x7fffa4057258, name=0x7fffa4058530 "/tmp/#sql_560a_0", mode=2, test_if_locked=516) at /home/psergey/dev2/10.0/sql/handler.cc:2476
              #1  0x00000000006c18d9 in open_tmp_table (table=0x7fffa4057258) at /home/psergey/dev2/10.0/sql/sql_select.cc:16614
              #2  0x000000000063bb9d in mysql_derived_create (thd=0x3211600, lex=0x3215248, derived=0x7fffa4054bc0) at /home/psergey/dev2/10.0/sql/sql_derived.cc:855
              #3  0x000000000063a8a2 in mysql_handle_single_derived (lex=0x3215248, derived=0x7fffa4054bc0, phases=96) at /home/psergey/dev2/10.0/sql/sql_derived.cc:192
              #4  0x00000000006b5099 in st_join_table::preread_init (this=0x7fffa405b9a0) at /home/psergey/dev2/10.0/sql/sql_select.cc:11384
              #5  0x00000000006c3a6b in sub_select (join=0x7fffa4055948, join_tab=0x7fffa405b9a0, end_of_records=false) at /home/psergey/dev2/10.0/sql/sql_select.cc:17616
              #6  0x00000000006c4324 in evaluate_join_record (join=0x7fffa4055948, join_tab=0x7fffa405b678, error=0) at /home/psergey/dev2/10.0/sql/sql_select.cc:17870
              #7  0x00000000006c3c1d in sub_select (join=0x7fffa4055948, join_tab=0x7fffa405b678, end_of_records=false) at /home/psergey/dev2/10.0/sql/sql_select.cc:17648
              #8  0x00000000006c348b in do_select (join=0x7fffa4055948, fields=0x3215b10, table=0x0, procedure=0x0) at /home/psergey/dev2/10.0/sql/sql_select.cc:17310
              #9  0x00000000006a061a in JOIN::exec_inner (this=0x7fffa4055948) at /home/psergey/dev2/10.0/sql/sql_select.cc:3080
              #10 0x000000000069d902 in JOIN::exec (this=0x7fffa4055948) at /home/psergey/dev2/10.0/sql/sql_select.cc:2370
              #11 0x00000000006a0ec1 in mysql_select (thd=0x3211600, rref_pointer_array=0x3215c70, tables=0x7fffa4008b98, wild_num=1, fields=..., conds=0x7fffa40557d8, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fffa400a8d8, unit=0x3215310, select_lex=0x32159f8) at /home/psergey/dev2/10.0/sql/sql_select.cc:3308
            

            That way, DS-MRR functions are called before table->ha_open() is called.

            Show
            psergey Sergei Petrunia added a comment - - edited For normal tables, handler::table is set in handler::ha_open(). For temporary tables, the following happens: create_tmp_table(... bool do_not_open=true) is called. then, join optimization happens handler->ha_open() is called late, at execution: (gdb) wher #0 handler::ha_open (this=0x7fffa4059678, table_arg=0x7fffa4057258, name=0x7fffa4058530 "/tmp/#sql_560a_0", mode=2, test_if_locked=516) at /home/psergey/dev2/10.0/sql/handler.cc:2476 #1 0x00000000006c18d9 in open_tmp_table (table=0x7fffa4057258) at /home/psergey/dev2/10.0/sql/sql_select.cc:16614 #2 0x000000000063bb9d in mysql_derived_create (thd=0x3211600, lex=0x3215248, derived=0x7fffa4054bc0) at /home/psergey/dev2/10.0/sql/sql_derived.cc:855 #3 0x000000000063a8a2 in mysql_handle_single_derived (lex=0x3215248, derived=0x7fffa4054bc0, phases=96) at /home/psergey/dev2/10.0/sql/sql_derived.cc:192 #4 0x00000000006b5099 in st_join_table::preread_init (this=0x7fffa405b9a0) at /home/psergey/dev2/10.0/sql/sql_select.cc:11384 #5 0x00000000006c3a6b in sub_select (join=0x7fffa4055948, join_tab=0x7fffa405b9a0, end_of_records=false) at /home/psergey/dev2/10.0/sql/sql_select.cc:17616 #6 0x00000000006c4324 in evaluate_join_record (join=0x7fffa4055948, join_tab=0x7fffa405b678, error=0) at /home/psergey/dev2/10.0/sql/sql_select.cc:17870 #7 0x00000000006c3c1d in sub_select (join=0x7fffa4055948, join_tab=0x7fffa405b678, end_of_records=false) at /home/psergey/dev2/10.0/sql/sql_select.cc:17648 #8 0x00000000006c348b in do_select (join=0x7fffa4055948, fields=0x3215b10, table=0x0, procedure=0x0) at /home/psergey/dev2/10.0/sql/sql_select.cc:17310 #9 0x00000000006a061a in JOIN::exec_inner (this=0x7fffa4055948) at /home/psergey/dev2/10.0/sql/sql_select.cc:3080 #10 0x000000000069d902 in JOIN::exec (this=0x7fffa4055948) at /home/psergey/dev2/10.0/sql/sql_select.cc:2370 #11 0x00000000006a0ec1 in mysql_select (thd=0x3211600, rref_pointer_array=0x3215c70, tables=0x7fffa4008b98, wild_num=1, fields=..., conds=0x7fffa40557d8, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fffa400a8d8, unit=0x3215310, select_lex=0x32159f8) at /home/psergey/dev2/10.0/sql/sql_select.cc:3308 That way, DS-MRR functions are called before table->ha_open() is called.
            Hide
            psergey Sergei Petrunia added a comment -

            Another question is: if DsMrr_impl::get_disk_sweep_mrr_cost() is called for a table that is not opened (and not filled) yet, it can't return a meaningful value. On the other hand, there could be a benefit in using linked MRR.

            Show
            psergey Sergei Petrunia added a comment - Another question is: if DsMrr_impl::get_disk_sweep_mrr_cost() is called for a table that is not opened (and not filled) yet, it can't return a meaningful value. On the other hand, there could be a benefit in using linked MRR.
            Hide
            jeremycole Jeremy Cole added a comment -

            Sergei Petrunia : My patch was "inspired" by these two checks, by the way:

            From sql/sql_select.cc:

            18071   if (tab->table->pos_in_table_list->is_materialized_derived() &&
            18072       !tab->table->pos_in_table_list->fill_me)
            18073   {
            18074     //TODO: don't get here at all
            18075     /* Skip materialized derived tables/views. */
            18076     DBUG_RETURN(0);
            18077   }
            

            From sql/opt_range.cc:

            10594   /*
            10595     Skip materialized derived table/view result table from MRR check as
            10596     they aren't contain any data yet.
            10597   */
            10598   if (param->table->pos_in_table_list->is_non_derived())
            10599     rows= file->multi_range_read_info_const(keynr, &seq_if, (void*)&seq, 0,
            10600                                             bufsize, mrr_flags, cost);
            
            Show
            jeremycole Jeremy Cole added a comment - Sergei Petrunia : My patch was "inspired" by these two checks, by the way: From sql/sql_select.cc : 18071 if (tab->table->pos_in_table_list->is_materialized_derived() && 18072 !tab->table->pos_in_table_list->fill_me) 18073 { 18074 //TODO: don't get here at all 18075 /* Skip materialized derived tables/views. */ 18076 DBUG_RETURN(0); 18077 } From sql/opt_range.cc : 10594 /* 10595 Skip materialized derived table/view result table from MRR check as 10596 they aren't contain any data yet. 10597 */ 10598 if (param->table->pos_in_table_list->is_non_derived()) 10599 rows= file->multi_range_read_info_const(keynr, &seq_if, (void*)&seq, 0, 10600 bufsize, mrr_flags, cost);
            Hide
            psergey Sergei Petrunia added a comment -

            Discussed with Igor Babaev.
            Decided to keep BKA disabled in this case. One can think of a case where it could be useful, but making it work seems not worth the effort.

            Show
            psergey Sergei Petrunia added a comment - Discussed with Igor Babaev . Decided to keep BKA disabled in this case. One can think of a case where it could be useful, but making it work seems not worth the effort.
            Hide
            psergey Sergei Petrunia added a comment - - edited

            (seeing the comment above) .. especially since similar decision was already made in other locations.

            Show
            psergey Sergei Petrunia added a comment - - edited (seeing the comment above) .. especially since similar decision was already made in other locations.

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                jeremycole Jeremy Cole
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: