Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-154

LIMIT_ROWS_EXAMINED: query with OUTER JOIN in view or derived table, NOT IN examines 5 times more rows than the limit allows: limit ~ 30K, examined ~150K

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      The following query

      SELECT * FROM v, t2
      WHERE ( c, c ) NOT IN ( SELECT c, d FROM t2, t3 )
      LIMIT_ROWS_EXAMINED 30000;
      

      where v is a view

      CREATE VIEW v AS
        SELECT alias1.*
        FROM t1 AS alias1 LEFT OUTER JOIN t1 AS alias2
          ON alias1.a = alias2.b;
      

      finishes with the warning "Warning 1930 Query execution was interrupted. The query examined at least 150491 rows" (the number is somewhat greater with InnoDB). It is confirmed by the status variables:

      Handler_read_key        1
      Handler_read_rnd        143742
      Handler_read_rnd_next   3461
      Handler_tmp_write       3289
      

      EXPLAIN:

      id      select_type     table   type    possible_keys   key     key_len ref     rows    filtered    Extra
      1       PRIMARY alias1  system  NULL    NULL    NULL    NULL    1       100.00
      1       PRIMARY alias2  system  NULL    NULL    NULL    NULL    1       100.00
      1       PRIMARY t2      ALL     NULL    NULL    NULL    NULL    274     100.00  Using where
      2       MATERIALIZED    t3      ALL     NULL    NULL    NULL    NULL    12      100.00
      2       MATERIALIZED    t2      ALL     NULL    NULL    NULL    NULL    274     100.00  Using join buffer (flat, BNL join)
      Warnings:
      Note    1003    select 'USAEnglish' AS `a`,86 AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t1` `alias1` left join `test`.`t1` `alias2` on(('USAEnglish' = NULL)) join `test`.`t2` where (not(<expr_cache><`test`.`t2`.`c`>(<in_optimizer>((`test`.`t2`.`c`,`test`.`t2`.`c`),(`test`.`t2`.`c`,`test`.`t2`.`c`) in ( <materialize> (select `test`.`t2`.`c`,`test`.`t3`.`d` from `test`.`t2` join `test`.`t3` ), <primary_index_lookup>(`test`.`t2`.`c` in <temporary table> on distinct_key where ((`test`.`t2`.`c` = `<subquery2>`.`c`) and (`test`.`t2`.`c` = `<subquery2>`.`d`))))))))
      

      Optimizer switch (default):

      index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on
      

      bzr version-info

      revision-id: timour@askmonty.org-20120213145248-tlvsd794o83odf2r
      date: 2012-02-13 16:52:48 +0200
      build-date: 2012-02-15 05:49:21 +0400
      revno: 3417
      

      Test case is attached.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              timour Timour Katchaounov added a comment -

              This behavior is a natural consequence of the fact that many
              complex algorithms within the server do not check (and cannot
              easily) check for THD::killed. Typically these algorithms perform
              some pre-processing of data before a subsequent step, such as sorting,
              or duplicate removal.

              In this particular example we have a NOT IN, which requires subquery
              materialization with partial matching. Partial matching has an
              initialization step where it creates indexes of rowids from the materialized
              subquery. This step calls my_qsort(), which internally accesses the
              materialized subquery table. Even if we go over the LIMIT_ROWS_EXAMINED,
              there is no easy way to interrupt my_qsort().

              The relevant call stack is:
              #0 THD::check_examined_rows_limit (this=0x2d4c650) at sql_class.h:1981
              #1 0x00000000005da62a in handler::increment_statistics (this=0x2e2a338, offset=&system_status_var::ha_read_rnd_count) at sql_class.h:3672
              #2 0x0000000000689f47 in handler::ha_rnd_pos (this=0x2e2a338, buf=0x2e2a7e8 "\371\aVisaliaa Beachtsonounty", pos=0x2edeb58 "\340!\354\002") at sql_class.h:3802
              #3 0x000000000068649a in Ordered_key::cmp_keys_by_row_data (this=0x2edf0a8, a=2772, b=3015) at item_subselect.cc:5004
              #4 0x0000000000686589 in Ordered_key::cmp_keys_by_row_data_and_rownum (key=0x2edf0a8, a=0x2e0e1a0, b=0x2ee5298) at item_subselect.cc:5025
              #5 0x0000000000c2d00f in my_qsort2 (base_ptr=0x2edf460, count=791, size=8, cmp=0x686558 <Ordered_key::cmp_keys_by_row_data_and_rownum(Ordered_key*, unsigned long long*, unsigned long long*)>, cmp_argument=0x2edf0a8) at mf_qsort.c:164
              #6 0x0000000000686607 in Ordered_key::sort_keys (this=0x2edf0a8) at item_subselect.cc:5035
              #7 0x0000000000687620 in subselect_rowid_merge_engine::init (this=0x2ed8bc8, non_null_key_parts=0x0, partial_match_key_parts=0x2e28c28) at item_subselect.cc:5478
              #8 0x0000000000685ac1 in subselect_hash_sj_engine::exec (this=0x2e28b98) at item_subselect.cc:4750
              #9 0x000000000067b5d3 in Item_subselect::exec (this=0x2df8d88) at item_subselect.cc:587
              #10 0x000000000067bb15 in Item_in_subselect::exec (this=0x2df8d88) at item_subselect.cc:742
              #11 0x000000000067da4b in Item_in_subselect::val_bool (this=0x2df8d88) at item_subselect.cc:1465
              #12 0x0000000000605595 in Item::val_bool_result (this=0x2df8d88) at item.h:853
              #13 0x0000000000639d85 in Item_in_optimizer::val_int (this=0x2e0b7c8) at item_cmpfunc.cc:1724
              #14 0x0000000000605509 in Item::val_int_result (this=0x2e0b7c8) at item.h:849
              #15 0x00000000005ff961 in Item_cache_int::cache_value (this=0x2e4d748) at item.cc:8227
              #16 0x000000000060aa3a in Item_cache_wrapper::cache (this=0x2e4d668) at item.cc:7060
              #17 0x00000000005fcd9c in Item_cache_wrapper::val_bool (this=0x2e4d668) at item.cc:7224
              #18 0x0000000000635cdd in Item_func_not::val_int (this=0x2e01058) at item_cmpfunc.cc:331
              #19 0x0000000000791bb5 in evaluate_join_record (join=0x2e01d48, join_tab=0x2e0ddc8, error=0) at sql_select.cc:15491
              #20 0x0000000000791792 in sub_select (join=0x2e01d48, join_tab=0x2e0ddc8, end_of_records=false) at sql_select.cc:15396
              #21 0x0000000000790f16 in do_select (join=0x2e01d48, fields=0x2d4f140, table=0x0, procedure=0x0) at sql_select.cc:15057
              #22 0x0000000000772bdd in JOIN::exec (this=0x2e01d48) at sql_select.cc:2731

              Show
              timour Timour Katchaounov added a comment - This behavior is a natural consequence of the fact that many complex algorithms within the server do not check (and cannot easily) check for THD::killed. Typically these algorithms perform some pre-processing of data before a subsequent step, such as sorting, or duplicate removal. In this particular example we have a NOT IN, which requires subquery materialization with partial matching. Partial matching has an initialization step where it creates indexes of rowids from the materialized subquery. This step calls my_qsort(), which internally accesses the materialized subquery table. Even if we go over the LIMIT_ROWS_EXAMINED, there is no easy way to interrupt my_qsort(). The relevant call stack is: #0 THD::check_examined_rows_limit (this=0x2d4c650) at sql_class.h:1981 #1 0x00000000005da62a in handler::increment_statistics (this=0x2e2a338, offset=&system_status_var::ha_read_rnd_count) at sql_class.h:3672 #2 0x0000000000689f47 in handler::ha_rnd_pos (this=0x2e2a338, buf=0x2e2a7e8 "\371\aVisaliaa Beachtsonounty", pos=0x2edeb58 "\340!\354\002") at sql_class.h:3802 #3 0x000000000068649a in Ordered_key::cmp_keys_by_row_data (this=0x2edf0a8, a=2772, b=3015) at item_subselect.cc:5004 #4 0x0000000000686589 in Ordered_key::cmp_keys_by_row_data_and_rownum (key=0x2edf0a8, a=0x2e0e1a0, b=0x2ee5298) at item_subselect.cc:5025 #5 0x0000000000c2d00f in my_qsort2 (base_ptr=0x2edf460, count=791, size=8, cmp=0x686558 <Ordered_key::cmp_keys_by_row_data_and_rownum(Ordered_key*, unsigned long long*, unsigned long long*)>, cmp_argument=0x2edf0a8) at mf_qsort.c:164 #6 0x0000000000686607 in Ordered_key::sort_keys (this=0x2edf0a8) at item_subselect.cc:5035 #7 0x0000000000687620 in subselect_rowid_merge_engine::init (this=0x2ed8bc8, non_null_key_parts=0x0, partial_match_key_parts=0x2e28c28) at item_subselect.cc:5478 #8 0x0000000000685ac1 in subselect_hash_sj_engine::exec (this=0x2e28b98) at item_subselect.cc:4750 #9 0x000000000067b5d3 in Item_subselect::exec (this=0x2df8d88) at item_subselect.cc:587 #10 0x000000000067bb15 in Item_in_subselect::exec (this=0x2df8d88) at item_subselect.cc:742 #11 0x000000000067da4b in Item_in_subselect::val_bool (this=0x2df8d88) at item_subselect.cc:1465 #12 0x0000000000605595 in Item::val_bool_result (this=0x2df8d88) at item.h:853 #13 0x0000000000639d85 in Item_in_optimizer::val_int (this=0x2e0b7c8) at item_cmpfunc.cc:1724 #14 0x0000000000605509 in Item::val_int_result (this=0x2e0b7c8) at item.h:849 #15 0x00000000005ff961 in Item_cache_int::cache_value (this=0x2e4d748) at item.cc:8227 #16 0x000000000060aa3a in Item_cache_wrapper::cache (this=0x2e4d668) at item.cc:7060 #17 0x00000000005fcd9c in Item_cache_wrapper::val_bool (this=0x2e4d668) at item.cc:7224 #18 0x0000000000635cdd in Item_func_not::val_int (this=0x2e01058) at item_cmpfunc.cc:331 #19 0x0000000000791bb5 in evaluate_join_record (join=0x2e01d48, join_tab=0x2e0ddc8, error=0) at sql_select.cc:15491 #20 0x0000000000791792 in sub_select (join=0x2e01d48, join_tab=0x2e0ddc8, end_of_records=false) at sql_select.cc:15396 #21 0x0000000000790f16 in do_select (join=0x2e01d48, fields=0x2d4f140, table=0x0, procedure=0x0) at sql_select.cc:15057 #22 0x0000000000772bdd in JOIN::exec (this=0x2e01d48) at sql_select.cc:2731

                People

                • Assignee:
                  timour Timour Katchaounov
                  Reporter:
                  elenst Elena Stepanova
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  0 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved:

                    Time Tracking

                    Estimated:
                    Original Estimate - Not Specified
                    Not Specified
                    Remaining:
                    Remaining Estimate - 0 minutes
                    0m
                    Logged:
                    Time Spent - 2 hours
                    2h