Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5123

Remove duplicated conditions pushed both to join_tab->select_cond and join_tab->cache_select->cond for blocked joins.

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 10.0.5
    • Component/s: None
    • Labels:
      None

      Description

      If there is a condition that depends only on the second table in a blocked join,
      the following blocked join algorithms (BNL, BNLH) use this condition to pre-filter records from this table prior to joining the records.

      Currently the optimizer extracts such a single table condition independently from the generic condition pushdown. As a result the same condition may be evaluated up to two times more. This is a problem if the condition is expensive. It also makes it problematic to move the condition to an optimal partial join (see MDEV-83).

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              timour Timour Katchaounov added a comment -

              "Generic" pushdown conditions are attached to specific joins via the member: JOIN_TAB::select_cond.
              The filter conditions used by blocked join algorithms are attached to JOIN_TAB::cache_select::cond.

              Currently the extraction of these conditions happens in the following places in the optimizer:

              JOIN::optimize_inner calls:

              • make_join_select
                Step #2: Extract WHERE/ON parts
                ...
                tmp= make_cond_for_table(thd, cond, used_tables, current_map,
                FALSE, FALSE);
                ...
                if (tmp || !cond || tab->type == JT_REF || tab->type == JT_REF_OR_NULL ||
                tab->type == JT_EQ_REF || first_inner_tab)
                {
                ...
                else if (tab->type == JT_ALL && ! use_quick_range)
                Unknown macro: { .... if (i != join->const_tables && tab->use_quick != 2 && !tab->first_inner) { /* Read with cache */ if (tab->make_scan_filter()) DBUG_RETURN(1); } .... }

                }

              • make_join_readinfo
              • check_join_cache_usage_for_tables()
                if (!tab->hash_join_is_possible() ||
                tab->make_scan_filter())
                goto no_join_cache;
                ...
                tab->cache= new JOIN_CACHE_BNLH(...)
              Show
              timour Timour Katchaounov added a comment - "Generic" pushdown conditions are attached to specific joins via the member: JOIN_TAB::select_cond. The filter conditions used by blocked join algorithms are attached to JOIN_TAB::cache_select::cond. Currently the extraction of these conditions happens in the following places in the optimizer: JOIN::optimize_inner calls: make_join_select Step #2: Extract WHERE/ON parts ... tmp= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE); ... if (tmp || !cond || tab->type == JT_REF || tab->type == JT_REF_OR_NULL || tab->type == JT_EQ_REF || first_inner_tab) { ... else if (tab->type == JT_ALL && ! use_quick_range) Unknown macro: { .... if (i != join->const_tables && tab->use_quick != 2 && !tab->first_inner) { /* Read with cache */ if (tab->make_scan_filter()) DBUG_RETURN(1); } .... } } make_join_readinfo check_join_cache_usage_for_tables() if (!tab->hash_join_is_possible() || tab->make_scan_filter()) goto no_join_cache; ... tab->cache= new JOIN_CACHE_BNLH(...)
              Hide
              timour Timour Katchaounov added a comment -

              Low-level design:

              1. Split JOIN_TAB::select_cond into two conditions:

              • select_cond_for_table
              • select_cond

              2. Instead of extracting the whole pushdown condition, make_join_select extracts the above two conditions.
              The current call:
              tmp= make_cond_for_table(thd, cond, used_tables, current_map,
              FALSE, FALSE);
              is substituted by two calls:
              select_cond_for_table= make_cond_for_table(thd, cond, (const_table_map | table->map), table->map, FALSE, TRUE);
              select_cond= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE);

              3. The calls to make_scan_filter() will not extract the single table condition they used to.
              Instead they will use the already extracted condition JOIN_TAB::select_cond_for_table.
              Once select_cond_for_table is used, set it to NULL.

              4. In the end of make_join_read_info set the final pushdown condition to:
              (select_cond_for_table AND select_cond).

              Show
              timour Timour Katchaounov added a comment - Low-level design: 1. Split JOIN_TAB::select_cond into two conditions: select_cond_for_table select_cond 2. Instead of extracting the whole pushdown condition, make_join_select extracts the above two conditions. The current call: tmp= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE); is substituted by two calls: select_cond_for_table= make_cond_for_table(thd, cond, (const_table_map | table->map), table->map, FALSE, TRUE); select_cond= make_cond_for_table(thd, cond, used_tables, current_map, FALSE, FALSE); 3. The calls to make_scan_filter() will not extract the single table condition they used to. Instead they will use the already extracted condition JOIN_TAB::select_cond_for_table. Once select_cond_for_table is used, set it to NULL. 4. In the end of make_join_read_info set the final pushdown condition to: (select_cond_for_table AND select_cond).
              Hide
              psergey Sergei Petrunia added a comment -

              So, after this task, instead of extracting one condition:

              table_n_cond = make_cond_for_table(....)

              we will always extract two:

              table_n_cond_for_table = ...
              table_n_select_cond =...

              and will then use

              table_n_cond_for_table AND table_n_select_cond

              this approach may cause a problem:

              "table_n_cond" is not always the same as "table_n_cond_for_table AND table_n_select_cond"

              Show
              psergey Sergei Petrunia added a comment - So, after this task, instead of extracting one condition: table_n_cond = make_cond_for_table(....) we will always extract two: table_n_cond_for_table = ... table_n_select_cond =... and will then use table_n_cond_for_table AND table_n_select_cond this approach may cause a problem: "table_n_cond" is not always the same as "table_n_cond_for_table AND table_n_select_cond"
              Hide
              timour Timour Katchaounov added a comment -

              The attached diff implements the above incomplete idea.

              Show
              timour Timour Katchaounov added a comment - The attached diff implements the above incomplete idea.
              Hide
              timour Timour Katchaounov added a comment - - edited

              The task will be implemented according the following idea porposed by Sergey Petrunia:

              for each top-level AND-item in select_cond

              { if (item->used_tables() is covered by current_table | const_tables) remove item from select_cond }

              These conjuncts can be removed from select_cond because they are false
              if cache_cond->select is false. This is so, because cache_cond->select is the
              maximal condition that depends only on (current_table | const_tables).

              Show
              timour Timour Katchaounov added a comment - - edited The task will be implemented according the following idea porposed by Sergey Petrunia: for each top-level AND-item in select_cond { if (item->used_tables() is covered by current_table | const_tables) remove item from select_cond } These conjuncts can be removed from select_cond because they are false if cache_cond->select is false. This is so, because cache_cond->select is the maximal condition that depends only on (current_table | const_tables).
              Hide
              psergey Sergei Petrunia added a comment -

              Had a review discussion on the latest variant of the patch. Approved, after requested changes are implemented.

              Show
              psergey Sergei Petrunia added a comment - Had a review discussion on the latest variant of the patch. Approved, after requested changes are implemented.
              Hide
              timour Timour Katchaounov added a comment -

              The task is pushed to 5.5-timour (because this is my only tree that runs under buildbot).

              • All review comments implemented,
              • All tests pass,
              • Submitted for testing to Elena.
              Show
              timour Timour Katchaounov added a comment - The task is pushed to 5.5-timour (because this is my only tree that runs under buildbot). All review comments implemented, All tests pass, Submitted for testing to Elena.
              Hide
              timour Timour Katchaounov added a comment -
              • reviewed by Sergey Petrunia
              • all review comments implemented
              • tested by Elena
              • pushed to 10.0-base (10.0.5)
              Show
              timour Timour Katchaounov added a comment - reviewed by Sergey Petrunia all review comments implemented tested by Elena pushed to 10.0-base (10.0.5)

                People

                • Assignee:
                  timour Timour Katchaounov
                  Reporter:
                  timour Timour Katchaounov
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  2 Start watching this issue

                  Dates

                  • Due:
                    Created:
                    Updated:
                    Resolved:

                    Time Tracking

                    Estimated:
                    Original Estimate - 2 days Original Estimate - 2 days
                    2d
                    Remaining:
                    Remaining Estimate - 0 minutes
                    0m
                    Logged:
                    Time Spent - 4 days, 3 hours
                    4d 3h