Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Fix Version/s: 10.2
    • Component/s: None
    • Labels:

      Description

      (this all started from https://bugs.launchpad.net/maria/+bug/929732)

      Currently, nested semi-joins are merged together into one semi-join. This makes semi-join optimization/execution simpler.

      This approach has downsides, though. If there is a semi-join subquery with multiple child semi-join subqueries, we will merge them all together into a semi-join that includes a cross-product join of the child subqueries. Running a cross-product join and then removing duplicates can be slow (see the launchpad bug for detailed description).

      Nested semi-join support
      ========================

      The idea is to support nested semi-joins. It will be possible for one semi-join
      nest to contain another.

      Semi-join nests will not be mixed with outer join nests.

      == Optimization

      === Table pullout
      current pull_out_semijoin_tables() will not work for nested semi-joins. With
      nested semi-joins, we should:

      • process inner semi-joins first, then their parents,
      • pull out of inner semi-join to the parent semi-join (and then pull out
        further to the top-level),
      • if we were doing pullout for a top-level semi-join S, and ended up in
        a situation where S has only semi-joins as its children, then S should be
        removed and its children made top-level.

      (TODO: what about PS and re-executions? we need to be able to keep the
      nested-semi-join structure permanent, and results of pullout must be kept
      transient...)

      === SJ-Materialization

      optimize_semijoin_nests() performs pre-optimization of potential
      SJ-Materialization nests (so that we know what cost to put there when we
      consider materialization)

      It is fairly trivial that we should run optimize_semijoin_nests() for
      innermost semi-join nests that are eligible for materialization.

      It is not obvious what to do for a semi-join nest that is

      • eligible for materialization,
      • has child semi-join nests.
        Should one run optimization separately, and chose the best semi-join
        strategy for children (which may or may not be materialization?)

      == Semi-join optimization as part of join optimization

      Re-worked advance_sj_state() makes it not as hopeless as before.

      TODO: Can we just postpone semi-join flattening until table pullout is done?
      this will resolve the bug...

      TODO: does current query plan representation allow for nesting?

      === FirstMatch
      === SJ-Materialization
      === Loose Scan
      === Duplicate Weedout

      == Execution
      === Duplicate Weedout

      • execution code supports nested.

      === FirstMatch

      • ?

      === SJ-Materialization

      • nested execution seems to be easy and automatic? We allow use of any
        sj-strategies inside the SJ-Materialization join_tabs...

      === Loose Scan

      • ?

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              There are no comments yet on this issue.

                People

                • Assignee:
                  psergey Sergei Petrunia
                  Reporter:
                  psergey Sergei Petrunia
                • Votes:
                  2 Vote for this issue
                  Watchers:
                  5 Start watching this issue

                  Dates

                  • Created:
                    Updated:

                    Time Tracking

                    Estimated:
                    Original Estimate - Not Specified
                    Not Specified
                    Remaining:
                    Remaining Estimate - Not Specified
                    Not Specified
                    Logged:
                    Time Spent - 2 hours
                    2h