Details

    • Type: Task Task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Fix Version/s: 10.0.5
    • Labels:
      None
    • Global Rank:
      2079

      Description

      High-Level Specification

      User interface

      EXPLAIN UPDATE and EXPLAIN DELETE statements will work and produce a tabular output similar to EXPLAIN SELECT.

      Implementation

      The primary challenge will be to change UPDATE/DELETE code to first produce action plan and then act on it (and not make decisions on the go as it currently does).

      What EXPLAIN will show

      • multi-table one will show the SELECT part as usual
      • single-table statements will show an equivalent of access method.

      Besides that, we want to know

      • if sorting will be used
      • whether the UPDATE will occur on the fly or not.
      • print or not "Using filesort"

      Low-Level Design

      Possible ways of the implemetations

      1. Backport of the the implementation from the mysql-5.6 code line.

      Estimate: 2.5 weeks (100 hrs)

      I got this estimate from comparison of the patch for ORDER BY with LIMIT that had been back-ported to 10.0-base with the mysql patch that added EXPLAIN for UPDATE/DELETE/INSERT. I took into account the fact that the latter was applied to the code that had been greatly changed from the mysql 5.5 code line. My personal attitude to the patch is extremely negative: the patch is overburdened with unneeded restructuring and unjustified new classes. The patch left the code for EXPLAIN interleaved with the code for SELECT and added interleaving of EXPLAIN for UPDATE/DELETE/INSERT for other code for this statements.

      2. A new implementation that will:

      • Introduce a special class for abstract EXPLAIN objects that would allow an easy generation of of the EXPLAIN output in any format
      • Add methods that builds the EXPLAIN objects for each type of interesting statements by the chosen execution plans (plans can be changed at some fixed points during the execution)
      • Clean the current code for SELECT removing removing all code related to EXPLAIN.

      Basically the above are the basic ideas under the low level design that I could suggest.

      In the future this design will allow us to include EXPLAIN into the information schema and, as a direct consequence, will allow us to select whatever we are interested in from the EXPLAIN output.

      My estimate for the time needed for this implementation is about 200 hrs (maybe even more). This is a lot, but it's only a fraction of the time that was in reality spent for SHOW EXPLAIN that DID NOT follow this design.

      The steps to implement this are:

      • Design the class ( 1 day)
      • Implement the class (2 days)
      • Use it in DELETE to implement explain (1 day dev, 4 hour QA)
      • Use it in UPDATE
      • Change SELECT to also use the class.

      In more details

      1.

      In my opinion a correct ideal schema of any implementation for the EXPLAIN command should be like this.

      1. The optimizer phase for the statement is completed (it means that the execution plan is chosen)
      2. An internal representation of the EXPLAIN info is built by the chosen plan for the executed statement
      3. The EXPLAIN output is built in the requested format

      The current implementation of the EXPLAIN command MariaDB 5.3/5.5 and in MySQL 5.5 skips 2. and builds the EXPLAIN output directly by the execution plan for the statement.

      The new implementation the EXPLAIN command for MySQL 5.6 also skips 2.

      Here are my consideration why step 2 is needed.

      • An execution plan is a volatile object (even after the optimization phase) : we can change it during the execution. So we need to save the EXPLAIN output between the commit points for this changes to support SHOW EXPLAIN. AFAIU the difficulty of the resolution of this problem for the current EXPLAIN output was the main source of troubles in the SHOW EXPLAIN project (see the log of the project in bzr).
      • It's much easier to form the same compact representation for different statements than to form the same EXPLAIN output
      • With the internal non-string representation you have more flexibility to build EXPLAIN output in a different format.

      Here are some examples of the differences between the internal representations of the EXPLAIN output and the current representation of it.
      Column 'possible keys':

      • now it's a list of key names
      • internal representation could contain just a key map for this key
        Column 'ref:
      • now contains a list of external representation for the ref items
      • internal representation could contain just a list of references to these items

      2.

      I looked at the current implementation of the EXPLAIN command in MySQL 5.6.7.

      Currently it contains the following new files:

      opt_explain.h (77 lines)
      opt_explain.cc (2089 lines)
      opt_explain_traditional.h (50 lines)
      opt_explain_traditional.cc (238 lines)
      opt_explain_format.h (592 lines)
      opt_explain_json.h (50 lines)
      opt_explain_json.cc (1837 lines)

      Even if we omit the files for JSON it's still a lot of code.

      Take also into account that:

      • The above files use a lot of new classes that are not present in the current MariaDB 5.5/10.0 code
      • Changes in other files will be required

      The main functionality for the EXPLAIN command can found in opt_explain.cc.
      Here we find the following classes:

      /* A base for all Explain_* classes */
      class Explain
      
      /* Explain_no_table class outputs a trivial EXPLAIN row
         with "extra" column */
      class Explain_no_table: public Explain
      
      /* Explain_union_result class outputs EXPLAIN row for UNION */
      class Explain_union_result : public Explain
      
      /* Common base class for Explain_join and Explain_table */
      class Explain_table_base : public Explain
      
      /* Explain_join class produces EXPLAIN output for JOINs */
      class Explain_join : public Explain_table_base
      
      /* Explain_table class produce EXPLAIN output for queries
         without top-level JOIN */
      class Explain_table: public Explain_table_base
      

      The above hierarchy looks quite unnatural for me (if not to say more): after all the explain info go to the records of the same format.

      The implementations of the methods of the classes essentially copies the old code of from select_describe() making a lot of renaming. I don't understand why this re-writing was undertaken at all.

      3.

      We agree that the schema I suggested should be implemented in a lazy manner: first we should get a proof of concept for the DELETE command. It will take not more than 1 week (hopefully less)

        Issue Links

          Activity

          Hide
          Sergei Petrunia added a comment -

          Decided I'll try with UPDATE (or DELETE) the following:

          • Save a "mini-plan" of the UPDATE in a custom structure.
          • Let both EXPLAIN and SHOW EXPLAIN walk it.
          • think of whether/how we could produce JSON output from it. (the question is: why does Oracle's way of producing traditional/json require so much code (in LOC)?)
          Show
          Sergei Petrunia added a comment - Decided I'll try with UPDATE (or DELETE) the following: Save a "mini-plan" of the UPDATE in a custom structure. Let both EXPLAIN and SHOW EXPLAIN walk it. think of whether/how we could produce JSON output from it. (the question is: why does Oracle's way of producing traditional/json require so much code (in LOC)?)
          Hide
          Sergei Petrunia added a comment -

          Started to code the above. No visible output yet.

          Show
          Sergei Petrunia added a comment - Started to code the above. No visible output yet.
          Hide
          Sergei Petrunia added a comment -

          Committed the patch with working EXPLAIN DELETE. SHOW EXPLAIN for DELETE commands works, too. Need to discuss it.

          Show
          Sergei Petrunia added a comment - Committed the patch with working EXPLAIN DELETE. SHOW EXPLAIN for DELETE commands works, too. Need to discuss it.
          Hide
          Sergei Petrunia added a comment -

          Current status: [SHOW] EXPLAIN (UPDATE|DELETE) works, the tree is at lp:~maria-captains/maria/10.0-base-explain-update/.

          Show
          Sergei Petrunia added a comment - Current status: [SHOW] EXPLAIN (UPDATE|DELETE) works, the tree is at lp:~maria-captains/maria/10.0-base-explain-update/.
          Hide
          Sergei Petrunia added a comment -

          Had another discussion about the issue. Conclusions are as follows:

          == Goals ==
          The goals of the data structure are

          • it should allow to generate both tabular and JSON-like output.
          • it should survive JOIN::cleanup()

          The structure should be cheap to generate. Most likely, the structure will be
          generated at the end of optimization phase. One needs to note that "end of
          optimization phase" may be tricky for SELECTs (which optimize the subqueries
          lazily). Because of that, one must be prepared that the data structure is
          constructed incrementally in whatever order the optimizer happens to produce
          query plans for parts of the query.

          == Solution ==

          We will have our own tree of SELECTs/UNIONs (the other option was to introduce
          a navigation API over existing SELECT_LEX trees).

          The structure will be a tree of SELECTs/UNIONs.

          A select is defined as
          select

          { select_id, select_type ordered_list < table_access > tables; }

          table_access

          { # All the fields that a row has in tabular output form }

          table_access fields will not be string representation. They will be pointers, or
          "source data"

          • table_name will point to table's name
          • "key_length" will be a list of integers (index_merge has a list of
            lengths)
          • "key" will be a key number
          • "Extra" will be a set of tags for "Using where", "Using index condition",
            "Using index", etc, etc.

          SJM nests will not have a separate "select" element. Instead, table_access is
          permitted to have extra fields pointing (or storing indexes of) other
          table_access elements.

          Show
          Sergei Petrunia added a comment - Had another discussion about the issue. Conclusions are as follows: == Goals == The goals of the data structure are it should allow to generate both tabular and JSON-like output. it should survive JOIN::cleanup() The structure should be cheap to generate. Most likely, the structure will be generated at the end of optimization phase. One needs to note that "end of optimization phase" may be tricky for SELECTs (which optimize the subqueries lazily). Because of that, one must be prepared that the data structure is constructed incrementally in whatever order the optimizer happens to produce query plans for parts of the query. == Solution == We will have our own tree of SELECTs/UNIONs (the other option was to introduce a navigation API over existing SELECT_LEX trees). The structure will be a tree of SELECTs/UNIONs. A select is defined as select { select_id, select_type ordered_list < table_access > tables; } table_access { # All the fields that a row has in tabular output form } table_access fields will not be string representation. They will be pointers, or "source data" table_name will point to table's name "key_length" will be a list of integers (index_merge has a list of lengths) "key" will be a key number "Extra" will be a set of tags for "Using where", "Using index condition", "Using index", etc, etc. SJM nests will not have a separate "select" element. Instead, table_access is permitted to have extra fields pointing (or storing indexes of) other table_access elements.
          Hide
          Sergei Petrunia added a comment -

          After merge with 10.0-base, got conflict with DELETE ... RETURNING code

          Show
          Sergei Petrunia added a comment - After merge with 10.0-base, got conflict with DELETE ... RETURNING code
          Hide
          Sergei Petrunia added a comment -

          The second patch passes the tests. Submitted it for review.

          Show
          Sergei Petrunia added a comment - The second patch passes the tests. Submitted it for review.
          Hide
          Sergei Petrunia added a comment -

          Pushed into 10.0-base and 10.0

          Show
          Sergei Petrunia added a comment - Pushed into 10.0-base and 10.0

            People

            • Assignee:
              Sergei Petrunia
              Reporter:
              Igor Babaev
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 4 days, 7 hours
                4d 7h