Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3798

Add EXPLAIN for UPDATE/DELETE (mwl #51)

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 10.0.5
    • Component/s: None
    • Labels:
      None

      Description

      High-Level Specification

      User interface

      EXPLAIN UPDATE and EXPLAIN DELETE statements will work and produce a tabular output similar to EXPLAIN SELECT.

      Implementation

      The primary challenge will be to change UPDATE/DELETE code to first produce action plan and then act on it (and not make decisions on the go as it currently does).

      What EXPLAIN will show

      • multi-table one will show the SELECT part as usual
      • single-table statements will show an equivalent of access method.

      Besides that, we want to know

      • if sorting will be used
      • whether the UPDATE will occur on the fly or not.
      • print or not "Using filesort"

      Low-Level Design

      Possible ways of the implemetations

      1. Backport of the the implementation from the mysql-5.6 code line.

      Estimate: 2.5 weeks (100 hrs)

      I got this estimate from comparison of the patch for ORDER BY with LIMIT that had been back-ported to 10.0-base with the mysql patch that added EXPLAIN for UPDATE/DELETE/INSERT. I took into account the fact that the latter was applied to the code that had been greatly changed from the mysql 5.5 code line. My personal attitude to the patch is extremely negative: the patch is overburdened with unneeded restructuring and unjustified new classes. The patch left the code for EXPLAIN interleaved with the code for SELECT and added interleaving of EXPLAIN for UPDATE/DELETE/INSERT for other code for this statements.

      2. A new implementation that will:

      • Introduce a special class for abstract EXPLAIN objects that would allow an easy generation of of the EXPLAIN output in any format
      • Add methods that builds the EXPLAIN objects for each type of interesting statements by the chosen execution plans (plans can be changed at some fixed points during the execution)
      • Clean the current code for SELECT removing removing all code related to EXPLAIN.

      Basically the above are the basic ideas under the low level design that I could suggest.

      In the future this design will allow us to include EXPLAIN into the information schema and, as a direct consequence, will allow us to select whatever we are interested in from the EXPLAIN output.

      My estimate for the time needed for this implementation is about 200 hrs (maybe even more). This is a lot, but it's only a fraction of the time that was in reality spent for SHOW EXPLAIN that DID NOT follow this design.

      The steps to implement this are:

      • Design the class ( 1 day)
      • Implement the class (2 days)
      • Use it in DELETE to implement explain (1 day dev, 4 hour QA)
      • Use it in UPDATE
      • Change SELECT to also use the class.

      In more details

      1.

      In my opinion a correct ideal schema of any implementation for the EXPLAIN command should be like this.

      1. The optimizer phase for the statement is completed (it means that the execution plan is chosen)
      2. An internal representation of the EXPLAIN info is built by the chosen plan for the executed statement
      3. The EXPLAIN output is built in the requested format

      The current implementation of the EXPLAIN command MariaDB 5.3/5.5 and in MySQL 5.5 skips 2. and builds the EXPLAIN output directly by the execution plan for the statement.

      The new implementation the EXPLAIN command for MySQL 5.6 also skips 2.

      Here are my consideration why step 2 is needed.

      • An execution plan is a volatile object (even after the optimization phase) : we can change it during the execution. So we need to save the EXPLAIN output between the commit points for this changes to support SHOW EXPLAIN. AFAIU the difficulty of the resolution of this problem for the current EXPLAIN output was the main source of troubles in the SHOW EXPLAIN project (see the log of the project in bzr).
      • It's much easier to form the same compact representation for different statements than to form the same EXPLAIN output
      • With the internal non-string representation you have more flexibility to build EXPLAIN output in a different format.

      Here are some examples of the differences between the internal representations of the EXPLAIN output and the current representation of it.
      Column 'possible keys':

      • now it's a list of key names
      • internal representation could contain just a key map for this key
        Column 'ref:
      • now contains a list of external representation for the ref items
      • internal representation could contain just a list of references to these items

      2.

      I looked at the current implementation of the EXPLAIN command in MySQL 5.6.7.

      Currently it contains the following new files:

      opt_explain.h (77 lines)
      opt_explain.cc (2089 lines)
      opt_explain_traditional.h (50 lines)
      opt_explain_traditional.cc (238 lines)
      opt_explain_format.h (592 lines)
      opt_explain_json.h (50 lines)
      opt_explain_json.cc (1837 lines)

      Even if we omit the files for JSON it's still a lot of code.

      Take also into account that:

      • The above files use a lot of new classes that are not present in the current MariaDB 5.5/10.0 code
      • Changes in other files will be required

      The main functionality for the EXPLAIN command can found in opt_explain.cc.
      Here we find the following classes:

      /* A base for all Explain_* classes */
      class Explain
      
      /* Explain_no_table class outputs a trivial EXPLAIN row
         with "extra" column */
      class Explain_no_table: public Explain
      
      /* Explain_union_result class outputs EXPLAIN row for UNION */
      class Explain_union_result : public Explain
      
      /* Common base class for Explain_join and Explain_table */
      class Explain_table_base : public Explain
      
      /* Explain_join class produces EXPLAIN output for JOINs */
      class Explain_join : public Explain_table_base
      
      /* Explain_table class produce EXPLAIN output for queries
         without top-level JOIN */
      class Explain_table: public Explain_table_base
      

      The above hierarchy looks quite unnatural for me (if not to say more): after all the explain info go to the records of the same format.

      The implementations of the methods of the classes essentially copies the old code of from select_describe() making a lot of renaming. I don't understand why this re-writing was undertaken at all.

      3.

      We agree that the schema I suggested should be implemented in a lazy manner: first we should get a proof of concept for the DELETE command. It will take not more than 1 week (hopefully less)

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                psergey Sergei Petrunia
                Reporter:
                igor Igor Babaev
              • Votes:
                0 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 4 days, 7 hours
                  4d 7h