Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4276

Re-run DBT-3 benchmark with new statistics: 5.6 and 10.0

    Details

    • Type: Task
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Fix Version/s: N/A
    • Component/s: None

      Description

      We need a re-run of DBT-3 benchmark with the new features:

      • Engine-independent statistics in MariaDB 10.0 (note: Timour will provide a 10.0-based tree with latest fixes to EITS and subquery optimizations)
      • Innodb's persistent statistics in MySQL-5.6

      The primary goal is to explore what query plans will be used with Engine-independent table statistics (further called EITS).

      EITS are persistent, predictable (statistics calculation has no randomness elements), and precise. The statistics need to be collected manually, and there is an optimizer setting to get them to be used (see https://kb.askmonty.org/en/engine-independent-table-statistics/ for details).

      MySQL-5.6 and their persistent statistics should be used as something to compare against. I suppose, increasing innodb_stats_persistent_sample_pages should reduce the number of chosen plans (to one plan even?) ?

      The benchmark should be a DBT-3 run. It should be IO-bound.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            timour Timour Katchaounov added a comment -

            The benchmark should be run with the following tree:
            lp:~maria-captains/maria/10.0-mdev83

            This tree is based on MWL#253, which adds predicate selectivity estimates. The tree of
            10.0-mdev83 implements static (done during optimization) pushdown of subquery predicates.
            In order to test the new features, the following optimizer switches must be enabled:

            set @@optimizer_use_condition_selectivity=3;
            set @@optimizer_switch='expensive_pred_static_pushdown=on';

            In addition, in order to get proper selectivity estimates that let the optimizer choose the right plan
            for Q20, specifically Q20 should be run with the following indexes:
            create index i_p_name on part(p_name);
            create index i_n_name on nation(n_name);
            Notice that these indexes do not affect execution performance, because they are not used
            during execution. Once there are histograms that would allow proper selectivity estimates,
            these indexes will not be needed. Make sure that these indexes exist only when Q20 is
            being run.

            Show
            timour Timour Katchaounov added a comment - The benchmark should be run with the following tree: lp:~maria-captains/maria/10.0-mdev83 This tree is based on MWL#253, which adds predicate selectivity estimates. The tree of 10.0-mdev83 implements static (done during optimization) pushdown of subquery predicates. In order to test the new features, the following optimizer switches must be enabled: set @@optimizer_use_condition_selectivity=3; set @@optimizer_switch='expensive_pred_static_pushdown=on'; In addition, in order to get proper selectivity estimates that let the optimizer choose the right plan for Q20, specifically Q20 should be run with the following indexes: create index i_p_name on part(p_name); create index i_n_name on nation(n_name); Notice that these indexes do not affect execution performance, because they are not used during execution. Once there are histograms that would allow proper selectivity estimates, these indexes will not be needed. Make sure that these indexes exist only when Q20 is being run.
            Hide
            timour Timour Katchaounov added a comment -

            To run the latest implementation of mdev-83 that includes histograms from MWL#253,
            against already created database with previous version of 10.0:
            use mysql;
            drop table column_stats;
            CREATE TABLE IF NOT EXISTS column_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, column_name varchar(64) NOT NULL, min_value varchar(255) DEFAULT NULL, max_value varchar(255) DEFAULT NULL, nulls_ratio decimal(12,4) DEFAULT NULL, avg_length decimal(12,4) DEFAULT NULL, avg_frequency decimal(12,4) DEFAULT NULL, hist_size tinyint unsigned, histogram varbinary(255), PRIMARY KEY (db_name,table_name,column_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Columns';

            To collect table independent statistics with ANALYZE TABLE set:
            set @@use_stat_tables='preferably';

            In addition to other optimizer setttings, make sure that you set:
            set @@optimizer_use_condition_selectivity=4;
            set @@histogram_size=24;
            set @@optimizer_switch='expensive_pred_static_pushdown=on';

            Specifically Q20 benefits a lot from:
            set @@optimizer_switch='mrr=on,mrr_cost_based=off,mrr_sort_keys=on';

            Show
            timour Timour Katchaounov added a comment - To run the latest implementation of mdev-83 that includes histograms from MWL#253, against already created database with previous version of 10.0: use mysql; drop table column_stats; CREATE TABLE IF NOT EXISTS column_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, column_name varchar(64) NOT NULL, min_value varchar(255) DEFAULT NULL, max_value varchar(255) DEFAULT NULL, nulls_ratio decimal(12,4) DEFAULT NULL, avg_length decimal(12,4) DEFAULT NULL, avg_frequency decimal(12,4) DEFAULT NULL, hist_size tinyint unsigned, histogram varbinary(255), PRIMARY KEY (db_name,table_name,column_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Columns'; To collect table independent statistics with ANALYZE TABLE set: set @@use_stat_tables='preferably'; In addition to other optimizer setttings, make sure that you set: set @@optimizer_use_condition_selectivity=4; set @@histogram_size=24; set @@optimizer_switch='expensive_pred_static_pushdown=on'; Specifically Q20 benefits a lot from: set @@optimizer_switch='mrr=on,mrr_cost_based=off,mrr_sort_keys=on';
            Hide
            timour Timour Katchaounov added a comment -

            To check how to re-create the statistics tables check the latest version of the file 'scripts/mysql_system_tables.sql'.
            Currently the SQL DDL for the stats tables is:

            use mysql;
            drop table table_stats, column_stats, index_stats;

            CREATE TABLE IF NOT EXISTS table_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, cardinality bigint(21) unsigned DEFAULT NULL, PRIMARY KEY (db_name,table_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Tables';

            CREATE TABLE IF NOT EXISTS column_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, column_name varchar(64) NOT NULL, min_value varchar(255) DEFAULT NULL, max_value varchar(255) DEFAULT NULL, nulls_ratio decimal(12,4) DEFAULT NULL, avg_length decimal(12,4) DEFAULT NULL, avg_frequency decimal(12,4) DEFAULT NULL, hist_size tinyint unsigned, hist_type enum('SINGLE_PREC_HB','DOUBLE_PREC_HB'), histogram varbinary(255), PRIMARY KEY (db_name,table_name,column_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Columns';

            CREATE TABLE IF NOT EXISTS index_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, index_name varchar(64) NOT NULL, prefix_arity int(11) unsigned NOT NULL, avg_frequency decimal(12,4) DEFAULT NULL, PRIMARY KEY (db_name,table_name,index_name,prefix_arity) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Indexes';

            Then run ANALYZE ... .

            Show
            timour Timour Katchaounov added a comment - To check how to re-create the statistics tables check the latest version of the file 'scripts/mysql_system_tables.sql'. Currently the SQL DDL for the stats tables is: use mysql; drop table table_stats, column_stats, index_stats; CREATE TABLE IF NOT EXISTS table_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, cardinality bigint(21) unsigned DEFAULT NULL, PRIMARY KEY (db_name,table_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Tables'; CREATE TABLE IF NOT EXISTS column_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, column_name varchar(64) NOT NULL, min_value varchar(255) DEFAULT NULL, max_value varchar(255) DEFAULT NULL, nulls_ratio decimal(12,4) DEFAULT NULL, avg_length decimal(12,4) DEFAULT NULL, avg_frequency decimal(12,4) DEFAULT NULL, hist_size tinyint unsigned, hist_type enum('SINGLE_PREC_HB','DOUBLE_PREC_HB'), histogram varbinary(255), PRIMARY KEY (db_name,table_name,column_name) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Columns'; CREATE TABLE IF NOT EXISTS index_stats (db_name varchar(64) NOT NULL, table_name varchar(64) NOT NULL, index_name varchar(64) NOT NULL, prefix_arity int(11) unsigned NOT NULL, avg_frequency decimal(12,4) DEFAULT NULL, PRIMARY KEY (db_name,table_name,index_name,prefix_arity) ) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_bin comment='Statistics on Indexes'; Then run ANALYZE ... .

              People

              • Assignee:
                axel Axel Schwenke
                Reporter:
                psergey Sergei Petrunia
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated: