Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-738

LP:686353 - 2.7x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: N/A
    • Component/s: Optimizer
    • Labels:

      Description

      The following query against the DBT-3 dataset

      SELECT count(*) FROM lineitem WHERE
      ( l_receiptDATE BETWEEN '1992-11-01' AND '1992-11-29' ) AND l_shipdate < '1992-11-15';
      

      is 10x times slower when executed with sort_intersection.

      Igor says that this is " not interesting because lineitem is naturally ordered by l_receiptDATE and l_receiptDATE is strongly correlated with l_shipdate" , " in the case when indexes are correlated intersection causes degradation: this is a well know fact.".

      At the same time, the data set is supposedly standard and in real life, almost all date-based indexes will be naturally ordered and correlated to the PK.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            philipstoev Philip Stoev added a comment -

            Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21
            1. Data volume (DBT-3 factor)
            2. Database engine MyISAM
            3. Indexes used

            id: 1
            select_type: SIMPLE
            table: lineitem
            type: range
            possible_keys: i_l_shipdate,i_l_receiptdate
            key: i_l_receiptdate,i_l_shipdate
            key_len: 4,4
            ref: NULL
            rows: 414
            Extra: Using sort_intersect(i_l_receiptdate,i_l_shipdate); Using where

            4. Platform.
            Linux eve 2.6.33.3-85.fc13.i686.PAE #1 SMP Thu May 6 18:27:11 UTC 2010 i686 i686 i386 GNU/Linux
            Fedora release 13 (Goddard)

            5. Execution time

            in maria-5.1-wl21 - 0.27 sec , Handler_read_next 72166 , Handler_read_rnd 6296
            in maria-5.1 - 0.10 sec, Handler_read_next 7133

            Show
            philipstoev Philip Stoev added a comment - Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21 1. Data volume (DBT-3 factor) 2. Database engine MyISAM 3. Indexes used id: 1 select_type: SIMPLE table: lineitem type: range possible_keys: i_l_shipdate,i_l_receiptdate key: i_l_receiptdate,i_l_shipdate key_len: 4,4 ref: NULL rows: 414 Extra: Using sort_intersect(i_l_receiptdate,i_l_shipdate); Using where 4. Platform. Linux eve 2.6.33.3-85.fc13.i686.PAE #1 SMP Thu May 6 18:27:11 UTC 2010 i686 i686 i386 GNU/Linux Fedora release 13 (Goddard) 5. Execution time in maria-5.1-wl21 - 0.27 sec , Handler_read_next 72166 , Handler_read_rnd 6296 in maria-5.1 - 0.10 sec, Handler_read_next 7133
            Hide
            philipstoev Philip Stoev added a comment -

            Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21
            DBT3 factor 0.1

            Show
            philipstoev Philip Stoev added a comment - Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21 DBT3 factor 0.1
            Hide
            arjenlentz Arjen Lentz added a comment -

            Re: [Bug 686353] [NEW] 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21
            Hi Philip, Igor, all

            On 07/12/2010, at 5:45 PM, Philip Stoev wrote:
            > The following query against the DBT-3 dataset
            >
            > SELECT count FROM lineitem WHERE ( l_receiptDATE BETWEEN
            > '1992-11-01'
            > AND '1992-11-29' ) AND l_shipdate < '1992-11-15';
            >
            > is 10x times slower when executed with sort_intersection.
            >
            > Igor says that this is " not interesting because lineitem is naturally
            > ordered by l_receiptDATE and l_receiptDATE is strongly correlated
            > with
            > l_shipdate" , " in the case when indexes are correlated intersection
            > causes degradation: this is a well know fact.".

            I don't care very much for the specific scenario, I think the main
            issue is performance regression.
            New features or changes should not make existing/old features slower -
            that has been one of the fundamental rules that Monty has worked with
            since the start as a guideline for development.

            Regards,
            Arjen.

            Arjen Lentz, Exec.Director @ Open Query (http://openquery.com)
            Remote expertise & maintenance for MySQL/MariaDB server environments.

            Follow us at http://openquery.com/blog/ & http://twitter.com/openquery

            Show
            arjenlentz Arjen Lentz added a comment - Re: [Bug 686353] [NEW] 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21 Hi Philip, Igor, all On 07/12/2010, at 5:45 PM, Philip Stoev wrote: > The following query against the DBT-3 dataset > > SELECT count FROM lineitem WHERE ( l_receiptDATE BETWEEN > '1992-11-01' > AND '1992-11-29' ) AND l_shipdate < '1992-11-15'; > > is 10x times slower when executed with sort_intersection. > > Igor says that this is " not interesting because lineitem is naturally > ordered by l_receiptDATE and l_receiptDATE is strongly correlated > with > l_shipdate" , " in the case when indexes are correlated intersection > causes degradation: this is a well know fact.". I don't care very much for the specific scenario, I think the main issue is performance regression. New features or changes should not make existing/old features slower - that has been one of the fundamental rules that Monty has worked with since the start as a guideline for development. Regards, Arjen. – Arjen Lentz, Exec.Director @ Open Query ( http://openquery.com ) Remote expertise & maintenance for MySQL/MariaDB server environments. Follow us at http://openquery.com/blog/ & http://twitter.com/openquery
            Hide
            philipstoev Philip Stoev added a comment -

            dbt3 dataset scale 0.1
            LPexportBug686353_dbt3-s0.1.dump.bz2

            Show
            philipstoev Philip Stoev added a comment - dbt3 dataset scale 0.1 LPexportBug686353_dbt3-s0.1.dump.bz2
            Hide
            philipstoev Philip Stoev added a comment -

            Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21

            Show
            philipstoev Philip Stoev added a comment - Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 686353

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 686353
            Hide
            elenst Elena Stepanova added a comment -

            The optimization is off by default, and as the description says the scenario wasn't considered interesting to begin with, so I suppose it's safe to assume it won't be fixed. Closing as such.

            Show
            elenst Elena Stepanova added a comment - The optimization is off by default, and as the description says the scenario wasn't considered interesting to begin with, so I suppose it's safe to assume it won't be fixed. Closing as such.

              People

              • Assignee:
                igor Igor Babaev
                Reporter:
                philipstoev Philip Stoev
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: