Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5633

Better Support For Partitions/Prune By Date

    Details

    • Type: Task
    • Status: Stalled
    • Priority: Major
    • Resolution: Unresolved
    • Fix Version/s: 10.2
    • Component/s: None
    • Labels:
      None

      Description

      I created a table based on a hash of a date:

      CREATE TABLE employees (
          id INT NOT NULL,
          fname VARCHAR(30),
          lname VARCHAR(30),
          hired DATE NOT NULL DEFAULT NOW(),
          separated DATE NOT NULL DEFAULT '9999-12-31',
          job_code INT,
          store_id INT
      )
      PARTITION BY HASH( DAYOFYEAR(hired) )
      PARTITIONS 45;
      

      I would think that the query engine could prune searches since the hash value is, in my situation, monotonically increasing, but it does not appear to be the case. In a range search, the optimizer should be able to put the lower and upper bounds of the range into the hash function to come up with the upper and lower bounds for pruning partitions.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            serg Sergei Golubchik added a comment -

            Hashes don't work that way. They generally don't preserve the order, so even if the function would be monotonically increasing, the hash of it wouldn't be. That's why an optimizer cannot possibly guess the range of hash values from the range of hash arguments.

            Show
            serg Sergei Golubchik added a comment - Hashes don't work that way. They generally don't preserve the order, so even if the function would be monotonically increasing, the hash of it wouldn't be. That's why an optimizer cannot possibly guess the range of hash values from the range of hash arguments.
            Hide
            belugabehr BELUGABEHR added a comment - - edited

            If I may:

            I read from the MySQL Manual:

            When PARTITION BY HASH is used, MySQL determines which partition of num partitions to use based on the modulus of the result of the user function. In other words, for an expression expr, the partition in which the record is stored is partition number N, where N = MOD(expr, num).

            Given that remark, can date operators be special-cased to allow for this behavior? Given: PARTITION BY HASH( DAYOFYEAR(hired)) and the MOD of the hash function, it can be determined which range of partitions contain given date.

             
            -- Return count of all employees hired in the last seven days.  Only considers seven partitions.
            SELECT count(1) FROM employees WHERE hired >= (DAYOFYEAR(NOW()) - 7);
            
            Show
            belugabehr BELUGABEHR added a comment - - edited If I may: I read from the MySQL Manual: When PARTITION BY HASH is used, MySQL determines which partition of num partitions to use based on the modulus of the result of the user function. In other words, for an expression expr, the partition in which the record is stored is partition number N, where N = MOD(expr, num). Given that remark, can date operators be special-cased to allow for this behavior? Given: PARTITION BY HASH( DAYOFYEAR(hired)) and the MOD of the hash function, it can be determined which range of partitions contain given date. -- Return count of all employees hired in the last seven days. Only considers seven partitions. SELECT count(1) FROM employees WHERE hired >= (DAYOFYEAR(NOW()) - 7);
            Hide
            serg Sergei Golubchik added a comment -

            I see. Yes, to a certain extent it's possible. But it would need rather big changes in the optimizer, so we cannot implement it anytime soon.

            Show
            serg Sergei Golubchik added a comment - I see. Yes, to a certain extent it's possible. But it would need rather big changes in the optimizer, so we cannot implement it anytime soon.
            Hide
            belugabehr BELUGABEHR added a comment -

            More formally stated:

            CREATE TABLE members (
                firstname VARCHAR(25) NOT NULL,
                lastname VARCHAR(25) NOT NULL,
                username VARCHAR(16) NOT NULL,
                joined DATETIME NOT NULL
            )
            PARTITION BY HASH(TO_DAYS(joined))
            PARTITIONS 7;
            
            INSERT INTO members VALUES ('fred','flintstone','fflintstone','2014-01-04');
            INSERT INTO members VALUES ('barney','rubble','brubble','2014-01-05');
            INSERT INTO members VALUES ('homer','simpson','hsimpson','2014-01-06');
            INSERT INTO members VALUES ('jessica','rabbit','jrabbit','2014-01-07');
            
            SELECT TABLE_NAME,PARTITION_NAME,TABLE_ROWS FROM information_schema.`PARTITIONS` WHERE TABLE_NAME='members';
            
            TABLE_NAME PARTITION_NAME TABLE_ROWS
            members p0 1
            members p1 1
            members p2 1
            members p3 1
            members p4 0
            members p5 0
            members p6 0
            EXPLAIN PARTITIONS SELECT * FROM members WHERE joined BETWEEN '2014-01-04' AND '2014-01-06';
            
            id select_type table partitions type possible_keys key key_len ref rows Extra
            1 SIMPLE members p0,p1,p2,p3,p4,p5,p6 ALL         7 Using where

            The optimizer should be able to prune out all partitions except partitions: 1,2,3

            Show
            belugabehr BELUGABEHR added a comment - More formally stated: CREATE TABLE members ( firstname VARCHAR (25) NOT NULL, lastname VARCHAR (25) NOT NULL, username VARCHAR (16) NOT NULL, joined DATETIME NOT NULL ) PARTITION BY HASH(TO_DAYS(joined)) PARTITIONS 7; INSERT INTO members VALUES ('fred','flintstone','fflintstone','2014-01-04'); INSERT INTO members VALUES ('barney','rubble','brubble','2014-01-05'); INSERT INTO members VALUES ('homer','simpson','hsimpson','2014-01-06'); INSERT INTO members VALUES ('jessica','rabbit','jrabbit','2014-01-07'); SELECT TABLE_NAME,PARTITION_NAME,TABLE_ROWS FROM information_schema.`PARTITIONS` WHERE TABLE_NAME='members'; TABLE_NAME PARTITION_NAME TABLE_ROWS members p0 1 members p1 1 members p2 1 members p3 1 members p4 0 members p5 0 members p6 0 EXPLAIN PARTITIONS SELECT * FROM members WHERE joined BETWEEN '2014-01-04' AND '2014-01-06'; id select_type table partitions type possible_keys key key_len ref rows Extra 1 SIMPLE members p0,p1,p2,p3,p4,p5,p6 ALL         7 Using where The optimizer should be able to prune out all partitions except partitions: 1,2,3

              People

              • Assignee:
                Unassigned
                Reporter:
                belugabehr BELUGABEHR
              • Votes:
                1 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated: