Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7539

All MyISAM partitions opened, despite optimizer's partition pruning

    Details

      Description

      This is an upstream bug:

      http://bugs.mysql.com/bug.php?id=75753

      Description

      In certain cases, the optimizer can eliminate partitions from a query plan, based on conditions in the WHERE clause. This is called partition pruning:

      http://dev.mysql.com/doc/refman/5.5/en/partitioning-pruning.html

      However, even when partition pruning is in use, the MYI and MYD files of all MyISAM partitions in a given table are opened, even if some of them won't be read.

      This causes problems for users who have tables with a lot of MyISAM partitions. A simply query targeting one partition of a table with 100 partitions can easily create thousands of file descriptors if there are a few concurrent users.

      How to repeat:

      I'll use one of the example tables from the partition pruning documentation:

      CREATE TABLE t2 (
          fname VARCHAR(50) NOT NULL,
          lname VARCHAR(50) NOT NULL,
          region_code TINYINT UNSIGNED NOT NULL,
          dob DATE NOT NULL
      ) ENGINE=MyISAM
      PARTITION BY RANGE( YEAR(dob) ) (
          PARTITION d0 VALUES LESS THAN (1970) ENGINE=MyISAM,
          PARTITION d1 VALUES LESS THAN (1975) ENGINE=MyISAM,
          PARTITION d2 VALUES LESS THAN (1980) ENGINE=MyISAM,
          PARTITION d3 VALUES LESS THAN (1985) ENGINE=MyISAM,
          PARTITION d4 VALUES LESS THAN (1990) ENGINE=MyISAM,
          PARTITION d5 VALUES LESS THAN (2000) ENGINE=MyISAM,
          PARTITION d6 VALUES LESS THAN (2005) ENGINE=MyISAM,
          PARTITION d7 VALUES LESS THAN MAXVALUE ENGINE=MyISAM
      );
      

      If we insert a row:

      INSERT INTO t2 VALUES ('John', 'Doe', 1, '1982-06-23');
      

      Now let's SELECT from the table. We know partition pruning will be used in this particular query:

      mysql> EXPLAIN PARTITIONS SELECT * FROM t2 WHERE dob = '1982-06-23';
      +----+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
      | id | select_type | table | partitions | type   | possible_keys | key  | key_len | ref  | rows | Extra |
      +----+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
      |  1 | SIMPLE      | t2    | d3         | system | NULL          | NULL | NULL    | NULL |    1 |       |
      +----+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
      1 row in set (0.00 sec)
      

      The optimizer says the results will only be in partition "d3", but let's see what files MySQL actually opens when the query is executed.

      Shell 1

      mysql> FLUSH TABLES;
      Query OK, 0 rows affected (0.00 sec)
      

      Shell 2

      [gmontee@localhost ~]$ sudo strace -p `pidof mysqld` -ff -o `pidof mysqld`.out -e trace=open
      Process 4295 attached with 17 threads
      

      Shell 1

      mysql> SELECT * FROM t2 WHERE dob = '1982-06-23';
      +-------+-------+-------------+------------+
      | fname | lname | region_code | dob        |
      +-------+-------+-------------+------------+
      | John  | Doe   |           1 | 1982-06-23 |
      +-------+-------+-------------+------------+
      1 row in set (0.00 sec)
      

      Shell 2

      ^CProcess 4295 detached
      Process 4297 detached
      Process 4298 detached
      Process 4299 detached
      Process 4300 detached
      Process 4301 detached
      Process 4302 detached
      Process 4303 detached
      Process 4304 detached
      Process 4305 detached
      Process 4306 detached
      Process 4308 detached
      Process 4309 detached
      Process 4310 detached
      Process 4311 detached
      Process 4316 detached
      Process 4319 detached
      [gmontee@localhost ~]$ cat 4295.out.4319
      open("./tmp/t2.frm", O_RDONLY)          = 16
      open("./tmp/t2.par", O_RDONLY)          = 17
      open("./tmp/t2.par", O_RDONLY)          = 16
      open("/var/lib/mysql/tmp/t2#P#d0.MYI", O_RDWR) = 16
      open("./tmp/t2#P#d0.MYD", O_RDWR)       = 17
      open("/var/lib/mysql/tmp/t2#P#d1.MYI", O_RDWR) = 18
      open("./tmp/t2#P#d1.MYD", O_RDWR)       = 19
      open("/var/lib/mysql/tmp/t2#P#d2.MYI", O_RDWR) = 20
      open("./tmp/t2#P#d2.MYD", O_RDWR)       = 21
      open("/var/lib/mysql/tmp/t2#P#d3.MYI", O_RDWR) = 22
      open("./tmp/t2#P#d3.MYD", O_RDWR)       = 23
      open("/var/lib/mysql/tmp/t2#P#d4.MYI", O_RDWR) = 24
      open("./tmp/t2#P#d4.MYD", O_RDWR)       = 25
      open("/var/lib/mysql/tmp/t2#P#d5.MYI", O_RDWR) = 26
      open("./tmp/t2#P#d5.MYD", O_RDWR)       = 27
      open("/var/lib/mysql/tmp/t2#P#d6.MYI", O_RDWR) = 28
      open("./tmp/t2#P#d6.MYD", O_RDWR)       = 29
      open("/var/lib/mysql/tmp/t2#P#d7.MYI", O_RDWR) = 30
      open("./tmp/t2#P#d7.MYD", O_RDWR)       = 31
      

      Suggested fix

      If partition pruning is in use, it is a waste to open partitions that will not be checked.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              GeoffMontee Geoff Montee added a comment -

              I've confirmed that Aria has the same behavior in 10.0.

              Shell 1

              MariaDB [tmp]> CREATE TABLE t2 (
                  ->     fname VARCHAR(50) NOT NULL,
                  ->     lname VARCHAR(50) NOT NULL,
                  ->     region_code TINYINT UNSIGNED NOT NULL,
                  ->     dob DATE NOT NULL
                  -> ) ENGINE=Aria
                  -> PARTITION BY RANGE( YEAR(dob) ) (
                  ->     PARTITION d0 VALUES LESS THAN (1970) ENGINE=Aria,
                  ->     PARTITION d1 VALUES LESS THAN (1975) ENGINE=Aria,
                  ->     PARTITION d2 VALUES LESS THAN (1980) ENGINE=Aria,
                  ->     PARTITION d3 VALUES LESS THAN (1985) ENGINE=Aria,
                  ->     PARTITION d4 VALUES LESS THAN (1990) ENGINE=Aria,
                  ->     PARTITION d5 VALUES LESS THAN (2000) ENGINE=Aria,
                  ->     PARTITION d6 VALUES LESS THAN (2005) ENGINE=Aria,
                  ->     PARTITION d7 VALUES LESS THAN MAXVALUE ENGINE=Aria
                  -> );
              Query OK, 0 rows affected (0.09 sec)
              
              MariaDB [tmp]> INSERT INTO t2 VALUES ('John', 'Doe', 1, '1982-06-23');
              Query OK, 1 row affected (0.04 sec)
              
              MariaDB [tmp]> EXPLAIN PARTITIONS SELECT * FROM t2 WHERE dob = '1982-06-23';
              +------+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
              | id   | select_type | table | partitions | type   | possible_keys | key  | key_len | ref  | rows | Extra |
              +------+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
              |    1 | SIMPLE      | t2    | d3         | system | NULL          | NULL | NULL    | NULL |    1 |       |
              +------+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+
              1 row in set (0.00 sec)
              
              MariaDB [tmp]> FLUSH TABLES;
              Query OK, 0 rows affected (0.02 sec)
              

              Shell 2

              [gmontee@localhost ~]$ sudo strace -p `pidof mysqld` -ff -o `pidof mysqld`.out -e trace=open
              Process 2443 attached with 24 threads
              

              Shell 1

              MariaDB [tmp]> SELECT * FROM t2 WHERE dob = '1982-06-23';
              +-------+-------+-------------+------------+
              | fname | lname | region_code | dob        |
              +-------+-------+-------------+------------+
              | John  | Doe   |           1 | 1982-06-23 |
              +-------+-------+-------------+------------+
              1 row in set (0.01 sec)
              

              Shell 2

              ^CProcess 2443 detached
              Process 2489 detached
              Process 2490 detached
              Process 2491 detached
              Process 2492 detached
              Process 2493 detached
              Process 2494 detached
              Process 2495 detached
              Process 2496 detached
              Process 2497 detached
              Process 2498 detached
              Process 2552 detached
              Process 2553 detached
              Process 2554 detached
              Process 2555 detached
              Process 2556 detached
              Process 2567 detached
              Process 2568 detached
              Process 2569 detached
              Process 2570 detached
              Process 2571 detached
              Process 2574 detached
              Process 2576 detached
              Process 13965 detached
              [gmontee@localhost ~]$ cat 2443.out.13965 
              open("./tmp/t2.frm", O_RDONLY)          = 16
              open("./tmp/t2.par", O_RDONLY)          = 16
              open("./tmp/t2.par", O_RDONLY)          = 16
              open("/var/lib/mysql/tmp/t2#P#d0.MAI", O_RDWR) = 16
              open("./tmp/t2#P#d0.MAD", O_RDWR)       = 17
              open("/var/lib/mysql/tmp/t2#P#d1.MAI", O_RDWR) = 20
              open("./tmp/t2#P#d1.MAD", O_RDWR)       = 21
              open("/var/lib/mysql/tmp/t2#P#d2.MAI", O_RDWR) = 22
              open("./tmp/t2#P#d2.MAD", O_RDWR)       = 23
              open("/var/lib/mysql/tmp/t2#P#d3.MAI", O_RDWR) = 24
              open("./tmp/t2#P#d3.MAD", O_RDWR)       = 25
              open("/var/lib/mysql/tmp/t2#P#d4.MAI", O_RDWR) = 26
              open("./tmp/t2#P#d4.MAD", O_RDWR)       = 27
              open("/var/lib/mysql/tmp/t2#P#d5.MAI", O_RDWR) = 28
              open("./tmp/t2#P#d5.MAD", O_RDWR)       = 29
              open("/var/lib/mysql/tmp/t2#P#d6.MAI", O_RDWR) = 30
              open("./tmp/t2#P#d6.MAD", O_RDWR)       = 31
              open("/var/lib/mysql/tmp/t2#P#d7.MAI", O_RDWR) = 32
              open("./tmp/t2#P#d7.MAD", O_RDWR)       = 33
              
              Show
              GeoffMontee Geoff Montee added a comment - I've confirmed that Aria has the same behavior in 10.0. Shell 1 MariaDB [tmp]> CREATE TABLE t2 ( -> fname VARCHAR(50) NOT NULL, -> lname VARCHAR(50) NOT NULL, -> region_code TINYINT UNSIGNED NOT NULL, -> dob DATE NOT NULL -> ) ENGINE=Aria -> PARTITION BY RANGE( YEAR(dob) ) ( -> PARTITION d0 VALUES LESS THAN (1970) ENGINE=Aria, -> PARTITION d1 VALUES LESS THAN (1975) ENGINE=Aria, -> PARTITION d2 VALUES LESS THAN (1980) ENGINE=Aria, -> PARTITION d3 VALUES LESS THAN (1985) ENGINE=Aria, -> PARTITION d4 VALUES LESS THAN (1990) ENGINE=Aria, -> PARTITION d5 VALUES LESS THAN (2000) ENGINE=Aria, -> PARTITION d6 VALUES LESS THAN (2005) ENGINE=Aria, -> PARTITION d7 VALUES LESS THAN MAXVALUE ENGINE=Aria -> ); Query OK, 0 rows affected (0.09 sec) MariaDB [tmp]> INSERT INTO t2 VALUES ('John', 'Doe', 1, '1982-06-23'); Query OK, 1 row affected (0.04 sec) MariaDB [tmp]> EXPLAIN PARTITIONS SELECT * FROM t2 WHERE dob = '1982-06-23'; +------+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+ | 1 | SIMPLE | t2 | d3 | system | NULL | NULL | NULL | NULL | 1 | | +------+-------------+-------+------------+--------+---------------+------+---------+------+------+-------+ 1 row in set (0.00 sec) MariaDB [tmp]> FLUSH TABLES; Query OK, 0 rows affected (0.02 sec) Shell 2 [gmontee@localhost ~]$ sudo strace -p `pidof mysqld` -ff -o `pidof mysqld`.out -e trace=open Process 2443 attached with 24 threads Shell 1 MariaDB [tmp]> SELECT * FROM t2 WHERE dob = '1982-06-23'; +-------+-------+-------------+------------+ | fname | lname | region_code | dob | +-------+-------+-------------+------------+ | John | Doe | 1 | 1982-06-23 | +-------+-------+-------------+------------+ 1 row in set (0.01 sec) Shell 2 ^CProcess 2443 detached Process 2489 detached Process 2490 detached Process 2491 detached Process 2492 detached Process 2493 detached Process 2494 detached Process 2495 detached Process 2496 detached Process 2497 detached Process 2498 detached Process 2552 detached Process 2553 detached Process 2554 detached Process 2555 detached Process 2556 detached Process 2567 detached Process 2568 detached Process 2569 detached Process 2570 detached Process 2571 detached Process 2574 detached Process 2576 detached Process 13965 detached [gmontee@localhost ~]$ cat 2443.out.13965 open( "./tmp/t2.frm" , O_RDONLY) = 16 open( "./tmp/t2.par" , O_RDONLY) = 16 open( "./tmp/t2.par" , O_RDONLY) = 16 open( "/ var /lib/mysql/tmp/t2#P#d0.MAI" , O_RDWR) = 16 open( "./tmp/t2#P#d0.MAD" , O_RDWR) = 17 open( "/ var /lib/mysql/tmp/t2#P#d1.MAI" , O_RDWR) = 20 open( "./tmp/t2#P#d1.MAD" , O_RDWR) = 21 open( "/ var /lib/mysql/tmp/t2#P#d2.MAI" , O_RDWR) = 22 open( "./tmp/t2#P#d2.MAD" , O_RDWR) = 23 open( "/ var /lib/mysql/tmp/t2#P#d3.MAI" , O_RDWR) = 24 open( "./tmp/t2#P#d3.MAD" , O_RDWR) = 25 open( "/ var /lib/mysql/tmp/t2#P#d4.MAI" , O_RDWR) = 26 open( "./tmp/t2#P#d4.MAD" , O_RDWR) = 27 open( "/ var /lib/mysql/tmp/t2#P#d5.MAI" , O_RDWR) = 28 open( "./tmp/t2#P#d5.MAD" , O_RDWR) = 29 open( "/ var /lib/mysql/tmp/t2#P#d6.MAI" , O_RDWR) = 30 open( "./tmp/t2#P#d6.MAD" , O_RDWR) = 31 open( "/ var /lib/mysql/tmp/t2#P#d7.MAI" , O_RDWR) = 32 open( "./tmp/t2#P#d7.MAD" , O_RDWR) = 33
              Hide
              elenst Elena Stepanova added a comment -

              Upstream bug was closed without fixing:

              "This is a design limitation, MyISAM have one file handle open for the .MYI file shared by all instances of one table, and one file handle open for each table instance for the .MYD file. And the ha_partition generic partitioning engine always open all partitions when opening a table/handler".. more details in #64498

              Show
              elenst Elena Stepanova added a comment - Upstream bug was closed without fixing: "This is a design limitation, MyISAM have one file handle open for the .MYI file shared by all instances of one table, and one file handle open for each table instance for the .MYD file. And the ha_partition generic partitioning engine always open all partitions when opening a table/handler".. more details in #64498
              Hide
              elenst Elena Stepanova added a comment -

              After discussing it with Geoff on IRC, closing as 'Won't fix'.
              Please feel free to re-open if anything changes.

              Show
              elenst Elena Stepanova added a comment - After discussing it with Geoff on IRC, closing as 'Won't fix'. Please feel free to re-open if anything changes.

                People

                • Assignee:
                  Unassigned
                  Reporter:
                  GeoffMontee Geoff Montee
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  2 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: