  1. MariaDB Server
  2. MDEV-6529

EITS ANALYZE uses disk space inefficiently for VARCHAR columns

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.0.12
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Filing this https://mariadb.atlassian.net/browse/MDEV-6181?focusedCommentId=57209&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-57209

      as a separate issue:

      Motivated by CSC#7897, I've tried to see how much space ANALYZE can use. (So far my understanding was that max. temp dir usage would be roughly equal to the size of the table w/o indexes).

      Let's try this:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
      
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
      
      create table t1 (pk int primary key, col1 varchar(100)) charset=utf8;
      insert into t1 select A.a+B.a*1000, concat('val-', A.a+B.a*1000) from one_k A, one_k B;
      

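      The table has 1M rows. As is typical, the values in the VARCHAR(100)
      column are far shorter than the 300-byte maximum (100 characters x 3
      bytes in utf8): they run from 'val-0' to 'val-999999', i.e. 5 to 10
      bytes each.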

      The table on disk:

       -rw-rw---- 1 psergey psergey  44M Aug  1 23:09 t1.ibd
      

      Now,

      analyze table t1 persistent for all;
      
        Breakpoint 4, unique_write_to_file_with_count (key=0x7ffebf2ce0a0 "\005", count=1, unique=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:53
      (gdb) print key
        $7 = (uchar *) 0x7ffebf2ce0a0 "\005"
      (gdb)  
      (gdb) p unique->size
        $8 = 302
      

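      It's writing 302 bytes per value: the full unpacked length of the column (300 bytes of utf8 data plus, presumably, 2 length bytes), not the actual length of the value.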
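      A minimal sketch of where the 302 comes from, written in Python purely as arithmetic. It assumes the usual VARCHAR layout (max characters times max bytes per character, plus 1 length byte if the data part is under 256 bytes, otherwise 2). The function names are made up for illustration and are not server code.

```python
# Sketch: fixed-width (unpacked) size of a VARCHAR value vs. its real size.
# Assumption: data part = max_chars * max_bytes_per_char, prefixed by
# 1 length byte when the data part is under 256 bytes, otherwise 2.

def unpacked_length(max_chars: int, max_bytes_per_char: int) -> int:
    """Bytes used when the column is stored at its full declared width."""
    data_bytes = max_chars * max_bytes_per_char
    length_bytes = 1 if data_bytes < 256 else 2
    return data_bytes + length_bytes


def actual_length(value: str) -> int:
    """Bytes the value itself needs when encoded as UTF-8."""
    return len(value.encode("utf-8"))


fixed = unpacked_length(100, 3)          # VARCHAR(100), utf8 (3 bytes/char)
longest = actual_length("val-999999")    # longest value in t1.col1
print(fixed, longest)
```

      Under these assumptions the fixed-width figure matches the unique->size value seen in gdb, while the longest value in the test table needs only a small fraction of it.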

      In total, we get:

        Breakpoint 3, Unique::flush (this=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:379
        $46 = 281,734,200
        Breakpoint 3, Unique::flush (this=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:379
        $48 = 297,386,100
      ...
        Breakpoint 3, Unique::flush (this=0x7ffebf047a30) at /home/psergey/dev2/10.0/sql/uniques.cc:379
        $47 = 4,194,304
      

      About 300M bytes (roughly 286 MiB), i.e. ANALYZE may require much more temp space than the table itself occupies (44M).
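      As a rough cross-check of that total, the following Python sketch compares fixed-width storage against the bytes the values actually need. It assumes one 302-byte element per distinct value (all 1M values in col1 are distinct) and ignores the per-element count and any other overhead, so it is an estimate rather than a reproduction of the gdb numbers.

```python
# Sketch: estimated temp-file size for 1M distinct values of t1.col1.
# Assumptions: every value is written at the fixed width of 302 bytes
# (unique->size in the gdb session); counters and other overhead ignored.

ROWS = 1_000_000
FIXED_WIDTH = 302


def fixed_width_total(rows: int, width: int) -> int:
    """Total bytes if each value is written at the full unpacked width."""
    return rows * width


def actual_total(rows: int) -> int:
    """Total bytes of the values 'val-0' .. 'val-<rows-1>' themselves."""
    return sum(len(f"val-{i}") for i in range(rows))


fixed = fixed_width_total(ROWS, FIXED_WIDTH)
actual = actual_total(ROWS)

print(f"fixed-width: {fixed:,} bytes ({fixed / 2**20:.0f} MiB)")
print(f"actual data: {actual:,} bytes ({actual / 2**20:.1f} MiB)")
print(f"ratio:       {fixed / actual:.1f}x")
```

      The estimate lands close to the figures printed at Unique::flush, and suggests that writing only the used bytes of each value would cut temp space for this table by well over an order of magnitude.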


                People

                • Assignee:
                  igor Igor Babaev
                  Reporter:
                  psergey Sergei Petrunia