MariaDB Server / MDEV-6181

EITS could eat all tmpdir space and hang

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.10
    • Fix Version/s: 10.0.13
    • Component/s: None
    • Labels:
      None

      Description

      Hi,

While testing the EITS feature, I tried to analyze a 24GB table with use_stat_tables='preferably'. It ended up filling tmpdir completely, without any warning from MariaDB (and with a stuck ANALYZE TABLE command).

I would have expected MariaDB to respect the tmp_table_size value while trying to compute the statistics?

      Thanks and regards,
      Jocelyn Fournier

Activity

              elenst Elena Stepanova added a comment - - edited

So, with MyISAM I got similar results. The machine is not benchmark-tuned, so not only is it slow, but results can deviate a lot for different reasons. Still, the trend holds:

Rows            Big tmpdir    Small tmpdir
200,000         17.06 sec     9.70 sec
1,000,000       78.35 sec     723.99 sec
5,000,000       901.04 sec    6610.53 sec

The same 64 disk-space errors appeared in each case on the small tmpdir.

              elenst Elena Stepanova added a comment - - edited

Based on IRC discussion, I ran the tests from a different angle.
The table has the same structure as above, but uses MyISAM, to rule out TokuDB specifics.
All runs below are on a 2,000,000-row dataset.
The rows are inserted in two different ways.
              INSERT1:
              insert t1 select seq, repeat(seq, 1000/length(seq)), date'2001-02-03' + interval seq second, seq % 20, seq, seq % 1000, 60 + seq % 25, seq % 100, repeat(seq, 1000/length(seq)), seq % 1000 from seq_1_to_2000000
              INSERT2:
              insert into t3 select seq, concat('http://my.url',seq), now(), seq, seq, seq, seq, seq, concat('script',seq), seq from seq_1_to_2000000

So, while the INSERTs look similar, they produce essentially different data layouts.

The tests were run on tmpdirs with three different amounts of free space: 16 GB, 8 GB, and 300 MB.
With 16 GB, ANALYZE passes in both cases. At most it takes ~53% of tmpdir space, so slightly above 8 GB.
With 8 GB and 300 MB, it fails with the same infamous 64 disk-space errors.
              Execution time:

           16 GB         8 GB          300 MB
INSERT1    332.07 sec    206.35 sec    920.29 sec
INSERT2    314.79 sec    172.25 sec    2187.84 sec

I believe the difference at 8 GB can be ignored; it may well be random fluctuation. The difference at 300 MB is noticeable, though, both horizontally and vertically. It is not random; I double-checked.

              psergey Sergei Petrunia added a comment -

Motivated by CSC#7897, I've tried to see how much space ANALYZE can use. (So far my understanding was that the maximum temp dir usage would be roughly equal to the size of the table without indexes.)

              Let's try this:

              create table t1 (pk int primary key, col1 varchar(100)) charset=utf8;
              insert into t1 select A.a+B.a*1000, concat('val-', A.a+B.a*1000) from one_k A, one_k B;
              

The table has 1M rows. Values in the VARCHAR(100) column occupy less than
300 bytes each (which is typical).

              The table on disk:

               -rw-rw---- 1 psergey psergey  44M Aug  1 23:09 t1.ibd
              

              Now,

              analyze table t1 persistent for all;
              
                Breakpoint 4, unique_write_to_file_with_count (key=0x7ffebf2ce0a0 "\005", count=1, unique=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:53
              (gdb) print key
                $7 = (uchar *) 0x7ffebf2ce0a0 "\005"
              (gdb)  
              (gdb) p unique->size
                $8 = 302
              

It's writing ~300 bytes per key: the unpacked length of the utf8 VARCHAR(100) column.

              In total, we get:

                Breakpoint 3, Unique::flush (this=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:379
                $46 = 281,734,200
                Breakpoint 3, Unique::flush (this=0x7ffebf047ef0) at /home/psergey/dev2/10.0/sql/uniques.cc:379
                $48 = 297,386,100
              ...
                Breakpoint 3, Unique::flush (this=0x7ffebf047a30) at /home/psergey/dev2/10.0/sql/uniques.cc:379
                $47 = 4,194,304
              

That's roughly 300 MB (286 MiB), i.e. ANALYZE may require much more space than is occupied by the table itself (44M).
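The back-of-the-envelope arithmetic behind that number can be sketched as follows, assuming (per the gdb session above) that every row writes one key at the full unpacked width. This is an illustrative estimate, not the server's actual accounting code:

```python
# Rough estimate of Unique's tmpdir usage for ANALYZE on t1.
# Assumption: each of the 1M rows emits one key of unique->size = 302
# bytes, i.e. the unpacked utf8 VARCHAR(100) (100 chars * 3 bytes per
# char = 300 bytes) plus a small fixed overhead.
rows = 1_000_000
char_len, bytes_per_char = 100, 3        # VARCHAR(100), utf8
unpacked = char_len * bytes_per_char     # 300 bytes per value
key_size = 302                           # unique->size observed in gdb

total_bytes = rows * key_size
print(f"{total_bytes / 2**20:.0f} MiB")  # ~288 MiB, vs. the 44M .ibd file
```

The gap comes from writing fixed, unpacked key widths rather than the short actual values ('val-1', 'val-2', ...) stored in the table.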

              psergey Sergei Petrunia added a comment - - edited

If we can't quickly fix ANALYZE to use less disk space, we could have it refuse to run when there is a risk of running out of disk space. The thing is, the current way ANALYZE uses disk space is inefficient, but it is very predictable.

POSIX.1-2001 provides a function to check free disk space, statvfs(); TokuDB uses it. It seems the my_XXXX portability layer doesn't have a function to check for free disk space.
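For illustration, such a pre-flight check could look like the minimal Python sketch below, using os.statvfs (Python's wrapper over the POSIX call). The guard function and its name are hypothetical, not the actual MariaDB code:

```python
import os

def free_tmpdir_bytes(path="/tmp"):
    """Free bytes available to unprivileged processes on the
    filesystem holding `path`, via POSIX statvfs."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize  # available blocks * fragment size

def analyze_would_fit(predicted_bytes, path="/tmp"):
    """Hypothetical guard: True if the predicted tmp usage
    (e.g. row count * unpacked key size) fits in free space."""
    return predicted_bytes <= free_tmpdir_bytes(path)
```

Since ANALYZE's disk usage is predictable, a server-side check along these lines could refuse to start rather than fill tmpdir and fail mid-way.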

              serg Sergei Golubchik added a comment - - edited

The "EITS could eat all tmpdir space" part is not fixed; the "hang" part is.


People

• Assignee: serg Sergei Golubchik
• Reporter: jocel1 jocelyn fournier
• Votes: 0
• Watchers: 5


Time Tracking

• Original Estimate: Not Specified
• Remaining Estimate: 0m
• Time Spent: 1h