We're updating the issue view to help you get more done. 

Strange behaviour of OPTIMIZE TABLE with TokuDB [2x table size increase after usage]

Description

Hi.

I am using MariaDB for 2 years. I moved to MariaDB from Percona MySQL edition. Currently i am using it on 100+ production servers, and many of them are highloaded servers with 50M rows tables or more. Optimization is very important to me. At the moment i am using TokuDB as the primary database type.
I've got one strange behaviour with TokuDB. The strange thing is with OPTIMIZE TABLE. Shortly - it is related to database size. It got increased 2x after OPTIMIZE TABLE. But it's very very strange. Let me show you what's strange here with test. Please sorry myself for such a bad format of message and bad English. I am just tried to show this test step by step.

For test i am using 10.0.10-MariaDB-1~wheezy-log version of MySQL on Debian Wheezy 7.5 64-bit. MySQL settings is not important here, it's near default. TokuDB settings is here:

1 2 3 4 5 6 7 8 plugin-load=ha_tokudb tokudb_commit_sync=0 tokudb_fsync_log_period=1000 tokudb_lock_timeout=0 tokudb_write_status_frequency=10000000 tokudb_read_status_frequency=100000000 tokudb_fs_reserve_percent=1 tokudb_cache_size=14000M

For the test i've copy one of my production databases to new database "tester_testdb" with one table called "test_table" inside this database. This table has near 1 million rows and 2 keys. 1 - primary (id). 2 - pub_key(pub_key). It's MyISAM table.

An initial file size of this table is 110 megabytes.

1 110M tester_testdb

Structure:

1 2 3 4 5 6 CREATE TABLE IF NOT EXISTS `test_table` ( `id` int(10) unsigned NOT NULL, `pub_key` varchar(80) NOT NULL, PRIMARY KEY (`id`), KEY `pub_key` (`pub_key`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Just to be sure, i did OPTIMIZE TABLE on this MyISAM table. Size did not changed.

Then, let's convert this table to TokuDB. We'll use default zlib compression.

1 ALTER TABLE test_table ENGINE=TokuDB COMPRESSION=TokuDB_zlib
1 2 3 4 47M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 45M _tester_testdb_sql_61e7_1812_key_pub_key_ad88a6b_1_19_B_1.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

All seems to be OK here. We have near the same table size - summary ~93 mbytes.

Now, let's do optimize.

OPTIMIZE TABLE test_table

Look at the file system again.

1 2 3 4 63M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 61M _tester_testdb_sql_61e7_1812_key_pub_key_ad88a6b_1_19_B_1.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

Hmm? Table size is now increased to ~125 mbytes. Why???

Let's do optimize again.

OPTIMIZE TABLE test_table

1 2 3 4 79M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 61M _tester_testdb_sql_61e7_1812_key_pub_key_ad88a6b_1_19_B_1.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

Wow... we got another increase in table size. Why???

Let's do optimize again.

OPTIMIZE TABLE test_table

1 2 3 4 79M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 61M _tester_testdb_sql_61e7_1812_key_pub_key_ad88a6b_1_19_B_1.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

Ok. We reach the final destination. Database size should be stable now.

Let's then convert our secondary index to clustering index.

ALTER TABLE test_table DROP INDEX pub_key

1 2 3 79M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

Ok. Second step.

ALTER TABLE test_table ADD INDEX pub_key(pub_key) CLUSTERING=Yes

1 2 3 4 79M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 45M _tester_testdb_test_table_key_pub_key_ad9ba7c_3_19_B_0.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

All seems to be OK here. We have clustering index.

Let's do optimize.

OPTIMIZE TABLE test_table

1 2 3 4 79M _tester_testdb_sql_61e7_1812_main_ad88a6b_1_19_B_0.tokudb 61M _tester_testdb_test_table_key_pub_key_ad9ba7c_3_19_B_0.tokudb 64K _tester_testdb_sql_61e7_1812_status_ad88a69_1_19.tokudb 12K tester_testdb

Now it returns to higher value once again. So now the total size of our database and table is ~140 mbytes. But...with MyISAM we have 110 mbytes. That's the problem? I think, that here's something wrong with OPTIMIZE TABLE. I think so, because i can have more compacted table with just one more command.

ALTER TABLE test_table ENGINE=TokuDB COMPRESSION=TokuDB_zlib

1 2 3 4 33M _tester_testdb_sql_61e7_2102_key_pub_key_ada5337_3_19.tokudb 32M _tester_testdb_sql_61e7_2102_main_ada5337_2_19.tokudb 32K _tester_testdb_sql_61e7_2102_status_ada5337_1_19.tokudb 12K tester_testdb

Here i've just altered table once again. And table size dropped ~2x. That's really strange. This (~66 mbytes) should be the final and correct table size. But after table optimize i got 2x increase in table size and it's very very bad.

Please, try to repeat my experiment yourself and review is there any problem with OPTIMIZE TABLE command on TokuDB tables or not.

Thanks in advance for your time,

Alex.

Environment

Debian Wheezy 7.5 64-bit

Status

Assignee

Rich Prohaska

Reporter

ProfforgN

Labels

Fix versions

Affects versions

10.0.10

Priority

Major