Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-678

LP:881383 - Corrupt ARCHIVE tables failing to repair

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 5.1.67, 5.2.14, 5.3.12, 5.5.36, 10.0.9
    • Fix Version/s: 10.0, 5.5
    • Component/s: None

      Description

      We regularly have corrupt ARCHIVE engine tables that fail to repair:
      ================================

      MariaDB [(none)]> database;
      Database changed
      MariaDB [database]> repair table table;
      +-----------------------------+--------+----------+----------+
      | Table                       | Op     | Msg_type | Msg_text |
      +-----------------------------+--------+----------+----------+
      | database.table | repair | error    | Corrupt  |
      +-----------------------------+--------+----------+----------+
      1 row in set (3.92 sec)
      
      MariaDB [database]>
      

      ================================

      I'm unsure what causes the corruption, but a repair shouldn't be failing I presume?

      I'll upload a copy of an example table to FTP.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            serg Sergei Golubchik added a comment -

            Re: Corrupt ARCHIVE tables failing to repair
            on irc:
            <dreas> Basically if you kill MariaDB whilst it's writing to an archive table, you can throw it away afterwards

            Show
            serg Sergei Golubchik added a comment - Re: Corrupt ARCHIVE tables failing to repair on irc: <dreas> Basically if you kill MariaDB whilst it's writing to an archive table, you can throw it away afterwards
            Hide
            elenst Elena Stepanova added a comment -

            Re: Corrupt ARCHIVE tables failing to repair
            Hi Dreas,

            There seems to be a workaround which allows to fix the corrupted tables: run REPAIR TABLE <tablename> EXTENDED twice on the table. See the output below (`t` it's pretty much a copy of one of your tables).

            Both 'EXTENDED' and 'twice' is important. The second attempt of REPAIR .. EXTENDED takes much longer, and after that the table starts being usable again. I've seen the same effect on other tables you uploaded. Please try it out.

            MariaDB [test]> check table t;
            -----------------------------+

            Table Op Msg_type Msg_text

            -----------------------------+

            test.t check error Corrupt

            -----------------------------+
            1 row in set (11.52 sec)

            MariaDB [test]> repair table t;
            ------------------------------+

            Table Op Msg_type Msg_text

            ------------------------------+

            test.t repair error Corrupt

            ------------------------------+
            1 row in set (10.51 sec)

            MariaDB [test]> check table t;
            -----------------------------+

            Table Op Msg_type Msg_text

            -----------------------------+

            test.t check error Corrupt

            -----------------------------+
            1 row in set (1.65 sec)

            MariaDB [test]> insert into t select * from corrupt_archive_tables.t1;
            ERROR 1034 (HY000): Incorrect key file for table 't'; try to repair it

            MariaDB [test]> repair table t extended;
            ------------------------------+

            Table Op Msg_type Msg_text

            ------------------------------+

            test.t repair status OK

            ------------------------------+
            1 row in set (9.88 sec)

            MariaDB [test]> check table t;
            -----------------------------+

            Table Op Msg_type Msg_text

            -----------------------------+

            test.t check error Corrupt

            -----------------------------+
            1 row in set (6.59 sec)

            MariaDB [test]> repair table t extended;
            ------------------------------+

            Table Op Msg_type Msg_text

            ------------------------------+

            test.t repair status OK

            ------------------------------+
            1 row in set (40.35 sec)

            MariaDB [test]> check table t;
            -----------------------------+

            Table Op Msg_type Msg_text

            -----------------------------+

            test.t check status OK

            -----------------------------+
            1 row in set (6.60 sec)

            MariaDB [test]> insert into t select * from corrupt_archive_tables.t1;
            Query OK, 700000 rows affected (23.28 sec)
            Records: 700000 Duplicates: 0 Warnings: 0

            Show
            elenst Elena Stepanova added a comment - Re: Corrupt ARCHIVE tables failing to repair Hi Dreas, There seems to be a workaround which allows to fix the corrupted tables: run REPAIR TABLE <tablename> EXTENDED twice on the table. See the output below (`t` it's pretty much a copy of one of your tables). Both 'EXTENDED' and 'twice' is important. The second attempt of REPAIR .. EXTENDED takes much longer, and after that the table starts being usable again. I've seen the same effect on other tables you uploaded. Please try it out. MariaDB [test] > check table t; ------- ----- -------- ---------+ Table Op Msg_type Msg_text ------- ----- -------- ---------+ test.t check error Corrupt ------- ----- -------- ---------+ 1 row in set (11.52 sec) MariaDB [test] > repair table t; ------- ------ -------- ---------+ Table Op Msg_type Msg_text ------- ------ -------- ---------+ test.t repair error Corrupt ------- ------ -------- ---------+ 1 row in set (10.51 sec) MariaDB [test] > check table t; ------- ----- -------- ---------+ Table Op Msg_type Msg_text ------- ----- -------- ---------+ test.t check error Corrupt ------- ----- -------- ---------+ 1 row in set (1.65 sec) MariaDB [test] > insert into t select * from corrupt_archive_tables.t1; ERROR 1034 (HY000): Incorrect key file for table 't'; try to repair it MariaDB [test] > repair table t extended; ------- ------ -------- ---------+ Table Op Msg_type Msg_text ------- ------ -------- ---------+ test.t repair status OK ------- ------ -------- ---------+ 1 row in set (9.88 sec) MariaDB [test] > check table t; ------- ----- -------- ---------+ Table Op Msg_type Msg_text ------- ----- -------- ---------+ test.t check error Corrupt ------- ----- -------- ---------+ 1 row in set (6.59 sec) MariaDB [test] > repair table t extended; ------- ------ -------- ---------+ Table Op Msg_type Msg_text ------- ------ -------- ---------+ test.t repair status OK ------- ------ -------- ---------+ 1 row in set (40.35 sec) MariaDB [test] > check table t; ------- ----- -------- ---------+ Table Op Msg_type Msg_text ------- ----- -------- ---------+ test.t check status OK ------- ----- -------- ---------+ 1 row in set (6.60 sec) MariaDB [test] > insert into t select * from corrupt_archive_tables.t1; Query OK, 700000 rows affected (23.28 sec) Records: 700000 Duplicates: 0 Warnings: 0
            Hide
            elenst Elena Stepanova added a comment - - edited

            Re: Corrupt ARCHIVE tables failing to repair
            Hi Monty,

            from IRC:
            montywi: if you could just do a quick test of trying to repair an archive table that is 'too short' and check what happens...
            montywi: ie, does it abort repair or does it get back most of the rows

            One observation that seems to be an error rather than a limitation (I already mentioned it in the previous comment, will put in more details now, and with test SQL to reproduce instead of tables uploaded to our FTP).

            After an archive table gets corrupted due to server being killed while writing into the table, REPAIR TABLE does not help, it leaves the table in the same corrupted state, no matter how many times we try. However, REPAIR TABLE .. EXTENDED does help, but only if it's executed twice.
            I don't know why REPAIR does not work but EXTENDED does, much less why it requires two iterations, but it doesn't look right.
            Noticeably, the second REPAIR..EXTENDED iteration takes much longer, so something definitely goes on there.

            In the manual test below steps are described as comments, and actual SQL as commands, so you can copy-paste it into your client in two chunks (before and after server restart).

            I tried MariaDB 5.2.10 and MariaDB 5.5.21, got the same results on both.
            On MySQL 5.5.21, however, the first REPAIR (normal, not EXTENDED) returns OK, and all consequent commands too.

            1. Scenario to reproduce:
            # start server, no specific parameters needed;
            # create and populate a MyISAM table which we'll be reading from, as below: 
            
            # BEFORE server restart
            
            DROP TABLE IF EXISTS t1, t2;
            CREATE TABLE t1 ( f1 CHAR(16) ) ENGINE=MyISAM;
            INSERT INTO t1 VALUES ('abc'),('def'),('ghi'),('jkl');
            
            # create and start populating an archive table, as below:
            CREATE TABLE t2 ( f1 CHAR(16) ) ENGINE=ARCHIVE;
            INSERT INTO t2 SELECT a.* FROM t1 a, t1 b, t1 c, t1 d, t1 e, t1 f, t1 g, t1 h, t1 i, t1 j, t1 k, t1 l, t1 m;
            
            # End of "BEFORE server restart"
            
            ##########################
            # while the insert is running, kill the server with kill -9
            # (we insert ~65M rows, so there will be time)
            # start the server again (no complaints in error log about archive tables)
            ##########################
            
            # AFTER server restart:
            
            CHECK TABLE t2;
            # It says the table is corrupted.
            # You can run it several times if you wish, it's still corrupted.
            
            REPAIR TABLE t2;
            # Again, it says the table is corrupted, 
            # and you can run it several times, it's still corrupted.
            
            OPTIMIZE TABLE t2;
            # It says "Unknown - internal error 145 during operation"
            
            # 145 is "table marked as crashed", so it's not wrong. 
            # Running it several times doesn't help, either.
            
            REPAIR TABLE t2 EXTENDED;
            # It says OK, but if you run CHECK TABLE now, it's still corrupted:
            
            CHECK TABLE t2;
            # (says "corrupt")
            
            # Run repair .. .extended again (takes much longer):
            
            REPAIR TABLE t2 EXTENDED;
            # (says "OK")
            
            # And now CHECK TABLE suddenly says it's OK, too:
            
            CHECK TABLE t2;
            # (says OK)
            
            # And it's usable again:
            
            SELECT COUNT(*) FROM t2;
            # Returns some count>0;
            
            SELECT * FROM t2 LIMIT 1;
            # Returns a row
            
            Show
            elenst Elena Stepanova added a comment - - edited Re: Corrupt ARCHIVE tables failing to repair Hi Monty, from IRC: montywi: if you could just do a quick test of trying to repair an archive table that is 'too short' and check what happens... montywi: ie, does it abort repair or does it get back most of the rows One observation that seems to be an error rather than a limitation (I already mentioned it in the previous comment, will put in more details now, and with test SQL to reproduce instead of tables uploaded to our FTP). After an archive table gets corrupted due to server being killed while writing into the table, REPAIR TABLE does not help, it leaves the table in the same corrupted state, no matter how many times we try. However, REPAIR TABLE .. EXTENDED does help, but only if it's executed twice. I don't know why REPAIR does not work but EXTENDED does, much less why it requires two iterations, but it doesn't look right. Noticeably, the second REPAIR..EXTENDED iteration takes much longer, so something definitely goes on there. In the manual test below steps are described as comments, and actual SQL as commands, so you can copy-paste it into your client in two chunks (before and after server restart). I tried MariaDB 5.2.10 and MariaDB 5.5.21, got the same results on both. On MySQL 5.5.21, however, the first REPAIR (normal, not EXTENDED) returns OK, and all consequent commands too. Scenario to reproduce: # start server, no specific parameters needed; # create and populate a MyISAM table which we'll be reading from , as below: # BEFORE server restart DROP TABLE IF EXISTS t1, t2; CREATE TABLE t1 ( f1 CHAR(16) ) ENGINE=MyISAM; INSERT INTO t1 VALUES ('abc'),('def'),('ghi'),('jkl'); # create and start populating an archive table, as below: CREATE TABLE t2 ( f1 CHAR(16) ) ENGINE=ARCHIVE; INSERT INTO t2 SELECT a.* FROM t1 a, t1 b, t1 c, t1 d, t1 e, t1 f, t1 g, t1 h, t1 i, t1 j, t1 k, t1 l, t1 m; # End of "BEFORE server restart" ########################## # while the insert is running, kill the server with kill -9 # (we insert ~65M rows, so there will be time) # start the server again (no complaints in error log about archive tables) ########################## # AFTER server restart: CHECK TABLE t2; # It says the table is corrupted. # You can run it several times if you wish, it's still corrupted. REPAIR TABLE t2; # Again, it says the table is corrupted, # and you can run it several times, it's still corrupted. OPTIMIZE TABLE t2; # It says "Unknown - internal error 145 during operation" # 145 is "table marked as crashed" , so it's not wrong. # Running it several times doesn't help, either. REPAIR TABLE t2 EXTENDED; # It says OK, but if you run CHECK TABLE now, it's still corrupted: CHECK TABLE t2; # (says "corrupt" ) # Run repair .. .extended again (takes much longer): REPAIR TABLE t2 EXTENDED; # (says "OK" ) # And now CHECK TABLE suddenly says it's OK, too: CHECK TABLE t2; # (says OK) # And it's usable again: SELECT COUNT(*) FROM t2; # Returns some count>0; SELECT * FROM t2 LIMIT 1; # Returns a row
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 881383

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 881383
            Hide
            elenst Elena Stepanova added a comment -

            Also attached an MTR testcase mdev678.test, but depending on the version, sleep_time value (the first line) should be tuned. It's the time between issuing INSERT and killing the server, it needs to be adjusted so that some rows get already inserted into the table, but INSERT is not finished yet.

            Could also reproduce the problem on MySQL 5.1/5.5, but not on MySQL 5.6 – it just loses whatever was inserted, but technically REPAIR works.

            Show
            elenst Elena Stepanova added a comment - Also attached an MTR testcase mdev678.test, but depending on the version, sleep_time value (the first line) should be tuned. It's the time between issuing INSERT and killing the server, it needs to be adjusted so that some rows get already inserted into the table, but INSERT is not finished yet. Could also reproduce the problem on MySQL 5.1/5.5, but not on MySQL 5.6 – it just loses whatever was inserted, but technically REPAIR works.

              People

              • Assignee:
                monty Michael Widenius
                Reporter:
                dreasvandonselaar Dreas van Donselaar
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated: