Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7139

Sporadic failure in innodb.innodb_corrupt_bit on P8

    Details

      Description

      http://buildbot.askmonty.org/buildbot/builders/bintar-rhel7-p8-debug/builds/12/steps/test/logs/stdio

      innodb.innodb_corrupt_bit 'innodb_plugin' w1 [ fail ]
              Test ended at 2014-11-19 05:33:19
      
      CURRENT_TEST: innodb.innodb_corrupt_bit
      mysqltest: At line 74: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'
      
      The result from queries just before the failure was:
      < snip >
      @@unique_checks
      0
      select @@innodb_change_buffering_debug;
      @@innodb_change_buffering_debug
      1
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1,z+1 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+10,z+10 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+20,z+20 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+50,z+50 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+100,z+100 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+200,z+200 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+400,z+400 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+800,z+800 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1600,z+1600 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+4000,z+4000 FROM corrupt_bit_test_ā;
      select count(*) from corrupt_bit_test_ā;
      count(*)
      1024
      CREATE INDEX idx3 ON corrupt_bit_test_ā(b, c);
      INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1);
      
      More results from queries before failure can be found in /home/buildbot/maria-slave/power8-vlp03-bintar-debug/build/mysql-test/var/1/log/innodb_corrupt_bit.log
      

      So far it happened twice, one time above on 10.0 / bintar-rhel7-p8-debug, and another one on 5.5 / bintar-trusty-p8-debug. Apparently there are no logs from the other one, but it can be seen in Cross Reference report:

      [Build Details]	5.5	4339	bintar-trusty-p8-debug	2014-10-31 15:00:29	172	nm	Normal run, no --ps-protocol	innodb.innodb_corrupt_bit	xtradb	
      innodb.innodb_corrupt_bit 'xtradb'       w4 [ fail ]
              Test ended at 2014-10-31 09:47:07
      
      CURRENT_TEST: innodb.innodb_corrupt_bit
      mysqltest: At line 72: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxā'
      
      The result from queries just before the failure was:
      < snip >
      @@unique_checks
      0
      select @@innodb_change_buffering_debug;
      @@innodb_change_buffering_debug
      1
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1,z+1 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+10,z+10 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+20,z+20 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+50,z+50 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+100,z+100 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+200,z+200 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+400,z+400 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+800,z+800 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+1600,z+1600 FROM corrupt_bit_test_ā;
      INSERT INTO corrupt_bit_test_ā SELECT 0,b,c+4000,z+4000 FROM corrupt_bit_test_ā;
      select count(*) from corrupt_bit_test_ā;
      count(*)
      1024
      CREATE INDEX idx3 ON corrupt_bit_test_ā(b, c);
      INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1);
      
      More results from queries before failure can be found in /var/lib/buildbot/maria-slave/power8-vlp04-debug/build/mysql-test/var/4/log/innodb_corrupt_bit.log
      

      I find it highly suspicious that both times the failure happened on P8. It's a plain test, there should be no race conditions involved, so there must be some other reason (build problems? environmental?). In any case, it's worth investigating.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              svoj Sergey Vojtovich added a comment - - edited

              The problem is reproducible on x86 using 5.5/r4376. But it takes hours to fail:

              while [ $? -eq 0 ]; do ./mtr --parallel=2 innodb.innodb_corrupt_bit --repeat=100 --mem; done
              

              On PPC64 the problem is a bit easier to reproduce. 5.5/r4421 fails within an hour with the following simplified test:

              --source include/have_innodb.inc
              --source include/have_debug.inc
              
              SET GLOBAL innodb_change_buffering_debug=1;
              SET unique_checks=0;
              
              CREATE TABLE t1(
                     a INT AUTO_INCREMENT PRIMARY KEY,
                     b CHAR(100),
                     c INT,
                     z INT,
                     UNIQUE(z, b))
              ENGINE=InnoDB;
              
              INSERT INTO t1 VALUES(0,'x',1, 1);
              INSERT INTO t1 SELECT 0,b,c+1,z+1 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+10,z+10 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+20,z+20 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+50,z+50 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+100,z+100 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+200,z+200 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+400,z+400 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+800,z+800 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+1600,z+1600 FROM t1;
              INSERT INTO t1 SELECT 0,b,c+4000,z+4000 FROM t1;
              INSERT INTO t1 VALUES(13000,'x',1,1);
              
              DROP TABLE t1;
              SET GLOBAL innodb_change_buffering_debug=0;
              

              The problem seem to be in guts of InnoDB btree implementation.

              Show
              svoj Sergey Vojtovich added a comment - - edited The problem is reproducible on x86 using 5.5/r4376. But it takes hours to fail: while [ $? -eq 0 ]; do ./mtr --parallel=2 innodb.innodb_corrupt_bit --repeat=100 --mem; done On PPC64 the problem is a bit easier to reproduce. 5.5/r4421 fails within an hour with the following simplified test: --source include/have_innodb.inc --source include/have_debug.inc SET GLOBAL innodb_change_buffering_debug=1; SET unique_checks=0; CREATE TABLE t1( a INT AUTO_INCREMENT PRIMARY KEY, b CHAR(100), c INT, z INT, UNIQUE(z, b)) ENGINE=InnoDB; INSERT INTO t1 VALUES(0,'x',1, 1); INSERT INTO t1 SELECT 0,b,c+1,z+1 FROM t1; INSERT INTO t1 SELECT 0,b,c+10,z+10 FROM t1; INSERT INTO t1 SELECT 0,b,c+20,z+20 FROM t1; INSERT INTO t1 SELECT 0,b,c+50,z+50 FROM t1; INSERT INTO t1 SELECT 0,b,c+100,z+100 FROM t1; INSERT INTO t1 SELECT 0,b,c+200,z+200 FROM t1; INSERT INTO t1 SELECT 0,b,c+400,z+400 FROM t1; INSERT INTO t1 SELECT 0,b,c+800,z+800 FROM t1; INSERT INTO t1 SELECT 0,b,c+1600,z+1600 FROM t1; INSERT INTO t1 SELECT 0,b,c+4000,z+4000 FROM t1; INSERT INTO t1 VALUES(13000,'x',1,1); DROP TABLE t1; SET GLOBAL innodb_change_buffering_debug=0; The problem seem to be in guts of InnoDB btree implementation.
              Hide
              jplindst Jan Lindström added a comment -

              Reproducible with 5.5 up to revision 4423 using my laptop:

              CURRENT_TEST: innodb.innodb_corrupt_bit
              mysqltest: At line 83: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'
              
              Show
              jplindst Jan Lindström added a comment - Reproducible with 5.5 up to revision 4423 using my laptop: CURRENT_TEST: innodb.innodb_corrupt_bit mysqltest: At line 83: query 'INSERT INTO corrupt_bit_test_ā VALUES(13000,'x',1,1)' failed: 1062: Duplicate entry '1-x' for key 'idxē'
              Hide
              jplindst Jan Lindström added a comment -

              I can repeat this only if I use debug build and innodb_change_buffering_debug=1; Test case intentionally tryes to insert duplicate key entry to index and still continue, there is some bug on insert buffering, not sure if that is visible only when this debug option is set.

              Show
              jplindst Jan Lindström added a comment - I can repeat this only if I use debug build and innodb_change_buffering_debug=1; Test case intentionally tryes to insert duplicate key entry to index and still continue, there is some bug on insert buffering, not sure if that is visible only when this debug option is set.
              Hide
              jplindst Jan Lindström added a comment -

              revno: 4424
              committer: Jan Lindström <jplindst@mariadb.org>
              branch nick: 5.5
              timestamp: Mon 2015-02-09 16:14:27 +0200
              message:
              MDEV-7139: Sporadic failure in innodb.innodb_corrupt_bit on P8

              The testcase fails randomly due to ibuf merge happening in the background.
              Fix not to do any merges with ibuf_debug enabled.

              Show
              jplindst Jan Lindström added a comment - revno: 4424 committer: Jan Lindström <jplindst@mariadb.org> branch nick: 5.5 timestamp: Mon 2015-02-09 16:14:27 +0200 message: MDEV-7139 : Sporadic failure in innodb.innodb_corrupt_bit on P8 The testcase fails randomly due to ibuf merge happening in the background. Fix not to do any merges with ibuf_debug enabled.

                People

                • Assignee:
                  jplindst Jan Lindström
                  Reporter:
                  elenst Elena Stepanova
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  3 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: