Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4322

Broken XID counting during binlog rotation

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.1
    • Fix Version/s: 10.0.2
    • Component/s: None
    • Labels:

      Description

      It looks like rotation of binlogs involves some broken logic involving . After binlog rotation MYSQL_BIN_LOG::do_checkpoint_request() is called. It calls ha_commit_checkpoint_request(), and from there for each engine that has commit_checkpoint_request function binlog_checkpoint_callback() is called. There mysql_bin_log.mark_xids_active() is called and corresponding mark_xid_done() is not called from anywhere.

      I don't understand why this problem is not exposed anywhere in the test suite but I was able to hit it like this:

      cmake . -DCMAKE_BUILD_TYPE=Debug
      make
      cd mysql-test/
      echo "--innodb" >> suite/sys_vars/t/rpl_max_binlog_size_func-master.opt
      ./mtr sys_vars.rpl_max_binlog_size_func

      After that I get "void MYSQL_BIN_LOG::cleanup(): Assertion `b->xid_count == 0' failed" during shutdown after the test.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment -

            perl ./mtr sys_vars.rpl_max_binlog_size_func -mysqld=-innodb

            #7 0x00007fc5c1010192 in _GI__assert_fail (assertion=0xe4eaa4 "b->xid_count == 0", file=0xe4dd48 "/data/bzr/10.0/sql/log.cc", line=2979, function=0xe51d90 "void MYSQL_BIN_LOG::cleanup()") at assert.c:103
            #8 0x00000000008a1bb3 in MYSQL_BIN_LOG::cleanup (this=0x15d8220) at /data/bzr/10.0/sql/log.cc:2979
            #9 0x0000000000563f23 in clean_up (print_message=true) at /data/bzr/10.0/sql/mysqld.cc:1906
            #10 0x0000000000563d42 in unireg_end () at /data/bzr/10.0/sql/mysqld.cc:1837
            #11 0x0000000000563c66 in kill_server (sig_ptr=0x0) at /data/bzr/10.0/sql/mysqld.cc:1765
            #12 0x0000000000563c81 in kill_server_thread (arg=0x7fc5b9424e48) at /data/bzr/10.0/sql/mysqld.cc:1788
            #13 0x0000000000cd9f64 in pfs_spawn_thread (arg=0x7fc588001340) at /data/bzr/10.0/storage/perfschema/pfs.cc:1800
            #14 0x00007fc5c1de0e9a in start_thread (arg=0x7fc5b9393700) at pthread_create.c:308
            #15 0x00007fc5c10d4cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            #16 0x0000000000000000 in ?? ()

            Show
            elenst Elena Stepanova added a comment - perl ./mtr sys_vars.rpl_max_binlog_size_func - mysqld= -innodb #7 0x00007fc5c1010192 in _ GI __assert_fail (assertion=0xe4eaa4 "b->xid_count == 0", file=0xe4dd48 "/data/bzr/10.0/sql/log.cc", line=2979, function=0xe51d90 "void MYSQL_BIN_LOG::cleanup()") at assert.c:103 #8 0x00000000008a1bb3 in MYSQL_BIN_LOG::cleanup (this=0x15d8220) at /data/bzr/10.0/sql/log.cc:2979 #9 0x0000000000563f23 in clean_up (print_message=true) at /data/bzr/10.0/sql/mysqld.cc:1906 #10 0x0000000000563d42 in unireg_end () at /data/bzr/10.0/sql/mysqld.cc:1837 #11 0x0000000000563c66 in kill_server (sig_ptr=0x0) at /data/bzr/10.0/sql/mysqld.cc:1765 #12 0x0000000000563c81 in kill_server_thread (arg=0x7fc5b9424e48) at /data/bzr/10.0/sql/mysqld.cc:1788 #13 0x0000000000cd9f64 in pfs_spawn_thread (arg=0x7fc588001340) at /data/bzr/10.0/storage/perfschema/pfs.cc:1800 #14 0x00007fc5c1de0e9a in start_thread (arg=0x7fc5b9393700) at pthread_create.c:308 #15 0x00007fc5c10d4cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #16 0x0000000000000000 in ?? ()
            Hide
            knielsen Kristian Nielsen added a comment -

            Thanks for catching this!

            > There mysql_bin_log.mark_xids_active() is called and corresponding
            > mark_xid_done() is not called from anywhere.

            It is called from binlog_background_thread(). This happens asynchroneously, in
            the background.

            I suspect the problem here is that we need to wait for the binlog background
            thread to have time to process any pending requests before we do binlog
            cleanup. At least if I add this at the end of the test case, it does not
            assert:

            --source include/wait_for_binlog_checkpoint.inc

            I'll take a closer look next week and find out what the real solution should
            be.

            Show
            knielsen Kristian Nielsen added a comment - Thanks for catching this! > There mysql_bin_log.mark_xids_active() is called and corresponding > mark_xid_done() is not called from anywhere. It is called from binlog_background_thread(). This happens asynchroneously, in the background. I suspect the problem here is that we need to wait for the binlog background thread to have time to process any pending requests before we do binlog cleanup. At least if I add this at the end of the test case, it does not assert: --source include/wait_for_binlog_checkpoint.inc I'll take a closer look next week and find out what the real solution should be.
            Hide
            knielsen Kristian Nielsen added a comment -

            Fix pushed to 10.0-base (will be later merged to 10.0).

            Show
            knielsen Kristian Nielsen added a comment - Fix pushed to 10.0-base (will be later merged to 10.0).
            Hide
            knielsen Kristian Nielsen added a comment -

            Here is the patch for the bug, including test case:

            http://lists.askmonty.org/pipermail/commits/2013-March/004483.html

            Show
            knielsen Kristian Nielsen added a comment - Here is the patch for the bug, including test case: http://lists.askmonty.org/pipermail/commits/2013-March/004483.html

              People

              • Assignee:
                knielsen Kristian Nielsen
                Reporter:
                pivanof Pavel Ivanov
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 4 hours
                  4h