MariaDB Server / MDEV-8104

Slave crashes in TABLE::init on replicating SBR flow involving temporary tables

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 10.1
    • Fix Version/s: 10.1.5
    • Component/s: Replication
    • Labels: None

      Description

      Note: It might be related to MDEV-8075 or even be a duplicate of it.

      Stack trace from 10.1 commit 539b3ca87
      #3  <signal handler called>
      #4  0x00007f5374b707f0 in TABLE::init (this=0x7f534ba42070, thd=0x7f5347823070, tl=0x7f534785f1d8) at 10.1/sql/table.cc:4054
      #5  0x00007f5374a22017 in open_temporary_table (thd=0x7f5347823070, tl=0x7f534785f1d8) at 10.1/sql/sql_base.cc:5982
      #6  0x00007f5374a220d5 in open_temporary_tables (thd=0x7f5347823070, tl_list=0x7f534785f1d8) at 10.1/sql/sql_base.cc:6017
      #7  0x00007f5374a85564 in mysql_execute_command (thd=0x7f5347823070) at 10.1/sql/sql_parse.cc:3836
      #8  0x00007f5374a8ffaa in mysql_parse (thd=0x7f5347823070, rawbuf=0x7f5346ab6579 "/* thread_id=19 cnt=121 */ INSERT INTO `table0_innodb_int_autoinc` ( `col_char_12`, `col_char_12_key` ) VALUES ( 12, 7 ), ( 'i', 211 )", length=134, parser_state=0x7f5370ffca60) at 10.1/sql/sql_parse.cc:7165
      #9  0x00007f5374da5c8c in Query_log_event::do_apply_event (this=0x7f5346ae6870, rgi=0x7f5346b00000, query_arg=0x7f5346ab6579 "/* thread_id=19 cnt=121 */ INSERT INTO `table0_innodb_int_autoinc` ( `col_char_12`, `col_char_12_key` ) VALUES ( 12, 7 ), ( 'i', 211 )", q_len_arg=134) at 10.1/sql/log_event.cc:4287
      #10 0x00007f5374da4f11 in Query_log_event::do_apply_event (this=0x7f5346ae6870, rgi=0x7f5346b00000) at 10.1/sql/log_event.cc:4013
      #11 0x00007f53749ed5da in Log_event::apply_event (this=0x7f5346ae6870, rgi=0x7f5346b00000) at 10.1/sql/log_event.h:1347
      #12 0x00007f53749e314f in apply_event_and_update_pos (ev=0x7f5346ae6870, thd=0x7f5347823070, rgi=0x7f5346b00000, rpt=0x7f534eec0dc0) at 10.1/sql/slave.cc:3274
      #13 0x00007f5374c1a043 in rpt_handle_event (qev=0x7f5346af3170, rpt=0x7f534eec0dc0) at 10.1/sql/rpl_parallel.cc:49
      #14 0x00007f5374c1c2ec in handle_rpl_parallel_thread (arg=0x7f534eec0dc0) at 10.1/sql/rpl_parallel.cc:942
      #15 0x00007f537416ab50 in start_thread (arg=<optimized out>) at pthread_create.c:304
      #16 0x00007f53721ff95d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
      
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f5346ab6579): /* thread_id=19 cnt=121 */ INSERT INTO `table0_innodb_int_autoinc` ( `col_char_12`, `col_char_12_key` ) VALUES ( 12, 7 ), ( 'i', 211 )
      Connection ID (thread ID): 15
      Status: NOT_KILLED
      

      To reproduce:

      • start master with default options and --log-bin=mysql-bin, using the attached mysql-bin.* files;
      • start slave with --slave-parallel-mode=optimistic --slave-parallel-threads=20;
      • start replication;
      • wait.

      mysql.log is the master general log that corresponds to the binlogs (to see what was happening).

      All queries except the initial ones carry a comment of the form /* thread_id=<thread_id> cnt=<query no> */, where thread_id is the connection id and query no is the query counter within that connection. This should make it easier to find the guilty query in the logs.
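      The tagging convention above lends itself to a small lookup script. The following is a minimal sketch, not part of the original report: the helper names (`iter_tagged`, `find_query`) are hypothetical, and the only assumption about the log is that each tagged query keeps its `/* thread_id=N cnt=M */` comment on one line. Nothing else about the general log layout is assumed.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Matches the tagging comment from the report, e.g. /* thread_id=19 cnt=121 */
TAG_RE = re.compile(r"/\*\s*thread_id=(\d+)\s+cnt=(\d+)\s*\*/")


class TaggedQuery(NamedTuple):
    thread_id: int  # connection id taken from the comment
    cnt: int        # per-connection query counter taken from the comment
    line: str       # the full log line, without the trailing newline


def iter_tagged(lines: Iterable[str]) -> Iterator[TaggedQuery]:
    """Yield every line that carries a thread_id/cnt comment."""
    for line in lines:
        match = TAG_RE.search(line)
        if match:
            yield TaggedQuery(int(match.group(1)), int(match.group(2)),
                              line.rstrip("\n"))


def find_query(lines: Iterable[str], thread_id: int,
               cnt: int) -> Optional[TaggedQuery]:
    """Return the first line tagged with the given thread_id and cnt."""
    for query in iter_tagged(lines):
        if query.thread_id == thread_id and query.cnt == cnt:
            return query
    return None
```

      To look up the query from the stack trace, one could open the attached log with `open("mysql.log", errors="replace")` and call `find_query(f, 19, 121)`; filtering `iter_tagged` by `thread_id` alone lists everything that connection ran, in log order.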


          Attachments

          1. mysql.log
            203 kB
          2. mysql-bin.000001
            242 kB
          3. mysql-bin.index
            0.0 kB
          4. mysql-bin.state
            0.0 kB

            Activity

            knielsen Kristian Nielsen added a comment -

            I checked, and yes, it seems to be a duplicate of MDEV-8075.
            The binlog has a DROP TEMPORARY TABLE that is not marked as DDL, and then a
            following transaction tries to access a temporary table that is being closed
            in parallel. This causes the crash.

            The patch in MDEV-8075 should fix this crash as well.

            [Note that there seems to be another problem here. The quitting transaction
            binlogs two things: a DROP TEMPORARY TABLE and a BEGIN; ... ROLLBACK
            transaction. Surely these are logged in the wrong order: the DROP TABLE
            should be after the ROLLBACK-terminated incomplete transaction. But this
            seems unrelated to parallel replication.]


              People

              • Assignee: knielsen Kristian Nielsen
              • Reporter: elenst Elena Stepanova
              • Votes: 0
              • Watchers: 1
