Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-7888

ANALYZE TABLE does wakeup_subsequent_commits(), causing wrong binlog order and parallel replication hang

    Details

      Description

      All threads seem to be waiting for some 'prior transaction' which is not there.

      MariaDB [(none)]> show processlist;
      +----+-------------+-----------------+-------+---------+------+--------------------------------------------------------------------------------+---------------------------------------------------+----------+
      | Id | User        | Host            | db    | Command | Time | State                                                                          | Info                                              | Progress |
      +----+-------------+-----------------+-------+---------+------+--------------------------------------------------------------------------------+---------------------------------------------------+----------+
      |  2 | system user |                 | NULL  | Connect | 6220 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  3 | system user |                 | NULL  | Connect | 6033 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  4 | system user |                 | NULL  | Connect | 6220 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  5 | system user |                 | NULL  | Connect | 6063 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  6 | system user |                 | NULL  | Connect | 6035 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  7 | system user |                 | NULL  | Connect | 6065 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  8 | system user |                 | NULL  | Connect | 6063 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      |  9 | system user |                 | NULL  | Connect | 6064 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 10 | system user |                 | NULL  | Connect | 6035 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 11 | system user |                 | NULL  | Connect | 6122 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 12 | system user |                 | NULL  | Connect | 6220 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 13 | system user |                 | NULL  | Connect | 6220 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 14 | system user |                 | NULL  | Connect | 6122 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 15 | system user |                 | NULL  | Connect | 6220 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 16 | system user |                 | NULL  | Connect | 6064 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 17 | system user |                 | NULL  | Connect | 6064 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 18 | system user |                 | NULL  | Connect | 6064 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 19 | system user |                 | NULL  | Connect | 6122 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 20 | system user |                 | NULL  | Connect | 6064 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 21 | system user |                 | NULL  | Connect | 6122 | Waiting for prior transaction to start commit before starting next transaction | NULL                                              |    0.000 |
      | 22 | root        | localhost:46784 | mysql | Query   | 5863 | Waiting for the slave SQL thread to advance position                           | SELECT MASTER_POS_WAIT('mysql-bin.000001',168906) |    0.000 |
      | 23 | system user |                 | NULL  | Connect | 6279 | Waiting for master to send event                                               | NULL                                              |    0.000 |
      | 24 | system user |                 | NULL  | Connect | 6279 | Slave has read all relay log; waiting for the slave I/O thread to update it    | NULL                                              |    0.000 |
      | 42 | root        | localhost:47011 | NULL  | Query   |    0 | init                                                                           | show processlist                                  |    0.000 |
      +----+-------------+-----------------+-------+---------+------+--------------------------------------------------------------------------------+---------------------------------------------------+----------+
      
      MariaDB [(none)]> show slave status \G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 127.0.0.1
                        Master_User: root
                        Master_Port: 10730
                      Connect_Retry: 1
                    Master_Log_File: mysql-bin.000001
                Read_Master_Log_Pos: 168906
                     Relay_Log_File: mysql-relay-bin.000002
                      Relay_Log_Pos: 127819
              Relay_Master_Log_File: mysql-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 127532
                    Relay_Log_Space: 169490
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 6063
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 1
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 0-1-483
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: optimistic
      1 row in set (0.00 sec)
      
      Fragment of the binlog, 100 lines before and 100 lines after the current Exec_Master_Log_Pos: 127532
      /*!*/;
      # at 126456
      #150331  0:48:15 server id 1  end_log_pos 126484 	Intvar
      SET INSERT_ID=60/*!*/;
      # at 126484
      #150331  0:48:15 server id 1  end_log_pos 126621 	Query	thread_id=43	exec_time=1	error_code=0
      SET TIMESTAMP=1427752095/*!*/;
      REPLACE LOW_PRIORITY INTO `table0_innodb_int_autoinc` (`pk`) VALUES (NULL)
      /*!*/;
      # at 126621
      #150331  0:48:15 server id 1  end_log_pos 126648 	Xid = 2409
      COMMIT/*!*/;
      # at 126648
      #150331  0:48:14 server id 1  end_log_pos 126686 	GTID 0-1-375
      /*!100001 SET @@session.gtid_seq_no=375*//*!*/;
      BEGIN
      /*!*/;
      # at 126686
      # at 126760
      #150331  0:48:14 server id 1  end_log_pos 126760 	Table_map: `test`.`table100_innodb_int_autoinc` mapped to number 126
      #150331  0:48:14 server id 1  end_log_pos 126839 	Delete_rows: table id 126 flags: STMT_END_F
      ### DELETE FROM `test`.`table100_innodb_int_autoinc`
      ### WHERE
      ###   @1=26
      ###   @2=138
      ###   @3=NULL
      ###   @4=NULL
      ###   @5='4'
      ### DELETE FROM `test`.`table100_innodb_int_autoinc`
      ### WHERE
      ###   @1=7
      ###   @2=4
      ###   @3='was'
      ###   @4=245
      ###   @5='3'
      ### DELETE FROM `test`.`table100_innodb_int_autoinc`
      ### WHERE
      ###   @1=9
      ###   @2=6
      ###   @3='yeah'
      ###   @4=5
      ###   @5='3'
      # at 126839
      #150331  0:48:14 server id 1  end_log_pos 126866 	Xid = 2357
      COMMIT/*!*/;
      # at 126866
      #150331  0:48:14 server id 1  end_log_pos 126904 	GTID 0-1-376
      /*!100001 SET @@session.gtid_seq_no=376*//*!*/;
      BEGIN
      /*!*/;
      # at 126904
      #150331  0:48:14 server id 1  end_log_pos 126932 	Intvar
      SET INSERT_ID=45/*!*/;
      # at 126932
      #150331  0:48:14 server id 1  end_log_pos 127113 	Query	thread_id=35	exec_time=2	error_code=0
      SET TIMESTAMP=1427752094/*!*/;
      REPLACE LOW_PRIORITY INTO `table1_innodb_int_autoinc` ( `col_int_key`, `col_char_12` ) VALUES ( 'm', 2 ), ( 166, 238 )
      /*!*/;
      # at 127113
      #150331  0:48:14 server id 1  end_log_pos 127140 	Xid = 2378
      COMMIT/*!*/;
      # at 127140
      #150331  0:48:14 server id 1  end_log_pos 127178 	GTID 0-1-377
      /*!100001 SET @@session.gtid_seq_no=377*//*!*/;
      # at 127178
      #150331  0:48:14 server id 1  end_log_pos 127336 	Query	thread_id=37	exec_time=2	error_code=0
      SET TIMESTAMP=1427752094/*!*/;
      ANALYZE TABLE `table0_innodb_key_pk_parts_2_int_autoinc` PERSISTENT FOR COLUMNS ALL INDEXES ALL
      /*!*/;
      # at 127336
      #150331  0:48:16 server id 1  end_log_pos 127374 	GTID 0-1-378
      /*!100001 SET @@session.gtid_seq_no=378*//*!*/;
      BEGIN
      /*!*/;
      # at 127374
      # at 127446
      #150331  0:48:16 server id 1  end_log_pos 127446 	Table_map: `test`.`table2_innodb_int_autoinc` mapped to number 123
      #150331  0:48:16 server id 1  end_log_pos 127505 	Delete_rows: table id 123 flags: STMT_END_F
      ### DELETE FROM `test`.`table2_innodb_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=69
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=NULL
      ### DELETE FROM `test`.`table2_innodb_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=70
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=NULL
      ### DELETE FROM `test`.`table2_innodb_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=74
      ###   @3=2
      ###   @4='1404370944'
      ###   @5=NULL
      # at 127505
      #150331  0:48:16 server id 1  end_log_pos 127532 	Xid = 2435
      COMMIT/*!*/;
      # at 127532
      #150331  0:48:14 server id 1  end_log_pos 127570 	GTID 0-1-379
      /*!100001 SET @@session.gtid_seq_no=379*//*!*/;
      BEGIN
      /*!*/;
      # at 127570
      #150331  0:48:14 server id 1  end_log_pos 127874 	Query	thread_id=34	exec_time=2	error_code=0
      SET TIMESTAMP=1427752094/*!*/;
      CREATE OR REPLACE TABLE `create_or_replace_tmp` (
        `col_int` int(11) DEFAULT NULL,
        `col_int_key` int(11) DEFAULT NULL,
        `col_char_12` char(12) DEFAULT NULL,
        `pk` int(11) NOT NULL DEFAULT '0',
        `col_char_12_key` char(12) DEFAULT NULL
      )
      /*!*/;
      # at 127874
      # at 127942
      #150331  0:48:14 server id 1  end_log_pos 127942 	Table_map: `test`.`create_or_replace_tmp` mapped to number 137
      #150331  0:48:14 server id 1  end_log_pos 128275 	Write_rows: table id 137 flags: STMT_END_F
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3='2147483647'
      ###   @4=11
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=9
      ###   @2=NULL
      ###   @3=NULL
      ###   @4=15
      ###   @5='77987840'
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=0
      ###   @2=NULL
      ###   @3=NULL
      ###   @4=17
      ###   @5='77987840'
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=9
      ###   @2=8
      ###   @3=NULL
      ###   @4=25
      ###   @5='182'
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3='13'
      ###   @4=43
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3='10'
      ###   @4=45
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=219
      ###   @3='211'
      ###   @4=47
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3='54'
      ###   @4=49
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3='55'
      ###   @4=51
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3='10'
      ###   @4=53
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=9
      ###   @2=NULL
      ###   @3=NULL
      ###   @4=55
      ###   @5=NULL
      ### INSERT INTO `test`.`create_or_replace_tmp`
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      

      Full binlog is attached as mysql-bin.000001.

      When I feed the same binlog to a slave via a master again, I get it stuck in a slightly different way:

      MariaDB [test]> show processlist;
      +----+-------------+-----------------+------+---------+------+--------------------------------------------------------------------------------+------------------+----------+
      | Id | User        | Host            | db   | Command | Time | State                                                                          | Info             | Progress |
      +----+-------------+-----------------+------+---------+------+--------------------------------------------------------------------------------+------------------+----------+
      |  2 | system user |                 | NULL | Connect | 9338 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  3 | system user |                 | NULL | Connect | 9331 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  4 | system user |                 | NULL | Connect | 9332 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  5 | system user |                 | NULL | Connect | 9331 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  6 | system user |                 | NULL | Connect | 9338 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  7 | system user |                 | NULL | Connect | 9333 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  8 | system user |                 | NULL | Connect | 9331 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      |  9 | system user |                 | NULL | Connect | 9333 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 10 | system user |                 | NULL | Connect | 9332 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 11 | system user |                 | NULL | Connect | 9333 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 12 | system user |                 | NULL | Connect | 9333 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 13 | system user |                 | NULL | Connect | 9342 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 14 | system user |                 | NULL | Connect | 9332 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 15 | system user |                 | NULL | Connect | 9332 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 16 | system user |                 | NULL | Connect | 9310 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 17 | system user |                 | NULL | Connect | 9331 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 18 | system user |                 | NULL | Connect | 9333 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 19 | system user |                 | NULL | Connect | 9330 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 20 | system user |                 | NULL | Connect | 9310 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 21 | system user |                 | NULL | Connect | 9310 | Waiting for prior transaction to start commit before starting next transaction | NULL             |    0.000 |
      | 23 | root        | localhost:49859 | test | Query   |    0 | init                                                                           | show processlist |    0.000 |
      | 24 | system user |                 | NULL | Connect |  292 | Waiting for master to send event                                               | NULL             |    0.000 |
      | 25 | system user |                 | NULL | Connect |  291 | Waiting for prior transaction to commit                                        | NULL             |    0.000 |
      +----+-------------+-----------------+------+---------+------+--------------------------------------------------------------------------------+------------------+----------+
      
      MariaDB [test]> show slave status \G
      *************************** 1. row ***************************
                     Slave_IO_State: Waiting for master to send event
                        Master_Host: 127.0.0.1
                        Master_User: root
                        Master_Port: 10730
                      Connect_Retry: 1
                    Master_Log_File: mysql-bin.000002
                Read_Master_Log_Pos: 326
                     Relay_Log_File: wheezy-64-relay-bin.000002
                      Relay_Log_Pos: 36518
              Relay_Master_Log_File: mysql-bin.000001
                   Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                    Replicate_Do_DB: 
                Replicate_Ignore_DB: 
                 Replicate_Do_Table: 
             Replicate_Ignore_Table: 
            Replicate_Wild_Do_Table: 
        Replicate_Wild_Ignore_Table: 
                         Last_Errno: 0
                         Last_Error: 
                       Skip_Counter: 0
                Exec_Master_Log_Pos: 36231
                    Relay_Log_Space: 169859
                    Until_Condition: None
                     Until_Log_File: 
                      Until_Log_Pos: 0
                 Master_SSL_Allowed: No
                 Master_SSL_CA_File: 
                 Master_SSL_CA_Path: 
                    Master_SSL_Cert: 
                  Master_SSL_Cipher: 
                     Master_SSL_Key: 
              Seconds_Behind_Master: 9331
      Master_SSL_Verify_Server_Cert: No
                      Last_IO_Errno: 0
                      Last_IO_Error: 
                     Last_SQL_Errno: 0
                     Last_SQL_Error: 
        Replicate_Ignore_Server_Ids: 
                   Master_Server_Id: 1
                     Master_SSL_Crl: 
                 Master_SSL_Crlpath: 
                         Using_Gtid: Slave_Pos
                        Gtid_IO_Pos: 0-1-483
            Replicate_Do_Domain_Ids: 
        Replicate_Ignore_Domain_Ids: 
                      Parallel_Mode: optimistic
      1 row in set (0.01 sec)
      
      Similar 201-line fragment from the binlog, for position 36231
      /*!*/;
      # at 35314
      #150331  0:44:33 server id 1  end_log_pos 35341 	Xid = 434
      COMMIT/*!*/;
      # at 35341
      #150331  0:44:54 server id 1  end_log_pos 35379 	GTID 0-1-104
      /*!100001 SET @@session.gtid_seq_no=104*//*!*/;
      BEGIN
      /*!*/;
      # at 35379
      # at 35451
      #150331  0:44:54 server id 1  end_log_pos 35451 	Table_map: `test`.`table1_innodb_int_autoinc` mapped to number 120
      #150331  0:44:54 server id 1  end_log_pos 35504 	Update_rows: table id 120 flags: STMT_END_F
      ### UPDATE `test`.`table1_innodb_int_autoinc`
      ### WHERE
      ###   @1='91'
      ###   @2=6
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=2147483647
      ### SET
      ###   @1='5'
      ###   @2=6
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=2147483647
      # at 35504
      #150331  0:44:54 server id 1  end_log_pos 35531 	Xid = 508
      COMMIT/*!*/;
      # at 35531
      #150331  0:44:54 server id 1  end_log_pos 35569 	GTID 0-1-105
      /*!100001 SET @@session.gtid_seq_no=105*//*!*/;
      BEGIN
      /*!*/;
      # at 35569
      #150331  0:44:54 server id 1  end_log_pos 35597 	Intvar
      SET INSERT_ID=8/*!*/;
      # at 35597
      #150331  0:44:54 server id 1  end_log_pos 35776 	Query	thread_id=46	exec_time=2	error_code=0
      SET TIMESTAMP=1427751894/*!*/;
      REPLACE LOW_PRIORITY INTO `table1_innodb_int_autoinc` ( `col_int_key`, `col_int` ) VALUES ( 'v', 'c' ), ( 'v', 106 )
      /*!*/;
      # at 35776
      #150331  0:44:54 server id 1  end_log_pos 35803 	Xid = 520
      COMMIT/*!*/;
      # at 35803
      #150331  0:44:54 server id 1  end_log_pos 35841 	GTID 0-1-106
      /*!100001 SET @@session.gtid_seq_no=106*//*!*/;
      BEGIN
      /*!*/;
      # at 35841
      # at 35913
      #150331  0:44:54 server id 1  end_log_pos 35913 	Table_map: `test`.`table1_innodb_int_autoinc` mapped to number 120
      #150331  0:44:54 server id 1  end_log_pos 36021 	Update_rows: table id 120 flags: STMT_END_F
      ### UPDATE `test`.`table1_innodb_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=8
      ###   @3=NULL
      ###   @4=0
      ###   @5=0
      ### SET
      ###   @1=NULL
      ###   @2=8
      ###   @3=NULL
      ###   @4=0
      ###   @5=2147483647
      ### UPDATE `test`.`table1_innodb_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=9
      ###   @3=NULL
      ###   @4=106
      ###   @5=0
      ### SET
      ###   @1=NULL
      ###   @2=9
      ###   @3=NULL
      ###   @4=106
      ###   @5=2147483647
      ### UPDATE `test`.`table1_innodb_int_autoinc`
      ### WHERE
      ###   @1='159'
      ###   @2=7
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=0
      ### SET
      ###   @1='159'
      ###   @2=7
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=2147483647
      # at 36021
      #150331  0:44:54 server id 1  end_log_pos 36048 	Xid = 519
      COMMIT/*!*/;
      # at 36048
      #150331  0:44:34 server id 1  end_log_pos 36086 	GTID 0-1-107
      /*!100001 SET @@session.gtid_seq_no=107*//*!*/;
      # at 36086
      #150331  0:44:34 server id 1  end_log_pos 36231 	Query	thread_id=37	exec_time=30	error_code=0
      SET TIMESTAMP=1427751874/*!*/;
      ANALYZE TABLE `table100_innodb_int_autoinc` PERSISTENT FOR COLUMNS ( ) INDEXES ALL
      /*!*/;
      # at 36231
      #150331  0:45:04 server id 1  end_log_pos 36269 	GTID 0-1-108
      /*!100001 SET @@session.gtid_seq_no=108*//*!*/;
      # at 36269
      #150331  0:45:04 server id 1  end_log_pos 36423 	Query	thread_id=34	exec_time=1	error_code=0
      SET TIMESTAMP=1427751904/*!*/;
      SET @@session.pseudo_thread_id=34/*!*/;
      CREATE OR REPLACE TEMPORARY TABLE `create_or_replace_tmp` LIKE `table10_innodb_int_autoinc`
      /*!*/;
      # at 36423
      #150331  0:45:05 server id 1  end_log_pos 36461 	GTID 0-1-109
      /*!100001 SET @@session.gtid_seq_no=109*//*!*/;
      BEGIN
      /*!*/;
      # at 36461
      #150331  0:44:30 server id 1  end_log_pos 36489 	Intvar
      SET INSERT_ID=11/*!*/;
      # at 36489
      #150331  0:44:30 server id 1  end_log_pos 36641 	Query	thread_id=44	exec_time=0	error_code=0
      SET TIMESTAMP=1427751870/*!*/;
      INSERT LOW_PRIORITY INTO `table10_innodb_key_pk_parts_2_int_autoinc` (`pk`) VALUES (NULL)
      /*!*/;
      # at 36641
      #150331  0:44:33 server id 1  end_log_pos 36669 	Intvar
      SET INSERT_ID=6/*!*/;
      # at 36669
      #150331  0:44:33 server id 1  end_log_pos 36820 	Query	thread_id=44	exec_time=0	error_code=0
      SET TIMESTAMP=1427751873/*!*/;
      INSERT LOW_PRIORITY INTO `table1_innodb_key_pk_parts_2_int_autoinc` (`pk`) VALUES (NULL)
      /*!*/;
      # at 36820
      # at 36894
      #150331  0:44:33 server id 1  end_log_pos 36894 	Table_map: `test`.`table100_innodb_int_autoinc` mapped to number 117
      #150331  0:44:33 server id 1  end_log_pos 37009 	Update_rows: table id 117 flags: STMT_END_F
      ### UPDATE `test`.`table100_innodb_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=19
      ###   @3='did'
      ###   @4=3
      ###   @5='eeeif'
      ### SET
      ###   @1=NULL
      ###   @2=19
      ###   @3='did'
      ###   @4=3
      ###   @5='j'
      ### UPDATE `test`.`table100_innodb_int_autoinc`
      ### WHERE
      ###   @1=9
      ###   @2=6
      ###   @3='yeah'
      ###   @4=1
      ###   @5='yuxleyeiieee'
      ### SET
      ###   @1=9
      ###   @2=6
      ###   @3='yeah'
      ###   @4=1
      ###   @5='j'
      # at 37009
      #150331  0:44:34 server id 1  end_log_pos 37037 	Intvar
      SET INSERT_ID=105/*!*/;
      # at 37037
      #150331  0:44:34 server id 1  end_log_pos 37182 	Query	thread_id=44	exec_time=0	error_code=0
      SET TIMESTAMP=1427751874/*!*/;
      INSERT LOW_PRIORITY IGNORE INTO `table100_innodb_int_autoinc` (`pk`) VALUES (NULL)
      /*!*/;
      # at 37182
      # at 37270
      #150331  0:44:34 server id 1  end_log_pos 37270 	Table_map: `test`.`table10_innodb_key_pk_parts_2_int_autoinc` mapped to number 121
      #150331  0:44:34 server id 1  end_log_pos 37482 	Update_rows: table id 121 flags: STMT_END_F
      ### UPDATE `test`.`table10_innodb_key_pk_parts_2_int_autoinc`
      ### WHERE
      ###   @1=NULL
      ###   @2=NULL
      ###   @3=NULL
      ###   @4=NULL
      ###   @5=11
      ### SET
      ###   @1=NULL
      ###   @2=NULL
      ###   @3=NULL
      ###   @4='183'
      ###   @5=11
      ### UPDATE `test`.`table10_innodb_key_pk_parts_2_int_autoinc`
      ### WHERE
      ###   @1='7'
      ###   @2=2
      ###   @3=NULL
      ###   @4='bgnesyuxle'
      ###   @5=6
      ### SET
      ###   @1='7'
      ###   @2=2
      ###   @3=NULL
      ###   @4='183'
      ###   @5=6
      ### UPDATE `test`.`table10_innodb_key_pk_parts_2_int_autoinc`
      ### WHERE
      ###   @1='7'
      

      Again, bad things seem to happen around CREATE OR REPLACE TABLE, so maybe it's related.

      To reproduce:

      • start master with the provided binary log, no special options needed
        My master command line options
        --server_id=1 --log-bin=mysql-bin 
        
      • start slave with parallel threads and optimistic mode
        My slave command line options
        --server_id=2 --slave-parallel-mode=optimistic --slave-parallel-threads=20
        
      • start replication (with or without GTID, doesn't matter)
      • wait until it gets stuck, for me it's always at Exec_Master_Log_Pos: 36231.

      Not reproducible without the optimistic mode.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              knielsen Kristian Nielsen added a comment -

              Ok, the problem here seems to be ANALYZE TABLE. It does ha_trans_commit()
              during its processing, which incorrectly makes the code think that the
              statement has completed and binlogged on the slave (which it has not). This
              then causes out-of-order binlogging and further grief.

              (The bug should be in 10.0 / "conservative" mode also, but probably harder to
              trigger).

              So a work-around is to not binlog ANALYZE TABLE, meanwhile I'll find a way to
              fix it properly...

              Show
              knielsen Kristian Nielsen added a comment - Ok, the problem here seems to be ANALYZE TABLE. It does ha_trans_commit() during its processing, which incorrectly makes the code think that the statement has completed and binlogged on the slave (which it has not). This then causes out-of-order binlogging and further grief. (The bug should be in 10.0 / "conservative" mode also, but probably harder to trigger). So a work-around is to not binlog ANALYZE TABLE, meanwhile I'll find a way to fix it properly...
              Hide
              elenst Elena Stepanova added a comment -

              Just keep in mind that ANALYZE TABLE, among other things, might affect persistent statistics. I'm not saying that not binlogging it is a bad idea (in fact, I was always wondering why it was binlogged, and we just recently had a request from Daniel Black for not doing it), but it can stop collecting persistent statistics on the slave in setups which now rely on it.

              Show
              elenst Elena Stepanova added a comment - Just keep in mind that ANALYZE TABLE, among other things, might affect persistent statistics. I'm not saying that not binlogging it is a bad idea (in fact, I was always wondering why it was binlogged, and we just recently had a request from Daniel Black for not doing it), but it can stop collecting persistent statistics on the slave in setups which now rely on it.
              Hide
              danblack Daniel Black added a comment - - edited

              clarifying, I was more after an option to binlog the results of the analyze table but not the command itself. Analyze table already has NO_WRITE_TO_BINLOG | LOCAL. (MDEV-7829)

              Show
              danblack Daniel Black added a comment - - edited clarifying, I was more after an option to binlog the results of the analyze table but not the command itself. Analyze table already has NO_WRITE_TO_BINLOG | LOCAL. ( MDEV-7829 )
              Show
              knielsen Kristian Nielsen added a comment - http://lists.askmonty.org/pipermail/commits/2015-April/007723.html
              Show
              knielsen Kristian Nielsen added a comment - 10.1-specific part: http://lists.askmonty.org/pipermail/commits/2015-April/007724.html
              Hide
              elenst Elena Stepanova added a comment -

              Interestingly, I got this problem a lot with optimistic mode, but haven't got it even one time with the conservative mode, otherwise the same flow and options. So, apparently there is something about the optimistic mode that seriously increases the probability.

              Show
              elenst Elena Stepanova added a comment - Interestingly, I got this problem a lot with optimistic mode, but haven't got it even one time with the conservative mode, otherwise the same flow and options. So, apparently there is something about the optimistic mode that seriously increases the probability.
              Hide
              knielsen Kristian Nielsen added a comment -

              Yes, looking more, I was not actually able to trigger the observed problem directly in 10.0 (or in 10.1 conservative mode).

              The logical problem exists in the code in 10.0 (and I have fixed it there). However, ANALYZE cannot actually be binlogged in the same group-commit as another event group, so it may not be possible to trigger it directly in conservative mode. (In the test case, I artificially simulate same group commit using DBUG injection).

              So I agree this particular hang may not be possible to get in 10.0/conservative. I think maybe another consequence of the bug could be observable in 10.1 / conservative: out-of-order binlogging (different binlog order on slave compared to master). But I haven't actually checked that this is possible, and if it is the window of opportunity for the race should be very small, so hard to trigger.

              Show
              knielsen Kristian Nielsen added a comment - Yes, looking more, I was not actually able to trigger the observed problem directly in 10.0 (or in 10.1 conservative mode). The logical problem exists in the code in 10.0 (and I have fixed it there). However, ANALYZE cannot actually be binlogged in the same group-commit as another event group, so it may not be possible to trigger it directly in conservative mode. (In the test case, I artificially simulate same group commit using DBUG injection). So I agree this particular hang may not be possible to get in 10.0/conservative. I think maybe another consequence of the bug could be observable in 10.1 / conservative: out-of-order binlogging (different binlog order on slave compared to master). But I haven't actually checked that this is possible, and if it is the window of opportunity for the race should be very small, so hard to trigger.

                People

                • Assignee:
                  knielsen Kristian Nielsen
                  Reporter:
                  elenst Elena Stepanova
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  2 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: