Details
-
Type:
Bug
-
Status: Closed
-
Priority:
Critical
-
Resolution: Fixed
-
Affects Version/s: 10.0.10
-
Fix Version/s: 10.0.13
-
Component/s: None
-
Labels:
Description
The attached binlog was generated during a normal concurrent test, no tampering has been done.
In case it is relevant, the master was run with the following options:
mysqld --loose-slave_ddl_exec_mode=STRICT \ --loose-rpl_semi_sync_slave_enabled=0 \ --loose-slave_parallel_threads=12 \ --core-file \ --max-allowed-packet=128Mb \ --general-log \ --log-bin=mysql-bin \ --sql-mode=no_engine_substitution \ --binlog_commit_wait_count=10 \ --binlog_commit_wait_usec=1000000 \ --plugin-load-add=semisync_master \ --plugin-load-add=semisync_slave \ --loose-rpl_semi_sync_master_timeout=600 \ --log-output=FILE \ --ignore-builtin-innodb \ --plugin-load-add=ha_innodb.so \ --slave-skip-errors=1049,1305,1539,1505
Binlog format was set to 'row' globally at runtime.
Note: Although the master command line contains slave-related options, it is just due to uniformity; master was not configured and used as a slave at any point of time.
Slave was initially run with identical parameters, but none of them, except for slave_parallel_threads, makes any difference. I reproduced the failure a number of times by feeding the same error log from a master to a new clean slave, and was able to discard all non-defaultoptions except for slave_parallel_threads.
It does not happen every time upon re-using the binary log, on average on my machine the failure happens once in 4-5 attempts, but it can depend on the machine of course.
Replication stops in several places, normally between GTID 0-1-604 and 0-1-611, but at least once it happened later. Below are extracts from error logs upon different failures that I encountered.
The failure happens both with master_use_gtid=current_pos and master_use_gtid=no.
Additional observations:
When replication stops, the last position is reported in the error log; but in SHOW SLAVE STATUS Exec_Master_Log_Pos points at the end of the master binlog, it looks wrong. See SHOW SLAVE STATUS output in examples below.
Sometimes SHOW SLAVE STATUS reports "Last_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'" instead of the initial error.
In addition to the binary log, 3 error logs are attached:
- occurrence of the failure with GTID (slave_gtid.err)
- occurrence of the failure without GTID (slave_no_gtid.err)
- occurrence of the failure when the secondary error was reported by SHOW SLAVE STATUS (slave_gtid_wrong_error.err)
Examples of failure points:
end_log_pos 237673
stopped at position 237504; GTID position '0-1-611'
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 237504; GTID position '0-1-611'
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 237791
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1213
Last_Error: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1213
Last_SQL_Error: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-1556
end_log_pos 237673
stopped at position 235796; GTID position '0-1-604'
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 235796; GTID position '0-1-604'
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 237791
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1213
Last_Error: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1213
Last_SQL_Error: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-1556
end_log_pos 288531
stopped at position 288359; GTID position '0-1-738'
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table1_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 288531, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 288359; GTID position '0-1-738'
[ERROR] Slave SQL: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT', Internal MariaDB error code: 1964
MariaDB [test]> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 288646
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1964
Last_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1964
Last_SQL_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-1556
end_log_pos 237435
stopped at position 237261; GTID position '0-1-610'
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237435, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 237261; GTID position '0-1-610'
[ERROR] Slave SQL: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT', Internal MariaDB error code: 1964
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 237548
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1964
Last_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1964
Last_SQL_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-1556
end_log_pos 237435
stopped at position 236413; GTID position '0-1-606'
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237435, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 236413; GTID position '0-1-606'
[ERROR] Slave SQL: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT', Internal MariaDB error code: 1964
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 237548
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1964
Last_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1964
Last_SQL_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-1556
end_log_pos 237673
stopped at position 237504 (GTID disabled)
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 237504
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 237791
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1213
Last_Error: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1213
Last_SQL_Error: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237673
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
end_log_pos 237435
stopped at position 236836; GTID position '0-1-608'
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on table test.table10_key_pk_parts_2_int_autoinc; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.000001, end_log_pos 237435, Internal MariaDB error code: 1213
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 236836; GTID position '0-1-608'
[ERROR] Slave SQL: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT', Internal MariaDB error code: 1964
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.000003
Relay_Log_Pos: 237548
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1964
Last_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 583505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1964
Last_SQL_Error: Error 'Commit failed due to failure of an earlier commit on which this one depends' on query. Default database: 'test'. Query: 'COMMIT'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-1556
Gliffy Diagrams
Attachments
Issue Links
- relates to
-
MDEV-6067 Partitioned table DML sometimes binlogged without XID event
-
- Closed
-
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
> When replication stops, the last position is reported in the error log; but
> in SHOW SLAVE STATUS Exec_Master_Log_Pos points at the end of the master
> binlog, it looks wrong. See SHOW SLAVE STATUS output in examples below.
This particular issue is probably the same as
MDEV-5938, which should be fixednow.