Details
-
Type:
Bug
-
Status: Closed
-
Priority:
Major
-
Resolution: Fixed
-
Affects Version/s: 10.0.12
-
Fix Version/s: 10.0.14
-
Component/s: None
-
Labels:
Description
The problem is observed on binary logs attached to MDEV-6549.
- put the attached set of binary logs into the master datadir;
- start master server with log-bin=mysql-bin, other options can stay default;
- start slave with default options;
- start replication with MASTER_USE_GTID = current_pos (maybe GTID is not necessary in the scenario, I didn't try without it);
- after a while, replication fails here:
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 9306
Connect_Retry: 1
Master_Log_File: mysql-bin.002277
Read_Master_Log_Pos: 326
Relay_Log_File: wheezy-64-relay-bin.002025
Relay_Log_Pos: 613
Relay_Master_Log_File: mysql-bin.001011
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '15' for key 'PRIMARY'' on query. Default database: 'test1'. Query: 'INSERT INTO test1 . /* table10_int table0_int t2_temp_myisam_25855 */ table10_innodb_int ( `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` ) SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM test1 . /* t2_temp_innodb_25855 table0_int_autoinc table1_myisam_int_autoinc */ table10_innodb_int AS A WHERE col_tinyint BETWEEN -82 AND -82 + 2 /**/ AND ( `col_varchar_257_binary` IS NULL OR `col_varchar_257_binary` <> ROW_COUNT() ) UNION SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM ( SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`c
Skip_Counter: 0
Exec_Master_Log_Pos: 326
Relay_Log_Space: 8561745
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '15' for key 'PRIMARY'' on query. Default database: 'test1'. Query: 'INSERT INTO test1 . /* table10_int table0_int t2_temp_myisam_25855 */ table10_innodb_int ( `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` ) SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM test1 . /* t2_temp_innodb_25855 table0_int_autoinc table1_myisam_int_autoinc */ table10_innodb_int AS A WHERE col_tinyint BETWEEN -82 AND -82 + 2 /**/ AND ( `col_varchar_257_binary` IS NULL OR `col_varchar_257_binary` <> ROW_COUNT() ) UNION SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM ( SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`c
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-1-11874
The failure might be caused by an old replication bug or an erroneous test, it is not the point here.
Trying to run start slave again does not help, it still aborts at the same position (naturally).
Then,
- run set global slave_parallel_threads=1;
- run start slave
- it fails again, but note that the position has advanced:
MariaDB [test]> show slave status \G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: root Master_Port: 9306 Connect_Retry: 1 Master_Log_File: mysql-bin.002277 Read_Master_Log_Pos: 326 Relay_Log_File: wheezy-64-relay-bin.002027 Relay_Log_Pos: 613 Relay_Master_Log_File: mysql-bin.001012 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1062 Last_Error: Error 'Duplicate entry '15' for key 'PRIMARY'' on query. Default database: 'test1'. Query: 'INSERT INTO test1 . /* table10_int table0_int t2_temp_myisam_25855 */ table10_innodb_int ( `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` ) SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM test1 . /* t2_temp_innodb_25855 table0_int_autoinc table1_myisam_int_autoinc */ table10_innodb_int AS A WHERE col_tinyint BETWEEN -82 AND -82 + 2 /**/ AND ( `col_varchar_257_binary` IS NULL OR `col_varchar_257_binary` <> ROW_COUNT() ) UNION SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM ( SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`c Skip_Counter: 0 Exec_Master_Log_Pos: 326 Relay_Log_Space: 8562815 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1062 Last_SQL_Error: Error 'Duplicate entry '15' for key 'PRIMARY'' on query. Default database: 'test1'. Query: 'INSERT INTO test1 . /* table10_int table0_int t2_temp_myisam_25855 */ table10_innodb_int ( `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` ) SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM test1 . /* t2_temp_innodb_25855 table0_int_autoinc table1_myisam_int_autoinc */ table10_innodb_int AS A WHERE col_tinyint BETWEEN -82 AND -82 + 2 /**/ AND ( `col_varchar_257_binary` IS NULL OR `col_varchar_257_binary` <> ROW_COUNT() ) UNION SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`col_tinyint`,`col_tinyint_unsigned`,`col_varchar_257_binary`,`col_varchar_257_latin1`,`col_varchar_257_utf8`,`pk` FROM ( SELECT `col_bigint`,`col_bigint_unsigned`,`col_timestamp`,`c Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Current_Pos Gtid_IO_Pos: 0-1-11874 1 row in set (0.00 sec)
- run start slave yet another time. Now, the replication goes on:
MariaDB [test]> show slave status \G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: root Master_Port: 9306 Connect_Retry: 1 Master_Log_File: mysql-bin.002277 Read_Master_Log_Pos: 326 Relay_Log_File: wheezy-64-relay-bin.003535 Relay_Log_Pos: 2332 Relay_Master_Log_File: mysql-bin.001728 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 2045 Relay_Log_Space: 3094464 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 397789 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: Current_Pos Gtid_IO_Pos: 0-1-11874
I don't think it's supposed to happen.
The binary logs were produced using 10.0 tree on Power 8 machine, and weren't anyhow tampered with.
The test is executed on a usual Debian machine, with this server:
revision-id: sergii@pisem.net-20140805163520-tz0w14h8uju0ti9n revno: 4330 branch-nick: 10.0
Gliffy Diagrams
Attachments
Issue Links
- relates to
-
MDEV-6589 Incorrect relay log start position when restarting SQL thread after error in parallel replication
-
- Closed
-
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
I was able to reproduce as described. Looks like a real issue.
The difference in SHOW SLAVE STATUS after the first restart with parallel replication is the Relay_Master_Log_File:
< Relay_Master_Log_File: mysql-bin.001011
—
> Relay_Master_Log_File: mysql-bin.001012
I suspect this requires that the IO thread not be restarted to reproduce, which may explain why it wasn't caught before. I will investigate.