Details
-
Type:
Task
-
Status: Stalled
-
Priority:
Major
-
Resolution: Unresolved
-
Fix Version/s: 10.0
-
Component/s: Replication
-
Labels:None
Description
After thinking I was well on the way to recovering a slave:
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.006189
Read_Master_Log_Pos: 50082470
Relay_Log_File: mysqld-relay-bin.001144
Relay_Log_Pos: 104857605
Relay_Master_Log_File: mysql-bin.006105
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 9258445
Relay_Log_Space: 8480309601
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 963719
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 12022
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
1 row in set (0.00 sec)
(interval < 30 sec)
stop slave;
set global sync_binlog=0, global innodb_flush_log_at_trx_commit=0
start slave;
Master_Host: 192.168.179.103
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.006189
Read_Master_Log_Pos: 54032040
Relay_Log_File: mysqld-relay-bin.001144
Relay_Log_Pos: 104857605
Relay_Master_Log_File: mysql-bin.006105
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 61589175
Relay_Log_Space: 8484259756
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1594
Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 12022
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Looking at the relay file:
ls -al mysqld-relay-bin.001144
-rw-rw---- 1 mysql mysql 104857705 Dec 19 03:23 mysqld-relay-bin.001144
mysqlbinlog --start-position 104857605 mysqld-relay-bin.001144 | more
ERROR: /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
Error in Log_event::read_log_event(): 'Event too big', data_len: 1163019592, event_type: 48
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
ERROR: DELIMITER /*!*/;
Could not read entry at offset 104857605: Error in log format or read error.# at 4
#141215 0:30:57 server id 12025 end_log_pos 248 Start: binlog v 4, server v 10.0.15-MariaDB-1~wheezy-log created 141215 0:30:57
BINLOG '
wSuOVA/5LgAA9AAAAPgAAAAAAAQAMTAuMC4xNS1NYXJpYURCLTF+d2...
AAAAAAAAAAAEEwQAj+72WQ==
'/*!*/;
# at 291
#141218 23:29:46 server id 12022 end_log_pos 248 Start: binlog v 4, server v 10.0.15-MariaDB-1~wheezy-log created 141218 23:29:46
BINLOG '
amOTVA/2LgAA9AAAAPgAAAAAAAQAM.....
AAAAAAAAAAAEEwQA8pOnBg==
'/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
I concluded its 100 bytes from the end of the file, going to reposition to the the next file.
MariaDB [(none)]> change master to relay_log_file='mysqld-relay-bin.001145',RELAY_LOG_POS=0;
start slave
I soon (<2 mins) there after found a duplicate key error on an insert which I suspected was related and quickly clear it. set sql_log_bin=0; REPLACE INTO seller_counters (seller_id, category_id, counter) VALUES (36239, 321, 1); set global sql_slave_skip_counter=1; start slave
The show slave status soon (<2 mins) after found another error:
Relay_Log_File: mysqld-relay-bin.001147
Relay_Log_Pos: 7901913
Relay_Master_Log_File: mysql-bin.006189
Slave_IO_Running: Yes
Slave_SQL_Running: No
(sorry I didn't record the full output at the time)
So from position:
Relay_Log_File: mysqld-relay-bin.001144
Relay_Log_Pos: 104857605
Relay_Master_Log_File: mysql-bin.006105
Exec_Master_Log_Pos: 9258445
until a very short time later after a change master to changed the relay_log_file/pos to the next file and with a 100 bytes changed the new position is:
Relay_Log_File: mysqld-relay-bin.001147
Relay_Log_Pos: 7901913
Relay_Master_Log_File: mysql-bin.006189
The relay log files are 100M though though probably a logrotate not using shared scripts the following pattern occurs about 3 hourly
-rw-rw---- 1 mysql mysql 104857912 Dec 21 00:33 mysqld-relay-bin.001187 -rw-rw---- 1 mysql mysql 569 Dec 21 00:33 mysqld-relay-bin.001188 -rw-rw---- 1 mysql mysql 341 Dec 21 00:33 mysqld-relay-bin.001189
The master log files are more strict on 100M an only differ on on what appears to be a daily logrotate.
The relay log file has gone from 001144 to 001147 in the same time Relay_Master_Log_File has gone from 006105 (19th December) to mysql-bin.006189 (30 December, which is the current binlog on the master).
binlog_format | STATEMENT
A few hours later I decided to capture this right without gaps.
Take a stopped slave position and ignoring this error (i'll put a new bug for it).
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.179.103
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.006189
Read_Master_Log_Pos: 96248700
Relay_Log_File: mysqld-relay-bin.001150
Relay_Log_Pos: 94786159
Relay_Master_Log_File: mysql-bin.006189
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbin
log' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you
want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 94785872
Relay_Log_Space: 8316760144
- 1. row ***************************
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
change master to Relay_Log_File='mysqld-relay-bin.001150',relay_log_pos=94785831; show slave status;
Slave_IO_State:
Master_Host: 192.168.179.103
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.006189
Read_Master_Log_Pos: 96248700
Relay_Log_File: mysqld-relay-bin.001150
Relay_Log_Pos: 94785831
Relay_Master_Log_File: mysql-bin.006189
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 96248700
Relay_Log_Space: 8316760144
Until_Condition: None
so changing Relay_Log_Pos: 94785831
to Relay_Log_Pos: 94786159
changed
Exec_Master_Log_Pos: 94785872
Exec_Master_Log_Pos: 96248700
so changing the relay_log_pos back 300 bytes made the exec_master_log_pos jump forward 1.5M.
Looks like this is grabbing a current master position when then relay_log_pos/file is changed.
Gliffy Diagrams
Attachments
Issue Links
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
This sounds a lot like
MDEV-7237, which is fixed in 10.0.16.But if 10.0.16 (when it is released) does not fix it, then please feel free to re-open this bug with updated info.