Details
Description
There was a complaint about this message on the maria-developers list, but I don't keep track of what has been fixed and what not from that discussion, so I'll file it just in case. Please feel free to discard if it's not needed.
When a slave operating in a strict GTID mode encounters a wrong sequence of events, it produces the error message:
"The binlog on the master is missing the GTID 0-1-8 requested by the slave (even though both a prior and a subsequent sequence number does exist), and GTID strict mode is enabled".
Sometimes the message is wrong. In the example below, the GTID 0-1-8 exists in the binary log.
# Error message: Got fatal error 1236 from master when reading data from binary log: 'The binlog on the master is missing the GTID 0-1-8 requested by the slave (even though both a prior and a subsequent sequence number does exist), and GTID strict mode is enabled' connection server_1; # Actual binlog events on the master: show binlog events; Log_name Pos Event_type Server_id End_log_pos Info master-bin.000001 4 Format_desc 1 248 Server ver: 10.0.3-MariaDB-debug-log, Binlog ver: 4 master-bin.000001 248 Gtid_list 1 273 [] master-bin.000001 273 Binlog_checkpoint 1 313 master-bin.000001 master-bin.000001 313 Gtid 1 351 GTID 0-1-1 master-bin.000001 351 Query 1 451 use `test`; create table t1 (i int) engine=InnoDB master-bin.000001 451 Gtid 1 489 BEGIN GTID 0-1-2 master-bin.000001 489 Query 1 581 use `test`; insert into t1 values (1),(2) master-bin.000001 581 Xid 1 608 COMMIT /* xid=104 */ master-bin.000001 608 Gtid 1 646 BEGIN GTID 0-1-10 master-bin.000001 646 Query 1 738 use `test`; insert into t1 values (3),(4) master-bin.000001 738 Xid 1 765 COMMIT /* xid=106 */ master-bin.000001 765 Gtid 1 803 BEGIN GTID 0-1-8 master-bin.000001 803 Query 1 895 use `test`; insert into t1 values (5),(6) master-bin.000001 895 Xid 1 922 COMMIT /* xid=108 */
The error itself is expected, because the sequence numbers are not monotonically increasing, it's just the diagnostics that is questionable here.
gtid1.test
--source include/have_innodb.inc
--let $rpl_topology=1->2
--source include/rpl_init.inc
--connection server_2
--source include/stop_slave.inc
change master to master_use_gtid=current_pos;
--source include/start_slave.inc
--enable_connect_log
--connection server_1
create table t1 (i int) engine=InnoDB;
insert into t1 values (1),(2);
set gtid_seq_no=10;
insert into t1 values (3),(4);
set gtid_seq_no=8;
insert into t1 values (5),(6);
--save_master_pos
--connection server_2
--sync_with_master
--source include/stop_slave.inc
set global gtid_strict_mode=1;
#--source include/start_slave.inc
start slave;
--let $slave_io_errno= 1236
--source include/wait_for_slave_io_error.inc
--let $err_text = query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1)
--echo # Error message:
--echo $err_text
--connection server_1
--echo # Actual binlog events on the master:
show binlog events;
gtid1.cnf
!include suite/rpl/rpl_1slave_base.cnf !include include/default_client.cnf [mysqld.1] log-slave-updates [mysqld.2] log-slave-updates
bzr version-info
revision-id: sanja@montyprogram.com-20130820135351-6wjeq3u31waunp7c revno: 3680 branch-nick: 10.0-base
Gliffy Diagrams
Attachments
Issue Links
- relates to
-
MDEV-26 Global transaction ID
-
- Closed
-
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
Yeah, I wonder how to word that error message in a concise yet clear and
correct way.
What actually happens is that the master is first scanning the list of
binlogs, looking for the most recent that has a smaller seq_no in the last
GTID in that file with same domain and server_id.
Then it scans the binlog file, looking for the requested GTID, and aborts with
this error if it instead finds one with higher seq_no and same
domain/server_id, as this indicates a "hole" in the binlog (or severe
out-of-order as in your example).
Would this be a better error message?
"ERROR: While scanning binlogs for slave start position, found GTID 0-1-10
instead of GTID 0-1-8 that slave requested, and GTID strict mode is
enabled."