Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4933

Misleading error message upon a wrong sequence of events in the master log

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.0.10
    • Fix Version/s: N/A
    • Component/s: None
    • Labels:

      Description

      There was a complaint about this message on the maria-developers list, but I don't keep track of what has been fixed and what not from that discussion, so I'll file it just in case. Please feel free to discard if it's not needed.

      When a slave operating in a strict GTID mode encounters a wrong sequence of events, it produces the error message:
      "The binlog on the master is missing the GTID 0-1-8 requested by the slave (even though both a prior and a subsequent sequence number does exist), and GTID strict mode is enabled".
      Sometimes the message is wrong. In the example below, the GTID 0-1-8 exists in the binary log.

      # Error message:
      Got fatal error 1236 from master when reading data from binary log: 'The binlog on the master is missing the GTID 0-1-8 requested by the slave (even though both a prior and a subsequent sequence number does exist), and GTID strict mode is enabled'
      connection server_1;
      # Actual binlog events on the master:
      show binlog events;
      Log_name	Pos	Event_type	Server_id	End_log_pos	Info
      master-bin.000001	4	Format_desc	1	248	Server ver: 10.0.3-MariaDB-debug-log, Binlog ver: 4
      master-bin.000001	248	Gtid_list	1	273	[]
      master-bin.000001	273	Binlog_checkpoint	1	313	master-bin.000001
      master-bin.000001	313	Gtid	1	351	GTID 0-1-1
      master-bin.000001	351	Query	1	451	use `test`; create table t1 (i int) engine=InnoDB
      master-bin.000001	451	Gtid	1	489	BEGIN GTID 0-1-2
      master-bin.000001	489	Query	1	581	use `test`; insert into t1 values (1),(2)
      master-bin.000001	581	Xid	1	608	COMMIT /* xid=104 */
      master-bin.000001	608	Gtid	1	646	BEGIN GTID 0-1-10
      master-bin.000001	646	Query	1	738	use `test`; insert into t1 values (3),(4)
      master-bin.000001	738	Xid	1	765	COMMIT /* xid=106 */
      master-bin.000001	765	Gtid	1	803	BEGIN GTID 0-1-8
      master-bin.000001	803	Query	1	895	use `test`; insert into t1 values (5),(6)
      master-bin.000001	895	Xid	1	922	COMMIT /* xid=108 */
      

      The error itself is expected, because the sequence numbers are not monotonically increasing, it's just the diagnostics that is questionable here.

      gtid1.test

      
      --source include/have_innodb.inc
      --let $rpl_topology=1->2
      --source include/rpl_init.inc
      
      --connection server_2
      --source include/stop_slave.inc
      change master to master_use_gtid=current_pos;
      --source include/start_slave.inc
      
      --enable_connect_log
      
      --connection server_1
      create table t1 (i int) engine=InnoDB;
      insert into t1 values (1),(2);
      set gtid_seq_no=10;
      insert into t1 values (3),(4);
      set gtid_seq_no=8;
      insert into t1 values (5),(6);
      --save_master_pos
      
      --connection server_2
      --sync_with_master
      --source include/stop_slave.inc
      set global gtid_strict_mode=1;
      #--source include/start_slave.inc
      start slave;
      --let $slave_io_errno= 1236
      --source include/wait_for_slave_io_error.inc
      
      --let $err_text = query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1)
      
      --echo # Error message:
      --echo $err_text
      
      --connection server_1
      --echo # Actual binlog events on the master:
      show binlog events;
      

      gtid1.cnf

      !include suite/rpl/rpl_1slave_base.cnf
      !include include/default_client.cnf
      
      [mysqld.1]
      log-slave-updates
      
      [mysqld.2]
      log-slave-updates
      

      bzr version-info

      revision-id: sanja@montyprogram.com-20130820135351-6wjeq3u31waunp7c
      revno: 3680
      branch-nick: 10.0-base
      

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              knielsen Kristian Nielsen added a comment -

              Yeah, I wonder how to word that error message in a concise yet clear and
              correct way.

              What actually happens is that the master is first scanning the list of
              binlogs, looking for the most recent that has a smaller seq_no in the last
              GTID in that file with same domain and server_id.

              Then it scans the binlog file, looking for the requested GTID, and aborts with
              this error if it instead finds one with higher seq_no and same
              domain/server_id, as this indicates a "hole" in the binlog (or severe
              out-of-order as in your example).

              Would this be a better error message?

              "ERROR: While scanning binlogs for slave start position, found GTID 0-1-10
              instead of GTID 0-1-8 that slave requested, and GTID strict mode is
              enabled."

              Show
              knielsen Kristian Nielsen added a comment - Yeah, I wonder how to word that error message in a concise yet clear and correct way. What actually happens is that the master is first scanning the list of binlogs, looking for the most recent that has a smaller seq_no in the last GTID in that file with same domain and server_id. Then it scans the binlog file, looking for the requested GTID, and aborts with this error if it instead finds one with higher seq_no and same domain/server_id, as this indicates a "hole" in the binlog (or severe out-of-order as in your example). Would this be a better error message? "ERROR: While scanning binlogs for slave start position, found GTID 0-1-10 instead of GTID 0-1-8 that slave requested, and GTID strict mode is enabled."

                People

                • Assignee:
                  knielsen Kristian Nielsen
                  Reporter:
                  elenst Elena Stepanova
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  1 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: