GTID binlog indexing

Description

The current GTID code needs to scan a binlog file from the beginning when a
slave connects, to find the place to start replicating. If slave reconnects
are frequent, this can be a performance regression, since in earlier versions
slaves could immediately seek to the supplied file offset.

To fix this problem, the binlog files should be indexed, making it possible
to quickly locate any GTID in a binlog file. As an added bonus, this would
make it possible to detect when old-style replication tries to connect at an
incorrect file offset (e.g. in the middle of an event), avoiding sending
potentially corrupted events.

The index could be an extra file master-bin.000001.idx written in parallel
with the binlog file. There is no need to flush or sync this file at every
binlog write, as it can be recovered easily after a crash, or the code can
fall back to scanning the corresponding binlog file.
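A minimal sketch of that fallback; lookup_in_index() and
scan_binlog_from_start() are hypothetical helper names for this sketch, not
actual server functions:

    #include <cstdint>
    #include <optional>
    #include <string>

    struct gtid_pos;  // the slave's requested GTID position (opaque here)

    // Hypothetical helpers. The index lookup returns nothing when the
    // index file is missing or truncated, which can happen after a
    // crash since the index is not synced at every binlog write.
    std::optional<uint64_t> lookup_in_index(const std::string &binlog,
                                            const gtid_pos &pos);
    uint64_t scan_binlog_from_start(const std::string &binlog,
                                    const gtid_pos &pos);

    // Fast path through the index, with the always-correct (but slow)
    // full binlog scan as the fallback.
    uint64_t find_start_offset(const std::string &binlog,
                               const gtid_pos &pos) {
      if (std::optional<uint64_t> off = lookup_in_index(binlog, pos))
        return *off;
      return scan_binlog_from_start(binlog, pos);
    }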

The index would be page-based, allowing a connecting slave to do a binary
search to find the desired location in the binlog at which to start
replication. The file would contain an ordered sequence of GTID binlog states
together with their corresponding start offsets into the associated binlog
file.
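As a rough illustration of what one index entry might contain (the type
names gtid_t and index_entry are invented for this sketch, not actual server
structures):

    #include <cstdint>
    #include <vector>

    struct gtid_t {
      uint32_t domain_id;
      uint32_t server_id;
      uint64_t seq_no;
    };

    // One index entry: the GTID binlog state in effect at a given byte
    // offset in the corresponding binlog file. Entries are written in
    // increasing offset order, so the index can be binary-searched.
    struct index_entry {
      uint64_t binlog_offset;          // start offset of a transaction
      std::vector<gtid_t> gtid_state;  // last seq_no per (domain_id, server_id)
    };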

The connecting slave would then binary-search for its start position in the
index, and would thus be able to jump directly to the right start position in
the binlog file without needing to scan that binlog from the start. This
would greatly improve performance when many slaves connect simultaneously.
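A hedged sketch of that lookup, reusing the index_entry type from the
previous sketch; not_past() is an invented name for the comparison between a
binlog state and the slave's requested position:

    #include <algorithm>
    #include <cstdint>
    #include <iterator>
    #include <vector>

    // True if 'state' has not advanced past the slave's requested
    // start position in any replication domain (simplified: domains
    // missing from slave_pos are ignored).
    static bool not_past(const std::vector<gtid_t> &state,
                         const std::vector<gtid_t> &slave_pos) {
      for (const gtid_t &g : state)
        for (const gtid_t &s : slave_pos)
          if (g.domain_id == s.domain_id && g.seq_no > s.seq_no)
            return false;
      return true;
    }

    // Binary-search the (sorted) index for the last entry at or before
    // the slave's start position; return offset 0, meaning "scan from
    // the start of the file", if no entry qualifies.
    uint64_t lookup_start_offset(const std::vector<index_entry> &index,
                                 const std::vector<gtid_t> &slave_pos) {
      auto it = std::partition_point(
          index.begin(), index.end(), [&](const index_entry &e) {
            return not_past(e.gtid_state, slave_pos);
          });
      return it == index.begin() ? 0 : std::prev(it)->binlog_offset;
    }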

There is no need to include every position in the index. We can write, say,
one in every 10 transactions into the index; a connecting slave will then
look up the closest matching position in the index and at most need to skip
over 10 transactions in the binlog. In general, we can keep track of how much
binlog and how much index have been written, and write only a fraction of
transactions into the index, ensuring that the ratio of index size to binlog
size does not exceed some appropriate limit (e.g. 2%).
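One way to enforce such a ratio when deciding whether to index the current
transaction, as a sketch only (the 2% figure is the example from above, and
index_write_policy is an invented name):

    #include <cstdint>

    // Track the bytes written to binlog and index; index the current
    // transaction only when doing so keeps the index at or below the
    // target fraction of the binlog size.
    struct index_write_policy {
      uint64_t binlog_bytes = 0;
      uint64_t index_bytes = 0;
      double max_ratio = 0.02;  // index size <= 2% of binlog size

      // Called once per transaction written to the binlog; returns
      // whether this transaction's state should also go into the index.
      bool on_transaction(uint64_t trx_size, uint64_t entry_size) {
        binlog_bytes += trx_size;
        if (index_bytes + entry_size <= max_ratio * binlog_bytes) {
          index_bytes += entry_size;
          return true;
        }
        return false;
      }
    };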

To further reduce the index size, entries could be "compressed" by omitting
those (domain_id, server_id) combinations that do not change. Typically,
there can be many distinct such values in a binlog file, but only a few of
them are likely to change within one given file.
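A sketch of that compression as a delta against the previous entry (names
invented for the sketch, reusing gtid_t from above); a reader would
reconstruct the full state by replaying deltas from the start of the index
file:

    #include <cstdint>
    #include <map>
    #include <utility>
    #include <vector>

    using source_key = std::pair<uint32_t, uint32_t>;  // (domain_id, server_id)

    // Return only the GTIDs that are new or changed relative to the
    // previous index entry; unchanged (domain_id, server_id)
    // combinations are omitted from the written entry.
    std::vector<gtid_t> delta_since(const std::map<source_key, uint64_t> &prev,
                                    const std::vector<gtid_t> &current) {
      std::vector<gtid_t> delta;
      for (const gtid_t &g : current) {
        auto it = prev.find({g.domain_id, g.server_id});
        if (it == prev.end() || it->second != g.seq_no)
          delta.push_back(g);
      }
      return delta;
    }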

Environment

None

Status

Assignee

Kristian Nielsen

Reporter

Kristian Nielsen

Labels

External issue ID

None

Priority

Critical