  1. MDEV-5657

General cleanup of parallel replication event scheduling (was: Overlap (group) commit with next event group in parallel replication)

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 10.0.9
    • Component/s: None
    • Labels:
      None

      Description

      Actually, this task is more of a general cleanup of the part of parallel
      replication that handles scheduling of events to worker threads. The old code
      has a large number of issues; the scheduling code is unnecessarily
      complicated, and there are many corner cases, for example related to error
      handling or slave stop, that are not handled correctly and which can lead to
      various forms of corruption.

      The description below covers the user-visible part of the changes, but most
      of the changes are needed to fix bugs that were already present in the old
      code.


      In parallel replication, we record which event groups group-committed together
      on the master, and are thus able to apply them in parallel on the slave.

      E.g., if on the master A1, A2, and A3 group-commit together, followed by
      B1 and B2, we will run A1, A2, and A3 in parallel, each in its own
      worker.

      But B1 will be queued for the same worker as A3, and B2, while queued for a
      new worker, will wait for A3 to complete before it starts.

      But this is too pessimistic. B2 can start as soon as A1, A2, and A3
      become ready to commit. Similarly, B1 could be spawned in a new worker and
      also be allowed to start as soon as all event groups in the previous group
      commit reach the commit stage.
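
      The difference between the two start rules can be illustrated with a toy
      timing model. This is only a sketch and not the server's scheduling code:
      the names Group and schedule, the per-group exec_time and commit_time
      values, and the assumption that commits may overlap but complete in order
      are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    exec_time: float    # hypothetical time to apply the group up to the commit stage
    commit_time: float  # hypothetical time spent in commit (e.g. fsync)


def schedule(batches, overlap):
    """Return {name: (start, ready, done)} under a simplified model.

    batches: list of lists of Group; each inner list group-committed
             together on the master and runs in parallel on the slave.
    overlap: False -> the next batch starts once the last group of the
                      previous batch has completed its commit.
             True  -> the next batch starts once every group of the
                      previous batch is ready to commit.
    """
    times = {}
    batch_start = 0.0
    prev_done = 0.0  # assumption: commits complete in order
    for batch in batches:
        ready_all = batch_start
        for g in batch:
            ready = batch_start + g.exec_time
            # A group cannot finish committing before its predecessor.
            done = max(ready + g.commit_time, prev_done)
            times[g.name] = (batch_start, ready, done)
            ready_all = max(ready_all, ready)
            prev_done = done
        batch_start = ready_all if overlap else prev_done
    return times


batches = [
    [Group("A1", 2.0, 5.0), Group("A2", 3.0, 5.0), Group("A3", 4.0, 5.0)],
    [Group("B1", 2.0, 5.0), Group("B2", 3.0, 5.0)],
]

old = schedule(batches, overlap=False)
new = schedule(batches, overlap=True)
print("B1 start, waiting for A3 to complete:", old["B1"][0])
print("B1 start, waiting for commit stage:  ", new["B1"][0])
print("B2 done:", old["B2"][2], "vs", new["B2"][2])
```

      With these made-up numbers, the B groups start at time 4 instead of
      time 9, because they no longer wait for the slow commit of A3. The gain
      grows with the ratio of commit time to execution time.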

      This could be a big win if the slave is running with --log-slave-updates,
      --sync-binlog=1, and/or innodb_flush_log_at_trx_commit=1: the slow fsync at
      commit will not delay the execution of further events, and commit steps run
      in parallel, which gives more opportunity for group commit on the slave.

            People

            • Assignee:
              knielsen Kristian Nielsen
              Reporter:
              knielsen Kristian Nielsen
