  1. MariaDB Server
  2. MDEV-6680

Performance of domain_parallel replication is disappointing

    Details

      Description

      In March, Axel ran some benchmarks of parallel replication, including a
      comparison with MySQL 5.6 parallel replication.

      MySQL 5.6 parallel replication corresponds more or less to setting
      gtid_domain_id to different values in MariaDB. However, Axel's benchmarks
      showed disappointing performance compared to MySQL for this case, where one
      would expect similar performance from both.

      This needs to be investigated. It seems likely that there is a bottleneck or
      locking mistake somewhere in the code, as this code path has not yet been
      tested much.

      One possible explanation is related to the --slave-parallel-max-queued
      parameter. When the SQL driver thread has queued this much event data for a
      worker thread, it waits for more room in that thread's queue. However, due
      to batching of updates, that worker thread might not signal that the queue has
      more room until it has completely emptied the queue. Meanwhile, other worker
      threads that happen to finish their queues faster sit idle, because the
      driver thread is blocked and cannot queue more events for them.

      [Since Axel's benchmark works on an already generated master binlog, this
      condition is likely to be hit]

      This needs to be fixed somehow, for example simply by signalling more
      frequently when events have been removed from the queue, say whenever 1/4 of
      the queue has been emptied (signalling for every event drained is likely to
      be too expensive in terms of locking overhead).
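
The batched-signalling idea can be sketched roughly as below. This is a minimal Python model, not MariaDB code: the class `WorkerQueue`, its fields, and the `wakeup_fraction` parameter are all hypothetical names chosen for illustration, and the 1/4 threshold is just the value floated above. The worker wakes the driver once a quarter of the byte limit has been freed since the last wakeup (or when the queue runs empty), instead of only when the queue is fully drained.

```python
import threading
from collections import deque


class WorkerQueue:
    """Bounded per-worker event queue (hypothetical model, not MariaDB code)."""

    def __init__(self, max_queued, wakeup_fraction=0.25):
        self.max_queued = max_queued        # byte limit, in the spirit of --slave-parallel-max-queued
        self.wakeup_bytes = max_queued * wakeup_fraction
        self.events = deque()               # sizes (in bytes) of queued events
        self.queued = 0                     # bytes currently queued
        self.freed_since_signal = 0         # bytes dequeued since the last wakeup
        self.signals = 0                    # wakeups sent so far (for inspection only)
        self.cond = threading.Condition()

    def enqueue(self, event_size):
        """Driver side: block while the queue is over its limit."""
        with self.cond:
            # An oversized event is still accepted when the queue is empty,
            # so the driver can never block forever on a single event.
            while self.events and self.queued + event_size > self.max_queued:
                self.cond.wait()
            self.events.append(event_size)
            self.queued += event_size

    def dequeue(self):
        """Worker side: take one event, waking the driver only occasionally."""
        with self.cond:
            size = self.events.popleft()
            self.queued -= size
            self.freed_since_signal += size
            # Signal after a quarter of the limit has been freed, or when the
            # queue is empty, rather than taking the signalling cost per event.
            if self.freed_since_signal >= self.wakeup_bytes or not self.events:
                self.freed_since_signal = 0
                self.signals += 1
                self.cond.notify()
            return size
```

With a limit of 100 bytes and ten 10-byte events, draining the queue sends four wakeups in this model, compared to one if the worker only signals once the queue is empty (`wakeup_fraction=1.0`) and ten if it signals per event. The driver gets a chance to refill the queue, and to move on to other workers, after roughly a quarter of the queue has been freed.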

      There might be other issues as well; this needs to be investigated.

      Here is a pointer into the mail thread on maria-developers@ where this was
      discussed:

      https://lists.launchpad.net/maria-developers/msg07089.html

      (I thought I had filed this bug already, but did not find it in search, sorry
      if it is a duplicate).

            Activity

            danblack Daniel Black added a comment (edited)

            According to the documentation, more than slave_parallel_max_queued can be used for various reasons. Can we have a global status variable to indicate how much is actually in the queue?

            I had great success increasing this to 512M (mixed replication with some fairly heavy multi-row updates).

            knielsen Kristian Nielsen added a comment - Pushed to 10.0.15: http://lists.askmonty.org/pipermail/commits/2014-November/006975.html

              People

              • Assignee: Kristian Nielsen (knielsen)
              • Reporter: Kristian Nielsen (knielsen)
              • Votes: 2
              • Watchers: 4