Details

      Description

      Parallel replication in 10.0 relies either on sufficient parallelism being
      available during master commit, or alternatively on user-annotated parallelism
      with GTID domain ids. These may not be sufficient to achieve good parallelism
      on a slave.

      The original approach in parallel replication was to run in parallel only
      those transactions that were known to be able to safely replicate in
      parallel, because they group-committed together on the master. However, it
      turned out that there were some corner cases where this was still not safe.
      So a general solution was implemented that allows the server to handle and
      recover from an attempt at unsafe parallel replication, by detecting a
      deadlock in the commit order and retrying the problem transaction.

      With this general solution, it actually becomes safe to attempt to replicate
      any transactions in parallel, as long as those transactions can be rolled
      back and retried (e.g. InnoDB/XtraDB DML). This opens the way to
      speculatively replicating in parallel on the slave in an attempt to get more
      parallelism. We can simply queue transactions in parallel regardless of
      whether they have the same commit id from the master. If there are no
      conflicts, then great, parallelism is improved. If there is a conflict, the
      enforced commit order will cause it to be detected as a deadlock, and the
      later transaction will be rolled back and retried.
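      The rollback-and-retry loop described above can be sketched roughly as
      follows. This is a hypothetical Python illustration, not MariaDB source;
      the names apply_fn and wait_for_prior are invented stand-ins for the
      worker's apply step and for waiting on prior transactions' commits.

```python
# Hypothetical sketch of speculative apply with enforced commit order.
# A conflict between a later and an earlier transaction shows up as a
# deadlock (the later transaction holds a lock the earlier one needs,
# but must commit after it), so the later one is rolled back and retried.

class DeadlockError(Exception):
    """A lock wait that conflicts with the enforced commit order."""

def apply_with_retry(apply_fn, wait_for_prior, max_retries=2):
    """Run apply_fn(); on a commit-order deadlock, wait for prior
    transactions to commit, then retry (up to max_retries times)."""
    for attempt in range(max_retries + 1):
        try:
            return apply_fn()
        except DeadlockError:
            if attempt == max_retries:
                raise
            wait_for_prior()  # earlier transactions commit before we retry
```

      Once the prior transactions have committed, the retry runs effectively
      serially, so it cannot hit the same commit-order deadlock again.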

      To avoid excessive rollback and retry, and to avoid attempts to roll back
      non-transactional updates, we could have some simple heuristics about when to
      attempt the speculative parallel apply. For example:

      • Annotate transactions on the master (with a flag in the GTID event) that
        are pure InnoDB DML, and only attempt to run those in parallel
        speculatively on the slave. Or alternatively, detect this during
        open_tables(), and let events wait for prior transactions if they touch
        non-transactional tables.
      • Annotate on the master those transactions that ended up having row lock
        waits on other transactions, indicating a potential conflict. Such
        transactions are likely to conflict on the slave as well, so it may be
        better to let them wait for prior transactions rather than try
        speculative parallel apply.
      • If the number of rows affected becomes large, pause the large transaction
        being replicated and wait for prior transactions to complete first, to
        avoid having to do a large rollback (which is expensive in InnoDB).
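      Combined, these heuristics amount to a simple gating function. The sketch
      below is hypothetical; the parameter names and the row-count threshold are
      invented for illustration, not taken from the implementation.

```python
# Hypothetical gating heuristic: decide whether a replicated transaction
# is a reasonable candidate for speculative parallel apply.

LARGE_TXN_ROWS = 10_000  # invented threshold: above this, rollback is costly

def should_apply_speculatively(pure_transactional_dml,
                               had_lock_waits_on_master,
                               rows_affected):
    # Only transactions that can be rolled back are safe to speculate on.
    if not pure_transactional_dml:
        return False
    # Row lock waits on the master hint at a likely conflict on the slave.
    if had_lock_waits_on_master:
        return False
    # Large transactions risk an expensive InnoDB rollback; run in order.
    if rows_affected > LARGE_TXN_ROWS:
        return False
    return True
```

      Transactions rejected by the heuristic simply wait for prior transactions
      and apply in commit order, as in non-speculative parallel replication.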

            Activity

            knielsen Kristian Nielsen added a comment -

            There is a new issue that one needs to be aware of when doing this
            kind of speculative parallel replication. Consider the following two
            transactions:

            T1: DELETE FROM t1 WHERE a=1
            T2: INSERT INTO t1 SET a=1

            Suppose we try to run them in parallel. Then T2 might run before T1
            has had time to set a row lock, and T2 can fail with a duplicate key
            error. Unlike a deadlock error, a duplicate key error is not normally
            considered something that would trigger a retry of the failing
            transaction.

            So if we run T2 speculatively, then we probably need to treat any error as
            needing a retry of the transaction. Before the first retry, I think we should
            then execute wait_for_prior_commit(), so that T1 gets to complete its commit
            before we attempt the retry. This way, only in the first attempt do we need to
            retry on all errors; if a retry fails, only a deadlock (or similar temporary
            error) will cause another retry to be attempted.
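            This retry policy might be sketched as follows. A hypothetical
            Python illustration, not MariaDB source; wait_for_prior_commit is
            modeled as a callback, and the error classes are invented stand-ins
            for the server's error codes.

```python
# Hypothetical retry policy for speculative apply: the first attempt may
# fail with *any* error (e.g. a spurious duplicate key caused by running
# ahead of a prior transaction), so it is always retried after prior
# transactions commit; later attempts retry only on deadlock errors.

class DeadlockError(Exception):
    """A temporary error; always worth retrying."""

class DuplicateKeyError(Exception):
    """Normally permanent, but possibly spurious on a speculative run."""

def run_speculatively(txn_fn, wait_for_prior_commit, max_retries=3):
    attempt = 0
    while True:
        try:
            return txn_fn()
        except Exception as e:
            first_attempt = (attempt == 0)
            if not (first_attempt or isinstance(e, DeadlockError)):
                raise                   # real error on a non-speculative retry
            if attempt >= max_retries:
                raise
            wait_for_prior_commit()     # let T1 finish its commit first
            attempt += 1
```

            After wait_for_prior_commit(), the retry sees the prior
            transaction's changes, so an error that survives the retry is a
            genuine one and is reported as usual.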

            arjen Arjen Lentz added a comment -

            It should of course be implemented such that it works on any engine, not just InnoDB.
            This includes other transactional engines such as TokuDB.

            How is the XA situation? The MariaDB server should be able to run a
            transaction that uses both InnoDB and, for instance, TokuDB, and
            have an all-or-nothing two-phase commit across both/all
            transactional engines and the binlog. This should work because of
            earlier work by Serg on XA between InnoDB and the binlog, later
            extended to include PBXT.

            knielsen Kristian Nielsen added a comment -

            I've pushed the current code here and will continue to maintain it in that tree:

            lp:~maria-captains/maria/10.0-mdev6676

            knielsen Kristian Nielsen added a comment -

            @Arjen: The code is implemented so that it can work with other transactional storage engines. Some small additional support is required for an engine to work with speculative parallel replication, so that lock waits that would conflict with replication commit order are detected and flagged as deadlocks.

            The XA situation is as you describe. Multi-engine transactions use prepare-commit to make sure that everything or nothing is committed to engines and binlog.

            knielsen Kristian Nielsen added a comment -

            Pushed to 10.1.3


              People

              • Assignee:
                knielsen Kristian Nielsen
              • Reporter:
                knielsen Kristian Nielsen
              • Votes: 0
              • Watchers: 7
