  MariaDB Server / MDEV-4662

InnoDB: Use of large externally-stored fields makes crash recovery lose data

      Description

      When too-large blob fields are used, the administrator is notified only by a rather innocuous-looking message:

      InnoDB: ERROR: the age of the last checkpoint is XXX,
      InnoDB: which exceeds the log group capacity YYY.
      InnoDB: If you are using big BLOB or TEXT rows, you must set the
      InnoDB: combined size of log files at least 10 times bigger than the
      InnoDB: largest such row.

      I would have expected that this means that InnoDB is stalling in order to make more space in its redo logs. However, what it actually means is that InnoDB has overwritten its most recent checkpoint in its redo logs. This compromises crash recovery, potentially causing data loss (or even metadata loss, such as writes to data dictionary tables or system tablespace data). This is easily reproducible using the attached test case.
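
      To make the failure mode concrete, here is a minimal sketch of a circular redo log. It assumes the simplest possible model: the checkpoint age is the current log sequence number (LSN) minus the LSN of the last checkpoint, and the checkpoint is lost once that age exceeds the log capacity. The class RedoLogModel and its fields are hypothetical names chosen for illustration; they are not InnoDB's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class RedoLogModel:
    """Toy model of a circular redo log (illustrative only, not InnoDB code)."""

    capacity: int            # usable bytes in the log group (YYY in the message)
    lsn: int = 0             # total bytes ever appended to the log
    checkpoint_lsn: int = 0  # LSN of the most recent checkpoint

    def checkpoint_age(self) -> int:
        """Bytes written since the last checkpoint (XXX in the message)."""
        return self.lsn - self.checkpoint_lsn

    def write(self, nbytes: int) -> bool:
        """Append nbytes as one uninterruptible batch.

        Returns True if the last checkpoint is still intact afterwards,
        False if the write wrapped around and overwrote it.
        """
        self.lsn += nbytes
        return self.checkpoint_age() <= self.capacity

    def checkpoint(self) -> None:
        """Advance the checkpoint to the current end of the log."""
        self.checkpoint_lsn = self.lsn


log = RedoLogModel(capacity=100)
print(log.write(60))  # True: age is 60, within capacity
print(log.write(60))  # False: age is 120, the checkpoint has been overwritten
```

      In this model, calling checkpoint() between the two writes would keep the log safe. A single batch larger than the capacity can never be made safe that way, which is the situation the description attributes to large externally-stored fields.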

      This appears to happen because externally-stored fields are always written in a single batch to the redo logs, all while holding the log mutex, thus making it impossible to checkpoint during that write. There are at least two possible solutions to this:

      1. Allow flushing to "catch up" and checkpoint during large external field writes. This will involve releasing the log mutex during the write, which is likely complex.

      2. Disallow (at least optionally) such large writes. Disallowing external field writes which sum to more than 10% of the total redo log space will in theory prevent this problem, because log_free_check() is called before the write of the external field, and (although it has some races) it should ensure that 10% of the log space is available before starting the write.
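
      Option 2 can be sketched as a simple size guard. This only illustrates the 10% rule as described in this report and the "10 times bigger" sizing advice from the error message; it is not the InnoDB implementation. All function names and the byte values in the usage lines are assumptions made for the example.

```python
from typing import Iterable

MIB = 1024 * 1024


def max_external_bytes(total_redo_log_bytes: int) -> int:
    """Largest allowed sum of external field lengths: 10% of the redo log."""
    return total_redo_log_bytes // 10


def check_external_write(field_lengths: Iterable[int],
                         total_redo_log_bytes: int) -> int:
    """Return the total external length, or raise if it exceeds the limit."""
    total = sum(field_lengths)
    limit = max_external_bytes(total_redo_log_bytes)
    if total > limit:
        raise ValueError(
            f"external fields total {total} bytes, "
            f"limit is {limit} bytes (10% of redo log space)"
        )
    return total


def min_redo_log_bytes(largest_row_bytes: int) -> int:
    """Combined log size suggested by the error message: 10x the largest row."""
    return 10 * largest_row_bytes


# Two 4 MiB external fields fit when the combined redo log is 96 MiB ...
check_external_write([4 * MIB, 4 * MIB], 96 * MIB)

# ... but would be rejected with a 48 MiB log (limit is about 4.8 MiB).
try:
    check_external_write([4 * MIB, 4 * MIB], 48 * MIB)
except ValueError as exc:
    print("rejected:", exc)
```

      Rejecting the write up front trades availability for durability: the statement fails, but the most recent checkpoint is never put at risk.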

      This issue exists in all versions of MySQL and MariaDB.

            Activity

            elenst Elena Stepanova added a comment -

            I think it was even documented somewhere, wasn't it?
            Or maybe it was an old bug report...

            jeremycole Jeremy Cole added a comment -

            Elena: There have been a few bug reports about this, but none of them have touched on the core issue that this compromises the actual ACID properties of InnoDB itself. Personally I would rather have InnoDB assert itself with "durability may be compromised by continuing" rather than its current situation which could best be described as "keep going and hope for the best".

            Most of the bug reports have been either "Can't repeat" or have had people increase their logs and "it doesn't happen anymore". This bug report was a first effort to discuss the actual problem and propose some fixes.

            jeremycole Jeremy Cole added a comment -

            Also reported as: http://bugs.mysql.com/bug.php?id=69477

            elenst Elena Stepanova added a comment -

            Right, thanks. I just had a vague recollection and was wondering whether it was the same issue.
            Now that you mention increasing the logs, I recall it: the problem happened in tests a lot, and the usual recipe was "the log size is not enough for your flow, set it to a higher value" (and I think the default value was eventually increased).
            I certainly agree it would be good to fix it properly.

            jeremycole Jeremy Cole added a comment -

            Note that this was marked as fixed the other day in upstream MySQL Bug 69477. Perhaps the fix should be merged in once they release it.

            elenst Elena Stepanova added a comment -

            Upstream bugfix in 5.6.20:

            revno: 5958
            revision-id: annamalai.gurusami@oracle.com-20140522155303-y5bvfo4sq0tdls98
            parent: marko.makela@oracle.com-20140522115539-2yijjno0m7n65i7o
            committer: Annamalai Gurusami <annamalai.gurusami@oracle.com>
            branch nick: mysql-5.6
            timestamp: Thu 2014-05-22 21:23:03 +0530
            message:
              Bug #16963396 INNODB: USE OF LARGE EXTERNALLY-STORED FIELDS MAKES CRASH
              RECOVERY LOSE DATA
              
              Problem:
              
              When too-large blob fields are used, InnoDB overwrites its most recent
              checkpoint in its redo logs.
              
              Solution:
              
              Ensure that the total blob length does not exceed 10% of the redo log file
              size.
              
              rb#5399 approved by Marko, Nuno, Manish. 
              Venkat also contributed to patch (in replication related test case).
            

            Not reproducible on the current 10.0 (10.0.14+) tree, which is expected since InnoDB 5.6.20 was merged into 10.0.14.


              People

              • Assignee: Unassigned
              • Reporter: jeremycole Jeremy Cole