  1. MariaDB Server
  2. MDEV-4662

InnoDB: Use of large externally-stored fields makes crash recovery lose data

    Details

      Description

      When overly large BLOB fields are used, the administrator is notified only by a rather innocuous-looking message:

      InnoDB: ERROR: the age of the last checkpoint is XXX,
      InnoDB: which exceeds the log group capacity YYY.
      InnoDB: If you are using big BLOB or TEXT rows, you must set the
      InnoDB: combined size of log files at least 10 times bigger than the
      InnoDB: largest such row.

      I would have expected that this means that InnoDB is stalling in order to make more space in its redo logs. However, what it actually means is that InnoDB has overwritten its most recent checkpoint in its redo logs. This compromises crash recovery, potentially causing data loss (or even metadata loss, such as writes to data dictionary tables or system tablespace data). This is easily reproducible using the attached test case.
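
      To make the failure mode concrete, here is a toy model of a circular redo log. It is only a sketch: the class, its fields, and the numbers are hypothetical and do not mirror InnoDB's real structures. It shows how a single large append, with no opportunity to checkpoint midway, pushes the checkpoint age past the log capacity, so the records that recovery would need have been overwritten.

```python
from dataclasses import dataclass


@dataclass
class ToyRedoLog:
    """Toy circular redo log (hypothetical; not InnoDB's real data structure)."""

    capacity: int            # usable bytes in the log group
    lsn: int = 0             # total bytes ever appended
    checkpoint_lsn: int = 0  # crash recovery would start reading from here

    def checkpoint_age(self) -> int:
        """Bytes of redo written since the last checkpoint."""
        return self.lsn - self.checkpoint_lsn

    def append(self, nbytes: int) -> bool:
        """Append nbytes as one batch, with no chance to checkpoint midway.

        Returns True if the log is still recoverable afterwards, and False
        if the write wrapped around and overwrote redo that recovery needs.
        """
        self.lsn += nbytes
        return self.checkpoint_age() <= self.capacity

    def checkpoint(self) -> None:
        """Pretend all dirty pages were flushed and advance the checkpoint."""
        self.checkpoint_lsn = self.lsn


log = ToyRedoLog(capacity=100)
small_ok = log.append(60)   # age 60 fits within capacity 100
log.checkpoint()            # age drops back to 0
big_ok = log.append(150)    # age 150 exceeds 100: checkpoint overwritten
```

      In this model `small_ok` is true and `big_ok` is false. The second case corresponds to the situation the error message reports after the fact.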

      This appears to happen because externally-stored fields are always written in a single batch to the redo logs, all while holding the log mutex, thus making it impossible to checkpoint during that write. There are at least two possible solutions to this:

      1. Allow flushing to "catch up" and checkpoint during large external field writes. This will involve releasing the log mutex during the write, which is likely complex.

      2. Disallow (at least optionally) such large writes. Disallowing external field writes which sum to more than 10% of the total redo log space will in theory prevent this problem, because log_free_check() is called before the write of the external field, and (although it has some races) it should ensure that 10% of the log space is available before starting the write.
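
      Option 2 can be sketched as a simple size check. This is an illustration under stated assumptions, not InnoDB code: the function names are hypothetical, and the two size parameters are assumed to correspond to the innodb_log_file_size and innodb_log_files_in_group settings.

```python
# The 10% figure comes from the proposal above; everything else is illustrative.
REDO_LIMIT_DIVISOR = 10


def max_external_write_bytes(log_file_size: int, n_log_files: int) -> int:
    """Largest combined external-field write permitted under the 10% rule."""
    total_redo_bytes = log_file_size * n_log_files
    return total_redo_bytes // REDO_LIMIT_DIVISOR


def external_write_allowed(
    total_field_bytes: int, log_file_size: int, n_log_files: int
) -> bool:
    """Return True if the write fits within 10% of the total redo log space."""
    return total_field_bytes <= max_external_write_bytes(log_file_size, n_log_files)


MIB = 1024 * 1024

# Assumed example configuration: two redo log files of 48 MiB each.
limit = max_external_write_bytes(48 * MIB, 2)
fits = external_write_allowed(8 * MIB, 48 * MIB, 2)
too_big = external_write_allowed(16 * MIB, 48 * MIB, 2)
```

      With this assumed configuration the limit is just under 9.6 MiB, so an 8 MiB write passes the check and a 16 MiB write is rejected. Such a check only helps if it runs before the write begins, which is where log_free_check() is called.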

      This issue exists in all versions of MySQL and MariaDB.

          Activity

          elenst Elena Stepanova added a comment -

          I think it was even documented somewhere, wasn't it?
          Or maybe it was an old bug report...

          jeremycole Jeremy Cole added a comment -

          Elena: There have been a few bug reports about this, but none of them have touched on the core issue that this compromises the actual ACID properties of InnoDB itself. Personally, I would rather have InnoDB assert with "durability may be compromised by continuing" than keep its current behavior, which could best be described as "keep going and hope for the best".

          Most of the bug reports have been either "Can't repeat" or have had people increase their logs and "it doesn't happen anymore". This bug report was a first effort to discuss the actual problem and propose some fixes.

          jeremycole Jeremy Cole added a comment -

          Also reported as: http://bugs.mysql.com/bug.php?id=69477

          elenst Elena Stepanova added a comment -

          Right, thanks. I just had a vague recollection and was wondering if it's the same issue.
          Now that you mention increasing the logs, I recall it: the problem happened in tests a lot, and the usual recipe was "the log size is not enough for your flow, set it to a higher value" (and I think the default value was eventually increased).
          I certainly agree it would be good to fix it properly.

          jeremycole Jeremy Cole added a comment -

          Note that this was marked as fixed the other day in upstream MySQL Bug 69477. Perhaps the fix should be merged in once they release it.

          elenst Elena Stepanova added a comment -

          Upstream bugfix in 5.6.20:

          revno: 5958
          revision-id: annamalai.gurusami@oracle.com-20140522155303-y5bvfo4sq0tdls98
          parent: marko.makela@oracle.com-20140522115539-2yijjno0m7n65i7o
          committer: Annamalai Gurusami <annamalai.gurusami@oracle.com>
          branch nick: mysql-5.6
          timestamp: Thu 2014-05-22 21:23:03 +0530
          message:
            Bug #16963396 INNODB: USE OF LARGE EXTERNALLY-STORED FIELDS MAKES CRASH
            RECOVERY LOSE DATA
            
            Problem:
            
            When too-large blob fields are used, InnoDB overwrites its most recent
            checkpoint in its redo logs.
            
            Solution:
            
            Ensure that the total blob length does not exceed 10% of the redo log file
            size.
            
            rb#5399 approved by Marko, Nuno, Manish. 
            Venkat also contributed to patch (in replication related test case).
          

          Not reproducible on the current 10.0 (10.0.14+) tree, which is expected since InnoDB 5.6.20 was merged into 10.0.14.


            People

            • Assignee:
              Unassigned
              Reporter:
              jeremycole Jeremy Cole