Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.13
    • Fix Version/s: 5.5.40, 10.0.14
    • Component/s: None
    • Labels:
      None
    • Environment:
      power8, RH6.5

      Description

      NB: A fix for this bug is also present in Stewart Smith's patchset: memory_barrier-experimental_5.6.4.diff.

      From errorlog:

      2014-07-31 21:02:00 ff6fb757190  InnoDB: Assertion failure in thread 17553455149456 in file sync0rw.cc line 690
      InnoDB: Failing assertion: !lock->recursive
      InnoDB: We intentionally generate a memory trap.
      ...
      stack_bottom = 0xff6fb756610 thread_stack 0x48000
      :0(000000ca.plt_call.MD5_Init)[0x109b476c]
      :0(000000ca.plt_call.MD5_Init)[0x103d7180]
      linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0xfff8fa30448]
      /opt/at7.0/lib64/power7/libc.so.6(gsignal-0x16f708)[0xfff8f1cf8f0]
      /opt/at7.0/lib64/power7/libc.so.6(abort-0x16dab4)[0xfff8f1d19c4]
      sync/sync0rw.cc:690(000000ca.plt_call.MD5_Init)[0x10124318]
      sync/sync0rw.cc:834(000000ca.plt_call.MD5_Init)[0x107d2b08]
      include/sync0rw.ic:917(pfs_rw_lock_x_lock_func)[0x108329b4]
      include/btr0sea.ic:81(000000ca.plt_call.MD5_Init)[0x1081c85c]
      include/btr0pcur.ic:485(btr_pcur_open_with_no_init_func)[0x107b3c74]
      handler/ha_innodb.cc:8374(000000ca.plt_call.MD5_Init)[0x106dfc4c]
      sql/handler.h:2888(000000ca.plt_call.MD5_Init)[0x103e8f64]
      sql/handler.cc:5520(000000ca.plt_call.MD5_Init)[0x103d7e6c]
      sql/handler.cc:2609(000000ca.plt_call.MD5_Init)[0x103de780]
      sql/sql_select.cc:18167(000000ca.plt_call.MD5_Init)[0x1023b53c]
      sql/table.h:1366(disable_keyread)[0x10115844]
      sql/sql_select.cc:3785(000000ca.plt_call.MD5_Init)[0x1011980c]
      sql/sql_select.cc:1338(optimize_inner)[0x1026b5fc]
      sql/sql_select.cc:3289(mysql_select)[0x10270180]
      ...
      Query (0xff68001a850): SELECT c FROM sbtest18 WHERE id=4968
      Connection ID (thread ID): 1287
      

      This is MariaDB-10.0, bzr revision 4308, compiled with ATC 7.0. Unlike previous (working) binaries, this one is using libaio.


              Activity

              svoj Sergey Vojtovich added a comment -

              Below are my comments on the InnoDB memory barriers framework. I will post an additional comment on the correctness of the barriers when I complete the review.

              • No action: non-atomic loads/stores of shared variables are evil. But nobody seems to care about it, since all loads are 32-bit, which are known to be atomic.
              • No action: my_atomic.h doesn't need cmake probes; all checks are done via ifdefs. To my taste that is more compact. But since InnoDB accepted the memory barriers patch with cmake probes, we probably shouldn't bother about it either.
              • No action: we wondered about the reasons for reducing the number of spins. There is a comment added along with rev.6004 to MySQL-5.6.20: the internal counter for innodb_sync_spin_loops is adjusted because a memory barrier is more expensive than an empty loop.
              • Action: we are missing the definition of HAVE_WINDOWS_MM_FENCE in CMakeLists.txt. See how it is handled in rev.6004 of MySQL-5.6.20.
              • # define os_rmb do { } while(0) and # define os_wmb do { } while(0)
                do {} while(0) is excessive; just #define os_rmb should be fine.
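
The last point can be illustrated with a minimal sketch. The names demo_rmb, demo_wmb, DEMO_NEED_BARRIERS and demo_publish_and_read are hypothetical and are not the InnoDB macros; the fences come from standard C++11 rather than from the patch under review. When no barrier is needed, the macro expands to nothing and the trailing `;` at the call site is simply an empty statement:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for os_rmb / os_wmb; not the InnoDB definitions.
#if defined(DEMO_NEED_BARRIERS)
# define demo_rmb std::atomic_thread_fence(std::memory_order_acquire)
# define demo_wmb std::atomic_thread_fence(std::memory_order_release)
#else
// Empty definitions: "demo_rmb;" at the call site becomes a lone ";".
# define demo_rmb
# define demo_wmb
#endif

// Both macro variants are used as plain statements, so the same call
// sites compile whether or not the barriers expand to real fences.
int demo_publish_and_read()
{
    int payload = 42;
    demo_wmb;  // write barrier, or nothing
    demo_rmb;  // read barrier, or nothing
    return payload;
}
```

One difference worth keeping in mind: with the empty form, a call site such as `if (cond) demo_rmb;` has an empty body, which some compilers may warn about.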
              svoj Sergey Vojtovich added a comment -

              On memory barriers in mutexes:

              - mutex_get_waiters() is missing an acquire memory barrier. This may cause
                mutex_exit_func() to read a stale 'waiters' value and can be the reason
                for a deadlock.

                There seems to be a workaround for that: srv_error_monitor_thread()
                is supposed to wake these stale threads every second. But if that's
                the case, we don't really need the release memory barrier in
                mutex_set_waiters().

              - ib_mutex_test_and_set(): the release memory barrier should not be needed;
                we hold the mutex anyway and don't care at which point lock_word
                becomes visible to other threads.

              - mutex_get_lock_word(): the acquire memory barrier should not be needed.
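
The stale 'waiters' concern can be sketched as a handshake between a waiting thread and the thread releasing the mutex. This is only an illustration under stated assumptions: DemoMutex and the demo_* functions are hypothetical, they are not the InnoDB implementation, and sequentially consistent C++11 atomics are used for simplicity, which is a stronger ordering than the acquire barrier discussed above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical toy mutex; field names mirror the discussion only.
struct DemoMutex {
    std::atomic<int> lock_word{0};  // 1 while the mutex is held
    std::atomic<int> waiters{0};    // 1 while some thread intends to sleep
};

// Returns true if the caller acquired the mutex.
bool demo_try_lock(DemoMutex& m)
{
    return m.lock_word.exchange(1, std::memory_order_acquire) == 0;
}

// Waiter side: announce the intent to sleep, then re-check the lock.
// Returns true if the mutex is still held and the caller should sleep.
bool demo_must_sleep(DemoMutex& m)
{
    m.waiters.store(1, std::memory_order_seq_cst);
    return m.lock_word.load(std::memory_order_seq_cst) != 0;
}

// Releaser side: drop the lock, then look at 'waiters'.
// Returns true if a wake-up signal has to be sent.
// If this load could observe a stale 0, the waiter would never be woken.
bool demo_unlock(DemoMutex& m)
{
    m.lock_word.store(0, std::memory_order_seq_cst);
    if (m.waiters.load(std::memory_order_seq_cst) != 0) {
        m.waiters.store(0, std::memory_order_relaxed);
        return true;
    }
    return false;
}
```

With this ordering, at least one side notices the other: either the waiter sees the lock released and does not sleep, or the releaser sees waiters set and sends the signal.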
              
              svoj Sergey Vojtovich added a comment -

              Neither of the acquire memory barriers in sync_arr_cell_can_wake_up() should be needed.

              svoj Sergey Vojtovich added a comment -
              revno: 3413.65.7
              revision-id: monty@mariadb.org-20140819162835-sorv0ogd39f7mui8
              parent: knielsen@knielsen-hq.org-20140813134639-wk760plnzg5wu4x8
              committer: Michael Widenius <monty@mariadb.org>
              branch nick: maria-5.5
              timestamp: Tue 2014-08-19 19:28:35 +0300
              message:
              MDEV-6450 - MariaDB crash on Power8 when built with advance tool chain
              
              Part of this work is based on Stewart Smitch's memory barrier and lower priori
              patches for power8.
              
              - Added memory syncronization for innodb & xtradb for power8.
              - Added HAVE_WINDOWS_MM_FENCE to CMakeList.txt
              - Added os_isync to fix a syncronization problem on power
              - Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
                if log mutex is locked.
              
              All changes done both for InnoDB and Xtradb
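
The idea of a "nowait" LSN getter for a monitor thread can be sketched as follows. This is an assumption-laden illustration, not the committed code: DemoLog, demo_mutex_enter_nowait, demo_mutex_exit and demo_get_lsn_nowait are hypothetical names, a toy atomic flag stands in for the log mutex, and returning 0 when the mutex is busy is an assumed convention.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical toy log state; not the InnoDB log_sys structure.
struct DemoLog {
    std::atomic<bool> mutex_held{false};  // toy stand-in for the log mutex
    std::uint64_t lsn = 0;                // read only while mutex_held is owned
};

// Try to take the toy mutex without blocking; true on success.
bool demo_mutex_enter_nowait(DemoLog& log)
{
    bool expected = false;
    return log.mutex_held.compare_exchange_strong(
        expected, true, std::memory_order_acquire);
}

void demo_mutex_exit(DemoLog& log)
{
    log.mutex_held.store(false, std::memory_order_release);
}

// Returns the current LSN, or 0 (assumed convention) if the mutex is busy,
// so that a monitoring thread never blocks behind the log mutex.
std::uint64_t demo_get_lsn_nowait(DemoLog& log)
{
    if (!demo_mutex_enter_nowait(log)) {
        return 0;
    }
    std::uint64_t lsn = log.lsn;
    demo_mutex_exit(log);
    return lsn;
}
```

A caller in this sketch would treat 0 as "no reading available this round" and skip any check that depends on the LSN.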
              

                People

                • Assignee:
                  monty Michael Widenius
                  Reporter:
                  axel Axel Schwenke
                • Votes:
                  0
                  Watchers:
                  3

                  Dates

                  • Created:
                    Updated:
                    Resolved: