Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5089

possible deadlocks between rwlocks and mutexes

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 5.5.40
    • Component/s: OTHER
    • Labels:
      None

      Description

      safe_mutex has built-in deadlock detection, but it only works for mutexes.
      Still, compiling with -DUSE_MUTEX_INSTEAD_OF_RW_LOCKS we can magically turn all rwlocks into mutexes.

      When done, safe_mutex reports new locking order violations, the server doesn't even start.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              svoj Sergey Vojtovich added a comment - - edited

              SHOW VARIABLES may acquire LOCK_system_variables_hash read lock twice: in fill_variables() and then in intern_sys_var_ptr(). According to pthread_rwlock_rdlock() manual it seems to be acceptable: "A thread may hold multiple concurrent read locks on rwlock (that is, successfully call the pthread_rwlock_rdlock() function n times)."

              The same functions also have mixed lock order:
              fill_variables(): mysql_rwlock_rdlock(&LOCK_system_variables_hash);
              show_status_array(): mysql_mutex_lock(&LOCK_global_system_variables);
              intern_sys_var_ptr(): mysql_rwlock_rdlock(&LOCK_system_variables_hash);

              The above should also be acceptable because we don't acquire any locks while we're under wrlock(&LOCK_system_variables_hash).

              Will keep this code intact.

              Show
              svoj Sergey Vojtovich added a comment - - edited SHOW VARIABLES may acquire LOCK_system_variables_hash read lock twice: in fill_variables() and then in intern_sys_var_ptr(). According to pthread_rwlock_rdlock() manual it seems to be acceptable: "A thread may hold multiple concurrent read locks on rwlock (that is, successfully call the pthread_rwlock_rdlock() function n times)." The same functions also have mixed lock order: fill_variables(): mysql_rwlock_rdlock(&LOCK_system_variables_hash); show_status_array(): mysql_mutex_lock(&LOCK_global_system_variables); intern_sys_var_ptr(): mysql_rwlock_rdlock(&LOCK_system_variables_hash); The above should also be acceptable because we don't acquire any locks while we're under wrlock(&LOCK_system_variables_hash). Will keep this code intact.
              Hide
              svoj Sergey Vojtovich added a comment -

              Pushed one minor revision to 5.5.37:
              revno: 4109
              revision-id: svoj@mariadb.org-20140213074049-wo2l3qdtgi0s2mjd
              parent: psergey@askmonty.org-20140311180702-1pntx903p1df1fyn
              committer: Sergey Vojtovich <svoj@mariadb.org>
              branch nick: 5.5
              timestamp: Thu 2014-02-13 11:40:49 +0400
              message:
              MDEV-5089 - possible deadlocks between rwlocks and mutexes

              Pre-MDL versions had direct relationship between LOCK_open and
              LOCK_global_system_variables, e.g.:
              intern_sys_var_ptr // locks LOCK_global_system_variable
              mysql_sys_var_char
              create_options_are_valid
              ha_innobase::create
              handler::ha_create
              ha_create_table
              rea_create_table
              mysql_create_table_no_lock // locks LOCK_open
              mysql_create_table

              With MDL this relationship was removed, but mutex order was still
              recorded. In fact there is indirect relationship between LOCK_open
              and LOCK_global_system_variables via rwlocks in reverse order.

              Removed LOCK_open and LOCK_global_system_variables order recording,
              instead assert that LOCK_open is never held in intern_sys_var_ptr().

              This solves only one of many problems detected with MDEV-5089.

              Show
              svoj Sergey Vojtovich added a comment - Pushed one minor revision to 5.5.37: revno: 4109 revision-id: svoj@mariadb.org-20140213074049-wo2l3qdtgi0s2mjd parent: psergey@askmonty.org-20140311180702-1pntx903p1df1fyn committer: Sergey Vojtovich <svoj@mariadb.org> branch nick: 5.5 timestamp: Thu 2014-02-13 11:40:49 +0400 message: MDEV-5089 - possible deadlocks between rwlocks and mutexes Pre-MDL versions had direct relationship between LOCK_open and LOCK_global_system_variables, e.g.: intern_sys_var_ptr // locks LOCK_global_system_variable mysql_sys_var_char create_options_are_valid ha_innobase::create handler::ha_create ha_create_table rea_create_table mysql_create_table_no_lock // locks LOCK_open mysql_create_table With MDL this relationship was removed, but mutex order was still recorded. In fact there is indirect relationship between LOCK_open and LOCK_global_system_variables via rwlocks in reverse order. Removed LOCK_open and LOCK_global_system_variables order recording, instead assert that LOCK_open is never held in intern_sys_var_ptr(). This solves only one of many problems detected with MDEV-5089 .
              Hide
              svoj Sergey Vojtovich added a comment -

              There is read-lock in reverse order in Aria:

              ha_maria::repair();
                lock(share->intern_lock);
                _ma_update_auto_increment_key()/maria_rlast()/maria_rprev();
                  rdlock(keyinfo->root_lock);
                  unlock(keyinfo->root_lock);
                unlock(share->intern_lock);
              

              Which may conflict with write-lock e.g. in maria_write():

              maria_write();
                wrlock(keyinfo->root_lock);
                _ma_ck_write_btree()/.../_ma_new();
                  lock(share->intern_lock);
                  unlock(share->intern_lock);
                unlock(keyinfo->root_lock);
              

              But since repair code is executed with protection of exclusive lock (TL_WRITE) deadlock doesn't seem to be possible.

              Show
              svoj Sergey Vojtovich added a comment - There is read-lock in reverse order in Aria: ha_maria::repair(); lock(share->intern_lock); _ma_update_auto_increment_key()/maria_rlast()/maria_rprev(); rdlock(keyinfo->root_lock); unlock(keyinfo->root_lock); unlock(share->intern_lock); Which may conflict with write-lock e.g. in maria_write(): maria_write(); wrlock(keyinfo->root_lock); _ma_ck_write_btree()/.../_ma_new(); lock(share->intern_lock); unlock(share->intern_lock); unlock(keyinfo->root_lock); But since repair code is executed with protection of exclusive lock (TL_WRITE) deadlock doesn't seem to be possible.
              Hide
              svoj Sergey Vojtovich added a comment -

              In 5.5 there is possible deadlock between 3 mutexes and 2 rwlocks:

              lock(LOCK_open)                    -> rdlock(LOCK_grant)                  SELECT * FROM INFORMATION_SCHEMA.COLUMNS
              wrlock(LOCK_grant)                 -> lock(acl_cache->lock)               GRANT/REVOKE CREATE/DROP USER
              lock(acl_cache->lock)              -> lock(LOCK_global_system_variables)  FLUSH PRIVILEGES
              lock(LOCK_global_system_variables) -> wrlock(LOCK_logger)                 SET @@global.log_output="TABLE"
              rdlock(LOCK_logger)                -> lock(LOCK_open)                     SELECT 1
              

              But threads are serialized by table-level locks. E.g. GRANT/etc acquires write-lock and FLUSH PRIVILEGES acquires read-lock. No actual deadlock possible.

              In 10.0 things are even better: there is no LOCK_open -> LOCK_grant order.

              Show
              svoj Sergey Vojtovich added a comment - In 5.5 there is possible deadlock between 3 mutexes and 2 rwlocks: lock(LOCK_open) -> rdlock(LOCK_grant) SELECT * FROM INFORMATION_SCHEMA.COLUMNS wrlock(LOCK_grant) -> lock(acl_cache->lock) GRANT/REVOKE CREATE/DROP USER lock(acl_cache->lock) -> lock(LOCK_global_system_variables) FLUSH PRIVILEGES lock(LOCK_global_system_variables) -> wrlock(LOCK_logger) SET @@global.log_output="TABLE" rdlock(LOCK_logger) -> lock(LOCK_open) SELECT 1 But threads are serialized by table-level locks. E.g. GRANT/etc acquires write-lock and FLUSH PRIVILEGES acquires read-lock. No actual deadlock possible. In 10.0 things are even better: there is no LOCK_open -> LOCK_grant order.
              Hide
              svoj Sergey Vojtovich added a comment -

              All issues found in 5.5 are reported/fixed. No extra problems were detected in 10.0 and 10.1.

              Show
              svoj Sergey Vojtovich added a comment - All issues found in 5.5 are reported/fixed. No extra problems were detected in 10.0 and 10.1.

                People

                • Assignee:
                  svoj Sergey Vojtovich
                  Reporter:
                  serg Sergei Golubchik
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  3 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: