  1. MariaDB Server
  2. MDEV-6333

A deadlock occurred on Galera Cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 5.5.37-galera
    • Fix Version/s: 5.5.41-galera
    • Component/s: Galera
    • Labels:

      Description

      Our system uses Galera Cluster.

      MariaDB [test]> show variables like '%wsrep%';
      || *Variable_name* || *Value* ||
      || wsrep_OSU_method || TOI ||
      || wsrep_auto_increment_control || ON ||
      || wsrep_causal_reads || OFF ||
      || wsrep_certify_nonPK || ON ||
      || wsrep_cluster_address || gcomm://xx.xxx.xx.x1,xx.xxx.xx.x2,xx.xxx.xx.x3 ||
      || wsrep_cluster_name || GC ||
      || wsrep_convert_LOCK_to_trx || OFF ||
      || wsrep_data_home_dir || /var/lib/mysql/ ||
      || wsrep_dbug_option ||  ||
      || wsrep_debug || OFF ||
      || wsrep_desync || OFF ||
      || wsrep_drupal_282555_workaround || OFF ||
      || wsrep_forced_binlog_format || NONE ||
      || wsrep_load_data_splitting || ON ||
      || wsrep_log_conflicts || OFF ||
      || wsrep_max_ws_rows || 131072 ||
      || wsrep_max_ws_size || 1073741824 ||
      || wsrep_mysql_replication_bundle || 0 ||
      || wsrep_node_address || xx.xxx.xx.x2 ||
      || wsrep_node_incoming_address || AUTO ||
      || wsrep_node_name || GC-1 ||
      || wsrep_notify_cmd ||  ||
      || wsrep_on || ON ||
      || wsrep_provider || /usr/lib64/galera/libgalera_smm.so ||
      || wsrep_provider_options || base_host = xx.xxx.xx.x2; base_port = 4567; cert.log_conflicts = no; debug = no; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT15S; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 2; evs.version = 0; evs.view_forget_timeout = P1D; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://0.0.0.0:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = xx.xxx.xx.x2; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 5; socket.checksum = 2;  ||
      || wsrep_recover || OFF ||
      || wsrep_replicate_myisam || OFF ||
      || wsrep_restart_slave || OFF ||
      || wsrep_retry_autocommit || 1 ||
      || wsrep_slave_threads || 1 ||
      || wsrep_sst_auth ||  ||
      || wsrep_sst_donor ||  ||
      || wsrep_sst_donor_rejects_queries || OFF ||
      || wsrep_sst_method || rsync ||
      || wsrep_sst_receive_address || AUTO ||
      || wsrep_start_position || 8e663ba7-f123-11e3-88dc-dfe448d1c69c:1339436 ||
      

      Our DB nodes' auto_increment settings are as follows.
      Node #1

      MariaDB [test]> show variables like '%auto_increment%';
      +------------------------------+-------+
      | Variable_name                | Value |
      +------------------------------+-------+
      | auto_increment_increment     | 3     |
      | auto_increment_offset        | 1     |
      | wsrep_auto_increment_control | ON    |
      +------------------------------+-------+
      

      Node #2

      MariaDB [test]> show variables like '%auto_increment%';
      +------------------------------+-------+
      | Variable_name                | Value |
      +------------------------------+-------+
      | auto_increment_increment     | 3     |
      | auto_increment_offset        | 2     |
      | wsrep_auto_increment_control | ON    |
      +------------------------------+-------+
      

      Node #3

      MariaDB [test]> show variables like '%auto_increment%';
      +------------------------------+-------+
      | Variable_name                | Value |
      +------------------------------+-------+
      | auto_increment_increment     | 3     |
      | auto_increment_offset        | 3     |
      | wsrep_auto_increment_control | ON    |
      +------------------------------+-------+
      

      This configuration matches the one described in the post on blog.mariadb.org ( https://blog.mariadb.org/auto-increments-in-galera/ ).
      However, in our system, deadlocks still occur while running update logic inside a transaction.
      Our system consists of 2 agents (active-active) and 3 DB nodes (Galera Cluster).
      If agents 1 and 2 both connect to a single node (e.g. DB node 1), no deadlock occurs.
      But if agent 1 connects to DB node 1 and agent 2 connects to DB node 2, deadlocks occur.
      What is the problem, and how can I solve this deadlock?
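
To make the auto_increment settings above concrete, here is a minimal sketch of the values each node would generate, assuming the usual rule that generated values follow auto_increment_offset + N × auto_increment_increment. The function name `auto_increment_values` is hypothetical and exists only for this illustration. The sketch shows only that generated keys do not collide across nodes; it says nothing about two nodes updating the same existing row, which is a separate kind of conflict.

```python
def auto_increment_values(offset, increment, count):
    """Return the first `count` values of the series offset, offset + increment, ..."""
    return [offset + n * increment for n in range(count)]


# Offsets taken from the three SHOW VARIABLES outputs above; increment is 3 on every node.
node_offsets = {"Node #1": 1, "Node #2": 2, "Node #3": 3}
sequences = {
    name: auto_increment_values(offset, 3, 5)
    for name, offset in node_offsets.items()
}

for name, values in sequences.items():
    print(name, values)

# No value appears on more than one node, so INSERTs on different nodes
# cannot generate the same auto-increment key.
all_values = [v for values in sequences.values() for v in values]
keys_are_disjoint = len(all_values) == len(set(all_values))
print("disjoint:", keys_are_disjoint)
```

Because the series are disjoint, the deadlocks reported here are unlikely to come from auto-increment key collisions on INSERT, which is consistent with the reporter seeing them in update logic.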

            Activity

            nirbhay_c Nirbhay Choubey added a comment -

            shin
            It looks like the auto_increment_offset values are the same for Node#1 and Node#3.
            This might cause a deadlock. Were they set manually?

            shin shin added a comment -

            @Nirbhay Choubey
            Sorry, that was my mistake.
            The auto_increment_offset value of Node#3 is '3', not '1'.
            Each node actually has a different value.

            danblack Daniel Black added a comment -

            shin, deadlocks are usually an application problem, caused by updating the same rows concurrently.

            https://dev.mysql.com/doc/refman/5.6/en/innodb-deadlocks.html

            With Galera, deadlocks will occur on commit rather than earlier, as they would on a single instance. Perhaps your application isn't checking for a deadlock on commit?

            If you still think this is a bug, can you provide the table structures from 'show create table x' and the SQL update statements issued by the applications (taken from the binlogs with annotations turned on, if it is a closed-source application)?
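
As a rough illustration of "looking for a deadlock on commit", the sketch below retries a whole transaction when it fails with error 1213 (ER_LOCK_DEADLOCK), the code MariaDB uses for "Deadlock found when trying to get lock; try restarting transaction". It is not tied to any real driver: `DatabaseError`, `run_with_deadlock_retry`, and `flaky_transaction` are hypothetical names, and the callable is assumed to run the statements and the COMMIT together so that a failure at commit time is caught as well.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MariaDB/MySQL error code for a detected deadlock


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver's error type; carries the server error code."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_deadlock_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction() and retry it when it fails with a deadlock error.

    `transaction` is assumed to execute its statements and the COMMIT,
    so a deadlock raised at commit time is retried like any other.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Re-raise anything that is not a deadlock, or the final failed attempt.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * attempt)


# Simulated transaction: fails with a deadlock twice, then succeeds.
attempts = []


def flaky_transaction():
    attempts.append(1)
    if len(attempts) < 3:
        raise DatabaseError(ER_LOCK_DEADLOCK, "Deadlock found when trying to get lock")
    return "committed"


result = run_with_deadlock_retry(flaky_transaction)
print(result, "after", len(attempts), "attempts")
```

Retrying the entire transaction, rather than only the COMMIT, matters here: after a deadlock error the transaction has been rolled back, so its statements have to be issued again.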

            nirbhay_c Nirbhay Choubey added a comment -

            shin, as Daniel Black suggested, if you still think the issue is related to the server, kindly reopen it and provide us with the table structure (SHOW CREATE TABLE <table-name>) and the queries that cause the deadlock.


              People

              • Assignee: nirbhay_c Nirbhay Choubey
              • Reporter: shin shin
              • Votes: 0
              • Watchers: 3
