MariaDB Server / MDEV-7217

MariaDB Node crashes with WSREP: SQL statement was ineffective

    Details

    • Type: Bug
    • Status: Stalled
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.0.19-galera, 5.5.38
    • Fix Version/s: None
    • Component/s: Galera
    • Labels:
    • Environment:
      Centos 6.5

      Description

      Hi there.

      I've encountered an issue with my Galera Cluster.

      I have 4 nodes clustered across 2 physical locations, with a substantial link between them.

      I've recently had 2 node crashes (different nodes, same error message).

      I'm getting the following in the server log:

      141124 7:26:54 [Warning] WSREP: SQL statement was ineffective, THD: 832, buf: 175
      QUERY: commit
      => Skipping replication
      141124 7:26:54 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
      141124 7:26:54 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.

      To report this bug, see http://kb.askmonty.org/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 5.5.37-MariaDB-wsrep-log
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=178
      max_threads=501
      thread_count=76
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1225846 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x0x7f1d92e94000
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f30a38d9c38 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xa9226b]
      /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x6ea9d8]
      /lib64/libpthread.so.0[0x349440f710]
      /lib64/libc.so.6(gsignal+0x35)[0x3494032925]
      /lib64/libc.so.6(abort+0x175)[0x3494034105]
      /usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x2d9)[0x7f31e956b4d9]
      /usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x2e)[0x7f31e958628e]
      /usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x68)[0x7f31e959fd88]
      /usr/sbin/mysqld[0x691375]
      /usr/sbin/mysqld[0x691488]
      /usr/sbin/mysqld(_Z17ha_rollback_transP3THDb+0xe6)[0x6ed4c6]
      /usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x1e2)[0x6ed7d2]
      /usr/sbin/mysqld(_Z12trans_commitP3THD+0x45)[0x66ce65]
      /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x2c80)[0x5a7df0]
      /usr/sbin/mysqld[0x5abf94]
      /usr/sbin/mysqld[0x5ac37b]
      /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1889)[0x5adfd9]
      /usr/sbin/mysqld(_Z10do_commandP3THD+0x11c)[0x5ae73c]
      /usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xa7)[0x690457]
      /usr/sbin/mysqld[0x6c2115]
      /lib64/libpthread.so.0[0x34944079d1]
      /lib64/libc.so.6(clone+0x6d)[0x34940e8b5d]

      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f1d9558b018): is an invalid pointer
      Connection ID (thread ID): 832
      Status: NOT_KILLED

      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off

      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      141124 07:26:55 mysqld_safe Number of processes running now: 0
      141124 07:26:55 mysqld_safe WSREP: not restarting wsrep node automatically
      141124 07:26:55 mysqld_safe mysqld from pid file /var/lib/mysql/mysql.pid ended
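
A possible reading of the log above: the backtrace goes from galera_post_rollback through ReplicatorSMM::post_rollback into FSM::shift_to and then abort, and the ERROR line names the transition ROLLED_BACK -> ROLLED_BACK. That suggests the transaction's state machine was asked to enter ROLLED_BACK while already in that state, and that an unlisted transition is treated as fatal. The Python sketch below illustrates only that mechanism. The class names, every state other than ROLLED_BACK, and the transition table are assumptions made for illustration, not Galera's actual definitions.

```python
from enum import Enum


class TrxState(Enum):
    # Only ROLLED_BACK appears in the log; the other states are hypothetical.
    EXECUTING = "EXECUTING"
    MUST_ABORT = "MUST_ABORT"
    ABORTING = "ABORTING"
    ROLLED_BACK = "ROLLED_BACK"
    COMMITTED = "COMMITTED"


class InvalidTransition(RuntimeError):
    """Stands in for the point where the real library logs and calls abort()."""


class TrxFSM:
    # Hypothetical transition table. Note that there is no
    # (ROLLED_BACK, ROLLED_BACK) entry, so a second rollback is rejected.
    ALLOWED = {
        (TrxState.EXECUTING, TrxState.COMMITTED),
        (TrxState.EXECUTING, TrxState.MUST_ABORT),
        (TrxState.MUST_ABORT, TrxState.ABORTING),
        (TrxState.ABORTING, TrxState.ROLLED_BACK),
    }

    def __init__(self):
        self.state = TrxState.EXECUTING

    def shift_to(self, new_state):
        # Reject any transition that is not listed in the table.
        if (self.state, new_state) not in self.ALLOWED:
            raise InvalidTransition(
                f"FSM: no such a transition {self.state.value} -> {new_state.value}"
            )
        self.state = new_state
```

In this sketch, moving a TrxFSM through MUST_ABORT and ABORTING to ROLLED_BACK succeeds, and a second shift_to(TrxState.ROLLED_BACK) raises InvalidTransition with the same message text as the ERROR line. Whether the server really issues the rollback hook twice for one transaction is not established by this report.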

      The following warning appears multiple times in the logs of both servers:

      [Warning] WSREP: SQL statement was ineffective, THD: 832, buf: 175
      QUERY: commit
      => Skipping replication

      I've definitely got all 4 nodes configured with binlog_format=ROW and no auto_increment settings, so I don't know what's causing this to occur during normal operations.
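
To see how often the warning occurs and whether it comes from the same connection that later crashes, the error log can be summarized with a short script. The following is a minimal Python sketch that assumes the log line formats shown in this report; the function name summarize_wsrep_log and the two pattern names are hypothetical helpers, not part of MariaDB or Galera.

```python
import re
from collections import Counter

# Patterns assume the exact message wording quoted in this report.
WARNING_RE = re.compile(
    r"\[Warning\] WSREP: SQL statement was ineffective, THD: (\d+)"
)
FSM_ERROR_RE = re.compile(
    r"\[ERROR\] WSREP: FSM: no such a transition (\w+) -> (\w+)"
)


def summarize_wsrep_log(lines):
    """Count 'ineffective statement' warnings per THD and collect FSM errors."""
    warnings_per_thd = Counter()
    fsm_errors = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            warnings_per_thd[int(match.group(1))] += 1
            continue
        match = FSM_ERROR_RE.search(line)
        if match:
            # Store the (from_state, to_state) pair named in the error.
            fsm_errors.append((match.group(1), match.group(2)))
    return warnings_per_thd, fsm_errors
```

The function accepts any iterable of lines, so an open file object for the node's error log can be passed directly. Comparing the THD numbers in the warning counts with the "Connection ID (thread ID)" printed in the crash dump shows whether the crashing connection also produced the warnings.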


            Activity

            nirbhay_c Nirbhay Choubey added a comment -

            Joerg: Thanks for confirming that.
            Andrew: Since the bug does not seem to affect the latest versions, I am closing it.
            But do post the queries/transactions if you hit a crash again, so that I can confirm
            whether it's really fixed.

            Fos Andrew added a comment -

            @Nirbhay I've upgraded all nodes to 10.0.19 and I just got the exact same error on a node in preproduction.

            150701 23:10:45 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
            150701 23:10:45 [ERROR] mysqld got signal 6 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see http://kb.askmonty.org/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.0.19-MariaDB-wsrep-log
            key_buffer_size=134217728
            read_buffer_size=31457280
            max_used_connections=139
            max_threads=501
            thread_count=24
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 16552907 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x0x7f3a844ba008
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            stack_bottom = 0x7f3a6affec78 thread_stack 0x48000
            /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xba1e4b]
            /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x748948]
            /lib64/libpthread.so.0(+0xf710)[0x7f3b6e594710]
            /lib64/libc.so.6(gsignal+0x35)[0x7f3b6cbf3925]
            /lib64/libc.so.6(abort+0x175)[0x7f3b6cbf5105]
            /usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x2d9)[0x7f3b6656b4d9]
            /usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x2e)[0x7f3b6658628e]
            /usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x68)[0x7f3b6659fd88]
            /usr/sbin/mysqld[0x6e9e75]
            /usr/sbin/mysqld[0x6e9ee8]
            /usr/sbin/mysqld(_Z17ha_rollback_transP3THDb+0xde)[0x74b76e]
            /usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x200)[0x74bae0]
            /usr/sbin/mysqld(_Z12trans_commitP3THD+0x4c)[0x6b4f4c]
            /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1020)[0x5de1e0]
            /usr/sbin/mysqld[0x5e54f7]
            /usr/sbin/mysqld[0x5e5ebb]
            /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19fb)[0x5e7dfb]
            /usr/sbin/mysqld(_Z10do_commandP3THD+0x1e2)[0x5e86c2]
            /usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xa7)[0x6dd3b7]
            /usr/sbin/mysqld[0x71d99d]
            /lib64/libpthread.so.0(+0x79d1)[0x7f3b6e58c9d1]
            /lib64/libc.so.6(clone+0x6d)[0x7f3b6cca9b5d]

            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x7f3a6a9a5020): is an invalid pointer
            Connection ID (thread ID): 2363
            Status: NOT_KILLED

            Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

            The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            information that should help you find out what is causing the crash.
            150701 23:10:48 mysqld_safe Number of processes running now: 0
            150701 23:10:48 mysqld_safe WSREP: not restarting wsrep node automatically
            150701 23:10:48 mysqld_safe mysqld from pid file /var/lib/mysql/mysql.pid ended
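
To check that the 10.0.19 crash follows the same code path as the 5.5.37 one, the two backtraces can be compared by module and symbol name while ignoring offsets and addresses, which differ between builds. The following is a minimal Python sketch that assumes the frame layout printed in this report, path(symbol+0xoffset)[0xaddress]; frame_signature and call_path are hypothetical helper names.

```python
import re

# One backtrace frame: a path, an optional "(symbol+0xoffset)" part
# (the symbol may be empty), then the "[0xaddress]" suffix.
FRAME_RE = re.compile(
    r"^\s*(?P<module>/\S+?)"
    r"(?:\((?P<symbol>[^()+]*)\+0x[0-9a-f]+\))?"
    r"\s*\[0x[0-9a-f]+\]\s*$"
)


def frame_signature(line):
    """Return (module, symbol) for a frame line, or None for other lines."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    # Frames without a resolved symbol get an empty string.
    return (match.group("module"), match.group("symbol") or "")


def call_path(trace_lines):
    """Reduce a backtrace to its ordered list of (module, symbol) pairs."""
    signatures = (frame_signature(line) for line in trace_lines)
    return [sig for sig in signatures if sig is not None]
```

If call_path gives the same list for both traces, the crashes went through the same sequence of functions, here trans_commit, ha_commit_trans, ha_rollback_trans, galera_post_rollback and finally FSM::shift_to. Unnamed frames such as /usr/sbin/mysqld[0x691375] compare equal regardless of address, so a match on those frames is weaker evidence than a match on the named ones.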

            nirbhay_c Nirbhay Choubey added a comment - - edited

            Andrew: OK, thanks. I've reopened this issue for further investigation.

            Fos Andrew added a comment -

            @Nirbhay Thanks. I've enabled wsrep_debug and general logging so I can try to get additional information.

            nirbhay_c Nirbhay Choubey added a comment -

            Andrew Thanks! Please update when you find something of interest.


              People

              • Assignee: nirbhay_c Nirbhay Choubey
              • Reporter: Fos Andrew
              • Votes: 0
              • Watchers: 5
