  1. MariaDB Server
  2. MDEV-6740

Galera crash in rpl_sql_thread_info/cached_charset_compare

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 10.0.13-galera
    • Fix Version/s: 10.0.14-galera
    • Component/s: Galera
    • Labels:
      None
    • Environment:
      RHEL 6.5 x86-64

      Description

      I'm doing INSERT on one node and UPDATE on another. It often leads to a crash on the node where I'm executing UPDATE.

      Thread pointer: 0x0x7f711aff3008
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f710b5b4ce0 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb9541b]
      /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x744b48]
      /lib64/libpthread.so.0[0x3166e0f710]
      /lib64/libc.so.6[0x3166b3f3c0]
      /usr/sbin/mysqld(_ZNK19rpl_sql_thread_info22cached_charset_compareEPc+0x20)[0x69b460]
      /usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEP14rpl_group_infoPKcj+0x86f)[0x802bdf]
      /usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvmjPK14wsrep_trx_meta+0x525)[0x6f1c05]
      /usr/lib64/galera/libgalera_smm.so(_ZNK6galera9TrxHandle5applyEPvPF15wsrep_cb_statusS1_PKvmjPK14wsrep_trx_metaERS6_+0xb1)[0x7f71409542c1]
      /usr/lib64/galera/libgalera_smm.so(+0x1aaf95)[0x7f714098bf95]
      /usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM10replay_trxEPNS_9TrxHandleEPv+0x12e)[0x7f714098c85e]
      /usr/lib64/galera/libgalera_smm.so(galera_replay_trx+0x5c)[0x7f71409a045c]
      /usr/sbin/mysqld(_Z24wsrep_replay_transactionP3THD+0x2de)[0x6f379e]
      /usr/sbin/mysqld[0x5e24b0]
      /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x16d0)[0x5e3c30]
      /usr/sbin/mysqld(_Z10do_commandP3THD+0x132)[0x5e4402]
      /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x54b)[0x6a31cb]
      /usr/sbin/mysqld(handle_one_connection+0x42)[0x6a32c2]
      /lib64/libpthread.so.0[0x3166e079d1]
      /lib64/libc.so.6(clone+0x6d)[0x3166ae886d]

      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f711b19f351): UPDATE t1 SET v=v+1 WHERE k=30
      Connection ID (thread ID): 5
      Status: NOT_KILLED

      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

      Tail of general log:

      140912 17:02:44 1 Query INSERT INTO t1 (k,j) VALUES (29,11)
      1 Query INSERT INTO t1 (k,j) VALUES (29,12)
      1 Query INSERT INTO t1 (k,j) VALUES (29,13)
      1 Query INSERT INTO t1 (k,j) VALUES (29,14)
      1 Query INSERT INTO t1 (k,j) VALUES (29,15)
      1 Query INSERT INTO t1 (k,j) VALUES (29,16)
      1 Query INSERT INTO t1 (k,j) VALUES (29,17)
      1 Query INSERT INTO t1 (k,j) VALUES (29,18)
      1 Query INSERT INTO t1 (k,j) VALUES (29,19)
      5 Query UPDATE t1 SET v=v+1 WHERE k=29
      1 Query INSERT INTO t1 (k,j) VALUES (30,0)
      1 Query INSERT INTO t1 (k,j) VALUES (30,1)
      1 Query INSERT INTO t1 (k,j) VALUES (30,2)
      1 Query INSERT INTO t1 (k,j) VALUES (30,3)
      1 Query INSERT INTO t1 (k,j) VALUES (30,4)
      1 Query INSERT INTO t1 (k,j) VALUES (30,5)
      1 Query INSERT INTO t1 (k,j) VALUES (30,6)
      1 Query INSERT INTO t1 (k,j) VALUES (30,7)
      1 Query INSERT INTO t1 (k,j) VALUES (30,8)
      1 Query INSERT INTO t1 (k,j) VALUES (30,9)
      1 Query INSERT INTO t1 (k,j) VALUES (30,10)
      1 Query INSERT INTO t1 (k,j) VALUES (30,11)
      1 Query INSERT INTO t1 (k,j) VALUES (30,12)
      1 Query INSERT INTO t1 (k,j) VALUES (30,13)
      1 Query INSERT INTO t1 (k,j) VALUES (30,14)
      1 Query INSERT INTO t1 (k,j) VALUES (30,15)
      1 Query INSERT INTO t1 (k,j) VALUES (30,16)
      1 Query INSERT INTO t1 (k,j) VALUES (30,17)
      5 Query UPDATE t1 SET v=v+1 WHERE k=30
      1 Query INSERT INTO t1 (k,j) VALUES (30,18)
      1 Query INSERT INTO t1 (k,j) VALUES (30,19)

            Activity

            nirbhay_c Nirbhay Choubey added a comment -

            Kolbe Kegel, I tried to come up with a simple test that does INSERT and UPDATE on 2 nodes, but it doesn't lead to a crash. Can you try it, perhaps with a modified table structure and queries, and see whether it reproduces the issue?

            kolbe Kolbe Kegel added a comment -

            In what way would you like me to modify the table structure or queries? I was easily able to reproduce this issue when doing my original testing ... so I'm not very inclined to modify my test unless you can give me some guidance about what you'd like me to do and why.

            Can I help you gather additional information about what is happening here?

            I did this testing on 3 EC2 instances, and reproducing the problem has been very easy. I can give you access to the EC2 instances if you'd like.

            Here's the table structure:

            CREATE TABLE `t1` (
              `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
              `k` int(11) DEFAULT NULL,
              `v` int(11) DEFAULT '0',
              `j` int(11) DEFAULT NULL,
              PRIMARY KEY (`id`),
              KEY `k` (`k`)
            ) ENGINE=InnoDB;

            kolbe Kolbe Kegel added a comment -

            Test program. Give an integer as the first positional parameter and it will be used as the value for wsrep_sync_wait.

            nirbhay_c Nirbhay Choubey added a comment -

            I was referring to the test script that I uploaded.

            kolbe Kolbe Kegel added a comment -

            For one thing, your table doesn't have a primary key, which is not good for Galera... and you create a new connection for each statement you send to the server, which adds a lot of overhead and may slow things down so much that this alone avoids the problem. Plus, your test program does all the inserts and then does the updates, which is definitely different from what I was doing... sorry for the confusion. You'll see I've attached my test program so you can get a better view of what I was doing.
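
For readers without access to the attachment, the shape of the workload described in this ticket can be sketched roughly as follows. This is not the attached test program. It is a minimal illustration that only generates the statement sequence, assuming two long-lived connections (labelled `node1` and `node2` here, both hypothetical names) and a fixed interleaving point that mirrors the k=30 portion of the general log above. In the real test the two statement streams presumably ran concurrently, so the position of the UPDATE relative to the INSERTs would vary. The function name `workload` and its parameters are also made up for this sketch.

```python
from typing import Iterator, Tuple


def workload(
    first_k: int,
    last_k: int,
    rows_per_k: int = 20,
    update_after: int = 18,
    sync_wait: int = 0,
) -> Iterator[Tuple[str, str]]:
    """Yield (connection label, SQL) pairs for an interleaved INSERT/UPDATE run.

    The labels are placeholders: each one stands for a single persistent
    connection to a different cluster node, reused for every statement.
    """
    # The updating session sets wsrep_sync_wait once, up front.
    yield ("node2", f"SET SESSION wsrep_sync_wait={sync_wait}")
    for k in range(first_k, last_k + 1):
        for j in range(rows_per_k):
            # INSERTs go to one node...
            yield ("node1", f"INSERT INTO t1 (k,j) VALUES ({k},{j})")
            # ...while the UPDATE for the same k is issued on the other node
            # part-way through (assumed fixed position; really a race).
            if j + 1 == update_after:
                yield ("node2", f"UPDATE t1 SET v=v+1 WHERE k={k}")
```

Each yielded pair would be sent over the matching persistent connection with whatever client library is at hand. The point of the sketch is only that the UPDATE on one node overlaps with INSERTs for the same `k` arriving from the other node, rather than running after all INSERTs have finished.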

            nirbhay_c Nirbhay Choubey added a comment -

            Ok, will try your script.

            nirbhay_c Nirbhay Choubey added a comment - http://lists.askmonty.org/pipermail/commits/2014-September/006606.html
            jplindst Jan Lindström added a comment -

            Ok to push.

            nirbhay_c Nirbhay Choubey added a comment - Pushed to maria-10.0-galera. http://bazaar.launchpad.net/~maria-captains/maria/maria-10.0-galera/revision/3893

              People

              • Assignee:
                nirbhay_c Nirbhay Choubey
                Reporter:
                kolbe Kolbe Kegel
              • Votes:
                0
                Watchers:
                3
