Details
-
Type:
Bug
-
Status: Closed
-
Priority:
Major
-
Resolution: Fixed
-
Affects Version/s: None
-
Fix Version/s: 5.5.33a-galera
-
Component/s: None
-
Labels:
Description
It might well be some other Galera limitation, but I haven't found which particular one it could be. The flow is mainly HANDLER flow, and also some DML and ALTER TABLE.
I execute it on two nodes in a 3-node cluster, 1 thread per node. Soon after start, the following message appears in the server error log(s):
safe_mutex: Found wrong usage of mutex 'LOCK_thd_data' and 'LOCK_wsrep_thd' Mutex currently locked (in reverse order): LOCK_wsrep_thd maria-5.5-galera/sql/sql_class.cc line 858 LOCK_thd_data maria-5.5-galera/sql/sql_base.cc line 9348 rwlock->lock maria-5.5-galera/mysys/thr_rwlock.c line 392
Either immediately or soon afterwards, both nodes stop, at least one on ALTER table, and another one on ALTER or on a DML statement.
ALTER from a client on one of nodes is shown as Killed, but doesn't disappear from the processlist. ALTER from a replication thread is waiting for a lock:
+----+-------------+-----------------+------+---------+------+---------------------------------+----------------------------------------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------------+------+---------+------+---------------------------------+----------------------------------------------------------------------------------------------+----------+ | 1 | system user | | test | Sleep | 484 | Waiting for table metadata lock | ALTER TABLE `table1000_innodb_int_autoinc` DROP COLUMN filler | 0.000 | | 2 | system user | | NULL | Sleep | 566 | wsrep aborter idle | NULL | 0.000 | | 6 | root | localhost:43415 | test | Killed | 484 | checking permissions | ALTER TABLE `table1000_innodb_int_autoinc` ADD COLUMN filler VARCHAR(255) DEFAULT ' filler ' | 0.000 | | 11 | root | localhost:43428 | test | Query | 0 | sleeping | show full processlist | 0.000 | +----+-------------+-----------------+------+---------+------+---------------------------------+----------------------------------------------------------------------------------------------+----------+
Another node is also waiting for something:
+----+-------------+-----------------+------+---------+------+--------------------+--------------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------------+------+---------+------+--------------------+--------------------------------------------------------------------+----------+ | 1 | system user | | NULL | Sleep | 681 | wsrep aborter idle | NULL | 0.000 | | 2 | system user | | NULL | Sleep | 557 | committed 91 | NULL | 0.000 | | 10 | root | localhost:41929 | test | Query | 543 | query end | UPDATE IGNORE `table1000_innodb_int_autoinc` SET `col_int_key` = 0 | 0.000 | | 18 | root | localhost:41947 | test | Query | 0 | sleeping | show full processlist | 0.000 | +----+-------------+-----------------+------+---------+------+--------------------+--------------------------------------------------------------------+----------+
Selected threads:
Thread 2 (Thread 0x7f8abc061700 (LWP 20603)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f8ac7c2f7de in _gcs_sm_enqueue_common (cond=0x7f8abc05e980, sm=0x7f8ac9c0e010) at gcs/src/gcs_sm.h:160 #2 gcs_sm_enter (scheduled=<optimized out>, cond=0x7f8abc05e980, sm=0x7f8ac9c0e010) at gcs/src/gcs_sm.h:238 #3 gcs_repl (conn=0x3c29490, act=0x7f8abc05ecf0, scheduled=<optimized out>) at gcs/src/gcs.c:1490 #4 0x00007f8ac7c81b1c in repl (scheduled=true, act=..., this=0x3c23a00) at galera/src/gcs.hpp:106 #5 galera::ReplicatorSMM::replicate (this=0x3c23510, trx=0x7f8a6c01c800) at galera/src/replicator_smm.cpp:552 #6 0x00007f8ac7c993ff in galera_pre_commit (gh=<optimized out>, conn_id=<optimized out>, trx_handle=<optimized out>, rbr_data=0x7f8a6c1ce620, rbr_data_len=11293, flags=1, global_seqno=0x504a1a8) at galera/src/wsrep_provider.cpp:388 #7 0x0000000000771833 in wsrep_run_wsrep_commit (thd=0x5047dc0, hton=0x3c49160, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:338 #8 0x0000000000770685 in wsrep_prepare (hton=0x3c49160, thd=0x5047dc0, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:119 #9 0x00000000007e7988 in ha_commit_trans (thd=0x5047dc0, all=false) at maria-5.5-galera/sql/handler.cc:1294 #10 0x0000000000730221 in trans_commit_stmt (thd=0x5047dc0) at maria-5.5-galera/sql/transaction.cc:325 #11 0x0000000000621031 in mysql_execute_command (thd=0x5047dc0) at maria-5.5-galera/sql/sql_parse.cc:4903 #12 0x0000000000624da9 in mysql_parse (thd=0x5047dc0, rawbuf=0x7f8a6c00e078 "UPDATE IGNORE `table1000_innodb_int_autoinc` SET `col_int_key` = 0", length=66, parser_state=0x7f8abc060550) at maria-5.5-galera/sql/sql_parse.cc:6304 #13 0x0000000000623e70 in wsrep_mysql_parse (thd=0x5047dc0, rawbuf=0x7f8a6c00e078 "UPDATE IGNORE `table1000_innodb_int_autoinc` SET `col_int_key` = 0", length=66, parser_state=0x7f8abc060550) at maria-5.5-galera/sql/sql_parse.cc:6069 #14 0x0000000000616335 in dispatch_command (command=COM_QUERY, thd=0x5047dc0, packet=0x4fed231 "UPDATE IGNORE `table1000_innodb_int_autoinc` SET `col_int_key` = 0 ", packet_length=67) at maria-5.5-galera/sql/sql_parse.cc:1245 #15 0x0000000000615114 in do_command (thd=0x5047dc0) at maria-5.5-galera/sql/sql_parse.cc:891 #16 0x000000000071ddd7 in do_handle_one_connection (thd_arg=0x5047dc0) at maria-5.5-galera/sql/sql_connect.cc:1291 #17 0x000000000071d7af in handle_one_connection (arg=0x5047dc0) at maria-5.5-galera/sql/sql_connect.cc:1199 #18 0x00007f8ac92cce9a in start_thread (arg=0x7f8abc061700) at pthread_create.c:308 #19 0x00007f8ac89fdcbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #20 0x0000000000000000 in ?? ()
Thread 21 (Thread 0x7f37490d6700 (LWP 20387)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215 #1 0x0000000000cfcb90 in safe_cond_timedwait (cond=0x7f3740000c88, mp=0x7f3740000be0, abstime=0x7f37490d03c0, file=0xd9a3c8 "maria-5.5-galera/sql/mdl.cc", line=1202) at maria-5.5-galera/mysys/thr_mutex.c:547 #2 0x0000000000725fc1 in inline_mysql_cond_timedwait (that=0x7f3740000c88, mutex=0x7f3740000be0, abstime=0x7f37490d03c0, src_file=0xd9a3c8 "maria-5.5-galera/sql/mdl.cc", src_line=1202) at maria-5.5-galera/include/mysql/psi/mysql_thread.h:1018 #3 0x0000000000727095 in MDL_wait::timed_wait (this=0x7f3740000be0, thd=0x7f3740000af0, abs_timeout=0x7f37490d03c0, set_status_on_timeout=false, wait_state_name=0xd9a2e8 "Waiting for table metadata lock") at maria-5.5-galera/sql/mdl.cc:1202 #4 0x0000000000728bc0 in MDL_context::acquire_lock (this=0x7f3740000be0, mdl_request=0x7f37490d0480, lock_wait_timeout=31536000) at maria-5.5-galera/sql/mdl.cc:2209 #5 0x000000000072929a in MDL_context::upgrade_shared_lock_to_exclusive (this=0x7f3740000be0, mdl_ticket=0x7f3740085840, lock_wait_timeout=31536000) at maria-5.5-galera/sql/mdl.cc:2389 #6 0x00000000005b7f1a in wait_while_table_is_used (thd=0x7f3740000af0, table=0x7f37400435a0, function=HA_EXTRA_PREPARE_FOR_RENAME) at maria-5.5-galera/sql/sql_base.cc:2347 #7 0x00000000006bf67f in mysql_alter_table (thd=0x7f3740000af0, new_db=0x7f374000ced0 "test", new_name=0x7f374000c868 "table1000_innodb_int_autoinc", create_info=0x7f37490d1f90, table_list=0x7f374000c908, alter_info=0x7f37490d2090, order_num=0, order=0x0, ignore=false, require_online=false) at maria-5.5-galera/sql/sql_table.cc:7008 #8 0x000000000095e20e in Alter_table_statement::execute (this=0x7f374000cf00, thd=0x7f3740000af0) at maria-5.5-galera/sql/sql_alter.cc:122 #9 0x0000000000620c93 in mysql_execute_command (thd=0x7f3740000af0) at maria-5.5-galera/sql/sql_parse.cc:4844 #10 0x0000000000624da9 in mysql_parse (thd=0x7f3740000af0, rawbuf=0x7f374004ec39 "ALTER TABLE `table1000_innodb_int_autoinc` DROP COLUMN filler", length=61, parser_state=0x7f37490d2e80) at maria-5.5-galera/sql/sql_parse.cc:6304 #11 0x00000000008c4b01 in Query_log_event::do_apply_event (this=0x7f3740044be0, rli=0x7f3740010aa0, query_arg=0x7f374004ec39 "ALTER TABLE `table1000_innodb_int_autoinc` DROP COLUMN filler", q_len_arg=61) at maria-5.5-galera/sql/log_event.cc:3792 #12 0x00000000008c404f in Query_log_event::do_apply_event (this=0x7f3740044be0, rli=0x7f3740010aa0) at maria-5.5-galera/sql/log_event.cc:3531 #13 0x0000000000593196 in Log_event::apply_event (this=0x7f3740044be0, rli=0x7f3740010aa0) at maria-5.5-galera/sql/log_event.h:1230 #14 0x000000000062948d in wsrep_apply_rbr (thd=0x7f3740000af0, rbr_buf=0x7f374001f170 "\350\\\"Q\002", buf_len=0) at maria-5.5-galera/sql/sql_parse.cc:8131 #15 0x0000000000629a79 in wsrep_apply_cb (ctx=0x7f3740000af0, buf=0x7f374001f170, buf_len=129, global_seqno=90) at maria-5.5-galera/sql/sql_parse.cc:8210 #16 0x00007f375431eabf in apply_wscoll (trx=..., apply_cb=0x6299cb <wsrep_apply_cb(void*, void const*, unsigned long, long)>, recv_ctx=0x7f3740000af0) at galera/src/replicator_smm.cpp:37 #17 apply_trx_ws (recv_ctx=0x7f3740000af0, apply_cb=0x6299cb <wsrep_apply_cb(void*, void const*, unsigned long, long)>, commit_cb=0x629cd6 <wsrep_commit_cb(void*, long, bool)>, trx=...) at galera/src/replicator_smm.cpp:81 #18 0x00007f3754324030 in galera::ReplicatorSMM::apply_trx (this=<optimized out>, recv_ctx=0x7f3740000af0, trx=0x7f3740022ff0) at galera/src/replicator_smm.cpp:470 #19 0x00007f37543247c0 in galera::ReplicatorSMM::process_trx (this=0x367f540, recv_ctx=0x7f3740000af0, trx=0x7f3740022ff0) at galera/src/replicator_smm.cpp:1056 #20 0x00007f3754301d34 in galera::GcsActionSource::dispatch (this=0x367fac8, recv_ctx=0x7f3740000af0, act=...) at galera/src/gcs_action_source.cpp:110 #21 0x00007f37543029b3 in galera::GcsActionSource::process (this=0x367fac8, recv_ctx=0x7f3740000af0) at galera/src/gcs_action_source.cpp:168 #22 0x00007f375432637d in async_recv (recv_ctx=0x7f3740000af0, this=0x367f540) at galera/src/replicator_smm.cpp:375 #23 galera::ReplicatorSMM::async_recv (this=0x367f540, recv_ctx=0x7f3740000af0) at galera/src/replicator_smm.cpp:356 #24 0x00007f3754339cae in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:201 #25 0x0000000000629ff0 in wsrep_replication_process (thd=0x7f3740000af0) at maria-5.5-galera/sql/sql_parse.cc:8363 #26 0x00000000005693ff in start_wsrep_THD (arg=0x629f6d) at maria-5.5-galera/sql/mysqld.cc:4832 #27 0x00007f375596ee9a in start_thread (arg=0x7f37490d6700) at pthread_create.c:308 #28 0x00007f375509fcbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #29 0x0000000000000000 in ?? () Thread 2 (Thread 0x7f374810e700 (LWP 20641)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f3754320985 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56 #2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=0x367ffd8, obj=...) at galera/src/monitor.hpp:126 #3 0x00007f3754322584 in galera::ReplicatorSMM::to_isolation_begin (this=0x367f540, trx=0x7f3708263200) at galera/src/replicator_smm.cpp:961 #4 0x00007f375433c627 in galera_to_execute_start (gh=<optimized out>, conn_id=6, key=<optimized out>, key_len=1, query=0x7f370825f090, query_len=160, global_seqno=0x4b0c958) at galera/src/wsrep_provider.cpp:579 #5 0x0000000000775c25 in wsrep_TOI_begin (thd=0x4b0a570, db_=0x0, table_=0x0, table_list=0x7f37082f8328) at maria-5.5-galera/sql/wsrep_mysqld.cc:1091 #6 0x0000000000776a0a in wsrep_to_isolation_begin (thd=0x4b0a570, db_=0x0, table_=0x0, table_list=0x7f37082f8328) at maria-5.5-galera/sql/wsrep_mysqld.cc:1227 #7 0x000000000095e0f0 in Alter_table_statement::execute (this=0x7f37082f8ab8, thd=0x4b0a570) at maria-5.5-galera/sql/sql_alter.cc:110 #8 0x0000000000620c93 in mysql_execute_command (thd=0x4b0a570) at maria-5.5-galera/sql/sql_parse.cc:4844 #9 0x0000000000624da9 in mysql_parse (thd=0x4b0a570, rawbuf=0x7f37082f8138 "ALTER TABLE `table1000_innodb_int_autoinc` ADD COLUMN filler VARCHAR(255) DEFAULT ' filler '", length=92, parser_state=0x7f374810d550) at maria-5.5-galera/sql/sql_parse.cc:6304 #10 0x0000000000623e70 in wsrep_mysql_parse (thd=0x4b0a570, rawbuf=0x7f37082f8138 "ALTER TABLE `table1000_innodb_int_autoinc` ADD COLUMN filler VARCHAR(255) DEFAULT ' filler '", length=92, parser_state=0x7f374810d550) at maria-5.5-galera/sql/sql_parse.cc:6069 #11 0x0000000000616335 in dispatch_command (command=COM_QUERY, thd=0x4b0a570, packet=0x4b0e251 "ALTER TABLE `table1000_innodb_int_autoinc` ADD COLUMN filler VARCHAR(255) DEFAULT ' filler '", packet_length=92) at maria-5.5-galera/sql/sql_parse.cc:1245 #12 0x0000000000615114 in do_command (thd=0x4b0a570) at maria-5.5-galera/sql/sql_parse.cc:891 #13 0x000000000071ddd7 in do_handle_one_connection (thd_arg=0x4b0a570) at maria-5.5-galera/sql/sql_connect.cc:1291 #14 0x000000000071d7af in handle_one_connection (arg=0x4b0a570) at maria-5.5-galera/sql/sql_connect.cc:1199 #15 0x00007f375596ee9a in start_thread (arg=0x7f374810e700) at pthread_create.c:308 #16 0x00007f375509fcbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #17 0x0000000000000000 in ?? ()
Server command lines:
mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data1 --tmpdir=/home/elenst/maria-5.5-galera/data1/tmp --port=8306 --socket=/home/elenst/maria-5.5-galera/data1/tmp/node1.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm:// --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data1/log.err --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --general-log=1 mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data2 --tmpdir=/home/elenst/maria-5.5-galera/data2/tmp --port=8307 --socket=/home/elenst/maria-5.5-galera/data2/tmp/node2.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4566 --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data2/log.err --innodb_autoinc_lock_mode=2 --innodb_locks_unsafe_for_binlog=1 --general-log=1
To reproduce using RQG:
# On one node: perl ./gentest.pl --gendata=conf/engines/handler.zz --threads=1 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=8306:user=root:database=test --grammar=conf/engines/handler.yy --engine=innodb # On another node, after the first node creates the table and starts the flow: perl ./gentest.pl --threads=1 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=8307:user=root:database=test --grammar=conf/engines/handler.yy
revision-id: seppo.jaakola@codership.com-20130216222240-syypg8kc355qcpsp revno: 3380 branch-nick: maria-5.5-galera
Gliffy Diagrams
Attachments
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
The safe_mutex issue has been fixed with earlier releases (cannot pinpoint which one exactly). However, this randgen handler.yy load causes unresolved MDL conflict due to aborted handler session.
This deadlock is avoided in fixes for: https://bugs.launchpad.net/codership-mysql/+bug/1233353