Details
-
Type:
Bug
-
Status: Closed
-
Priority:
Minor
-
Resolution: Won't Fix
-
Affects Version/s: None
-
Fix Version/s: None
-
Component/s: None
-
Labels:
Description
I have a 3-node cluster. On two nodes I run 1-thread DML flow, the third node is a slave only.
Soon after the workflow starts, the nodes hang, seemingly forever.
Node 1 process list:
+----+-------------+-----------------+------+---------+------+---------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------------+------+---------+------+---------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+ | 1 | system user | | NULL | Sleep | 98 | Write_rows_log_event::write_row(1493) | NULL | 0.000 | | 2 | system user | | NULL | Sleep | 1269 | wsrep aborter idle | NULL | 0.000 | | 10 | root | localhost:41086 | test | Query | 92 | query end | INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 6 | 0.000 | | 19 | root | localhost:41101 | NULL | Query | 0 | sleeping | show full processlist | 0.000 | +----+-------------+-----------------+------+---------+------+---------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
Node 2 process list:
+----+-------------+-----------------+------+---------+------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +----+-------------+-----------------+------+---------+------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+ | 1 | system user | | NULL | Sleep | 151 | committed 1494 | NULL | 0.000 | | 2 | system user | | NULL | Sleep | 1296 | wsrep aborter idle | NULL | 0.000 | | 7 | root | localhost:42574 | test | Query | 157 | query end | DELETE LOW_PRIORITY QUICK FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1 | 0.000 | | 9 | root | localhost:42584 | NULL | Query | 0 | sleeping | show full processlist | 0.000 | +----+-------------+-----------------+------+---------+------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
Node 1 stack trace (selected threads):
Thread 21 (Thread 0x7fd40959a700 (LWP 14628)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x0000000000cfc88e in safe_cond_wait (cond=0x26bd3e0, mp=0x26bd330, file=0xeb8cb8 "maria-5.5-galera/storage/xtradb/os/os0sync.c", line=207) at maria-5.5-galera/mysys/thr_mutex.c:493 #2 0x0000000000c607a6 in os_cond_wait (cond=0x26bd3e0, mutex=0x26bd330) at maria-5.5-galera/storage/xtradb/os/os0sync.c:207 #3 0x0000000000c60fbb in os_event_wait_low (event=0x26bd330, reset_sig_count=40) at maria-5.5-galera/storage/xtradb/os/os0sync.c:609 #4 0x0000000000b32f74 in srv_suspend_mysql_thread (thr=0x7fd3c02475e0) at maria-5.5-galera/storage/xtradb/srv/srv0srv.c:1917 #5 0x0000000000b07ab2 in row_mysql_handle_errors (new_err=0x7fd409596510, trx=0x7fd4040182c8, thr=0x7fd3c02475e0, savept=0x0) at maria-5.5-galera/storage/xtradb/row/row0mysql.c:599 #6 0x0000000000b086dd in row_lock_table_autoinc_for_mysql (prebuilt=0x7fd3c0246ba8) at maria-5.5-galera/storage/xtradb/row/row0mysql.c:1063 #7 0x0000000000acca7b in ha_innobase::innobase_lock_autoinc (this=0x7fd3c0244798) at maria-5.5-galera/storage/xtradb/handler/ha_innodb.cc:6364 #8 0x0000000000accb44 in ha_innobase::innobase_set_max_autoinc (this=0x7fd3c0244798, auto_inc=1162) at maria-5.5-galera/storage/xtradb/handler/ha_innodb.cc:6415 #9 0x0000000000acd5db in ha_innobase::write_row (this=0x7fd3c0244798, record=0x7fd3c0242b18 <incomplete sequence \347>) at maria-5.5-galera/storage/xtradb/handler/ha_innodb.cc:6705 #10 0x00000000007efe87 in handler::ha_write_row (this=0x7fd3c0244798, buf=0x7fd3c0242b18 <incomplete sequence \347>) at maria-5.5-galera/sql/handler.cc:5232 #11 0x00000000008d4d79 in Rows_log_event::write_row (this=0x7fd40402b260, rli=0x7fd404010aa0, overwrite=false) at maria-5.5-galera/sql/log_event.cc:9949 #12 0x00000000008d4ee7 in Write_rows_log_event::do_exec_row (this=0x7fd40402b260, rli=0x7fd404010aa0) at maria-5.5-galera/sql/log_event.cc:10120 #13 0x00000000008d0f61 in Rows_log_event::do_apply_event (this=0x7fd40402b260, rli=0x7fd404010aa0) at maria-5.5-galera/sql/log_event.cc:8516 #14 0x0000000000593196 in Log_event::apply_event (this=0x7fd40402b260, rli=0x7fd404010aa0) at maria-5.5-galera/sql/log_event.h:1230 #15 0x000000000062948d in wsrep_apply_rbr (thd=0x7fd404000af0, rbr_buf=0x7fd404023cf0 "x1!Q\023", buf_len=0) at maria-5.5-galera/sql/sql_parse.cc:8131 #16 0x0000000000629a79 in wsrep_apply_cb (ctx=0x7fd404000af0, buf=0x7fd404023cf0, buf_len=125, global_seqno=1493) at maria-5.5-galera/sql/sql_parse.cc:8210 #17 0x00007fd4147e2abf in apply_wscoll (trx=..., apply_cb=0x6299cb <wsrep_apply_cb(void*, void const*, unsigned long, long)>, recv_ctx=0x7fd404000af0) at galera/src/replicator_smm.cpp:37 #18 apply_trx_ws (recv_ctx=0x7fd404000af0, apply_cb=0x6299cb <wsrep_apply_cb(void*, void const*, unsigned long, long)>, commit_cb=0x629cd6 <wsrep_commit_cb(void*, long, bool)>, trx=...) at galera/src/replicator_smm.cpp:81 #19 0x00007fd4147e8030 in galera::ReplicatorSMM::apply_trx (this=<optimized out>, recv_ctx=0x7fd404000af0, trx=0x7fd404023460) at galera/src/replicator_smm.cpp:470 #20 0x00007fd4147e87c0 in galera::ReplicatorSMM::process_trx (this=0x21dc510, recv_ctx=0x7fd404000af0, trx=0x7fd404023460) at galera/src/replicator_smm.cpp:1056 #21 0x00007fd4147c5d34 in galera::GcsActionSource::dispatch (this=0x21dca98, recv_ctx=0x7fd404000af0, act=...) at galera/src/gcs_action_source.cpp:110 #22 0x00007fd4147c69b3 in galera::GcsActionSource::process (this=0x21dca98, recv_ctx=0x7fd404000af0) at galera/src/gcs_action_source.cpp:168 #23 0x00007fd4147ea37d in async_recv (recv_ctx=0x7fd404000af0, this=0x21dc510) at galera/src/replicator_smm.cpp:375 #24 galera::ReplicatorSMM::async_recv (this=0x21dc510, recv_ctx=0x7fd404000af0) at galera/src/replicator_smm.cpp:356 #25 0x00007fd4147fdcae in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:201 #26 0x0000000000629ff0 in wsrep_replication_process (thd=0x7fd404000af0) at maria-5.5-galera/sql/sql_parse.cc:8363 #27 0x00000000005693ff in start_wsrep_THD (arg=0x629f6d) at maria-5.5-galera/sql/mysqld.cc:4832 #28 0x00007fd415e32e9a in start_thread (arg=0x7fd40959a700) at pthread_create.c:308 #29 0x00007fd415563cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #30 0x0000000000000000 in ?? () Thread 2 (Thread 0x7fd4083b5700 (LWP 14913)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fd4147e4985 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56 #2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=0x21dcfa8, obj=...) at galera/src/monitor.hpp:126 #3 0x00007fd4147e73b0 in galera::ReplicatorSMM::pre_commit (this=0x21dc510, trx=0x7fd3c0111320) at galera/src/replicator_smm.cpp:739 #4 0x00007fd4147ff47f in galera_pre_commit (gh=<optimized out>, conn_id=<optimized out>, trx_handle=<optimized out>, rbr_data=<optimized out>, rbr_data_len=139, flags=1, global_seqno=0x35a5128) at galera/src/wsrep_provider.cpp:396 #5 0x0000000000771833 in wsrep_run_wsrep_commit (thd=0x35a2d40, hton=0x2202160, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:338 #6 0x0000000000770685 in wsrep_prepare (hton=0x2202160, thd=0x35a2d40, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:119 #7 0x00000000007e7988 in ha_commit_trans (thd=0x35a2d40, all=false) at maria-5.5-galera/sql/handler.cc:1294 #8 0x0000000000730221 in trans_commit_stmt (thd=0x35a2d40) at maria-5.5-galera/sql/transaction.cc:325 #9 0x0000000000621031 in mysql_execute_command (thd=0x35a2d40) at maria-5.5-galera/sql/sql_parse.cc:4903 #10 0x0000000000624da9 in mysql_parse (thd=0x35a2d40, rawbuf=0x7fd3c0015558 "INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_ch"..., length=261, parser_state=0x7fd4083b4550) at maria-5.5-galera/sql/sql_parse.cc:6304 #11 0x0000000000623e70 in wsrep_mysql_parse (thd=0x35a2d40, rawbuf=0x7fd3c0015558 "INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_ch"..., length=261, parser_state=0x7fd4083b4550) at maria-5.5-galera/sql/sql_parse.cc:6069 #12 0x0000000000616335 in dispatch_command (command=COM_QUERY, thd=0x35a2d40, packet=0x35c7c21 "INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_ch"..., packet_length=261) at maria-5.5-galera/sql/sql_parse.cc:1245 #13 0x0000000000615114 in do_command (thd=0x35a2d40) at maria-5.5-galera/sql/sql_parse.cc:891 #14 0x000000000071ddd7 in do_handle_one_connection (thd_arg=0x35a2d40) at maria-5.5-galera/sql/sql_connect.cc:1291 #15 0x000000000071d7af in handle_one_connection (arg=0x35a2d40) at maria-5.5-galera/sql/sql_connect.cc:1199 #16 0x00007fd415e32e9a in start_thread (arg=0x7fd4083b5700) at pthread_create.c:308 #17 0x00007fd415563cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #18 0x0000000000000000 in ?? ()
Node 2 stack trace:
Thread 2 (Thread 0x7f9b1c53e700 (LWP 14933)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f9b287017de in _gcs_sm_enqueue_common (cond=0x7f9b1c53b980, sm=0x7f9b2a6e0010) at gcs/src/gcs_sm.h:160 #2 gcs_sm_enter (scheduled=<optimized out>, cond=0x7f9b1c53b980, sm=0x7f9b2a6e0010) at gcs/src/gcs_sm.h:238 #3 gcs_repl (conn=0x3626590, act=0x7f9b1c53bcf0, scheduled=<optimized out>) at gcs/src/gcs.c:1490 #4 0x00007f9b28753b1c in repl (scheduled=true, act=..., this=0x3620a30) at galera/src/gcs.hpp:106 #5 galera::ReplicatorSMM::replicate (this=0x3620540, trx=0x7f9ae02f6cf0) at galera/src/replicator_smm.cpp:552 #6 0x00007f9b2876b3ff in galera_pre_commit (gh=<optimized out>, conn_id=<optimized out>, trx_handle=<optimized out>, rbr_data=0x7f9ae02a7f40, rbr_data_len=107, flags=1, global_seqno=0x4aa35d8) at galera/src/wsrep_provider.cpp:388 #7 0x0000000000771833 in wsrep_run_wsrep_commit (thd=0x4aa11f0, hton=0x36468b0, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:338 #8 0x0000000000770685 in wsrep_prepare (hton=0x36468b0, thd=0x4aa11f0, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:119 #9 0x00000000007e7988 in ha_commit_trans (thd=0x4aa11f0, all=false) at maria-5.5-galera/sql/handler.cc:1294 #10 0x0000000000730221 in trans_commit_stmt (thd=0x4aa11f0) at maria-5.5-galera/sql/transaction.cc:325 #11 0x0000000000621031 in mysql_execute_command (thd=0x4aa11f0) at maria-5.5-galera/sql/sql_parse.cc:4903 #12 0x0000000000624da9 in mysql_parse (thd=0x4aa11f0, rawbuf=0x7f9ae02cb358 "DELETE LOW_PRIORITY QUICK FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1", length=181, parser_state=0x7f9b1c53d550) at maria-5.5-galera/sql/sql_parse.cc:6304 #13 0x0000000000623e70 in wsrep_mysql_parse (thd=0x4aa11f0, rawbuf=0x7f9ae02cb358 "DELETE LOW_PRIORITY QUICK FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1", length=181, parser_state=0x7f9b1c53d550) at maria-5.5-galera/sql/sql_parse.cc:6069 #14 0x0000000000616335 in dispatch_command (command=COM_QUERY, thd=0x4aa11f0, packet=0x4aa4ed1 "DELETE LOW_PRIORITY QUICK FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1", packet_length=181) at maria-5.5-galera/sql/sql_parse.cc:1245 #15 0x0000000000615114 in do_command (thd=0x4aa11f0) at maria-5.5-galera/sql/sql_parse.cc:891 #16 0x000000000071ddd7 in do_handle_one_connection (thd_arg=0x4aa11f0) at maria-5.5-galera/sql/sql_connect.cc:1291 #17 0x000000000071d7af in handle_one_connection (arg=0x4aa11f0) at maria-5.5-galera/sql/sql_connect.cc:1199 #18 0x00007f9b29d9ee9a in start_thread (arg=0x7f9b1c53e700) at pthread_create.c:308 #19 0x00007f9b294cfcbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #20 0x0000000000000000 in ?? ()
Server command lines:
/home/elenst/maria-5.5-galera/sql/mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data1 --tmpdir=/home/elenst/maria-5.5-galera/data1/tmp --port=8306 --socket=/home/elenst/maria-5.5-galera/data1/tmp/node1.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm:// --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data1/log.err /home/elenst/maria-5.5-galera/sql/mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data2 --tmpdir=/home/elenst/maria-5.5-galera/data2/tmp --port=8307 --socket=/home/elenst/maria-5.5-galera/data2/tmp/node2.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4566 --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data2/log.err /home/elenst/maria-5.5-galera/sql/mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data3 --tmpdir=/home/elenst/maria-5.5-galera/data3/tmp --port=8308 --socket=/home/elenst/maria-5.5-galera/data2/tmp/node3.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4565 --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data3/log.err
To reproduce with RQG:
# Node 1: perl ./gentest.pl --gendata=conf/engines/engine_stress.zz --threads=1 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=8306:user=root:database=test --grammar=conf/engines/engine_stress.yy # Node 2 (after node 1 created the tables and started the flow): perl ./gentest.pl --threads=1 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=8307:user=root:database=test --grammar=conf/engines/engine_stress.yy
revision-id: seppo.jaakola@codership.com-20130216222240-syypg8kc355qcpsp revno: 3380 branch-nick: maria-5.5-galera
Built as
cmake . -DCMAKE_BUILD_TYPE=Debug && make
Gliffy Diagrams
Attachments
Activity
- All
- Comments
- Work Log
- History
- Activity
- Transitions
Attached all stack traces and error logs from both nodes (the bigger log.err is from node 1, the smaller one from node 2)