Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 10.0.15
- Fix Version/s: 10.0
- Component/s: Replication
- Labels: None
- Environment: Linux
Description
This has been observed only once so far and could not be reproduced
(even with several clients running transactions in parallel for days while the plugin was uninstalled and reinstalled every second):
150331 9:19:22 [Note] Semi-sync replication switched OFF.
150331 9:19:22 [Note] Semi-sync replication disabled on the master.
150331 9:19:22 [ERROR] mysqld got signal 11 ;
[...]
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb70d4b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x7257b8]
/lib64/libpthread.so.0[0x377040f710]
/usr/lib64/mysql/plugin/semisync_master.so(ActiveTranx::is_tranx_end_pos(char const*, unsigned long long)+0x24)[0x7fe91a5f9fe4]
/usr/lib64/mysql/plugin/semisync_master.so(ReplSemiSyncMaster::commitTrx(char const*, unsigned long long)+0x19e)[0x7fe91a5fab3e]
/usr/sbin/mysqld(Trans_delegate::after_commit(THD*, bool)+0xa2)[0x69b612]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool)+0x222)[0x728872]
/usr/sbin/mysqld(trans_commit_stmt(THD*)+0x1b)[0x6a350b]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x514)[0x5d16d4]
/usr/sbin/mysqld[0x5d79d2]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x1b20)[0x5d9b90]
/usr/sbin/mysqld(do_handle_one_connection(THD*)+0x453)[0x6956a3]
/usr/sbin/mysqld(handle_one_connection+0x42)[0x695772]
/lib64/libpthread.so.0[0x37704079d1]
/lib64/libc.so.6(clone+0x6d)[0x37700e8b6d]
My educated guess is a race condition between the plugin callbacks and the plugin teardown code. More specifically: I think that ReplSemiSyncMaster::commitTrx() is still registered as an after_commit callback, but either the plugin's transaction hash table has already been freed by the time the callback is invoked, or it is freed at just the "right" moment between the callback being invoked and it actually accessing the table.
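To make the suspected window concrete, here is a minimal sketch of the pattern, not the actual semisync code. It assumes the table of active transactions is owned through a pointer that teardown frees, and that the commit callback can still run afterwards. The class SemiSyncSketch and all of its members are hypothetical names invented for this illustration. Holding a lock and checking the pointer is only one possible way to close such a window.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <string>
#include <utility>

// Hypothetical model of the plugin state; NOT the real ReplSemiSyncMaster.
class SemiSyncSketch {
 public:
  using Pos = std::pair<std::string, unsigned long long>;

  // Plugin enabled: allocate the table of active transactions.
  void enable() {
    std::lock_guard<std::mutex> guard(mu_);
    active_.reset(new std::set<Pos>());
  }

  // Plugin teardown: free the table. A callback that is already in flight
  // may still try to use it afterwards.
  void disable() {
    std::lock_guard<std::mutex> guard(mu_);
    active_.reset();
  }

  // Record a transaction end position (ignored while disabled).
  void add(const std::string& file, unsigned long long pos) {
    std::lock_guard<std::mutex> guard(mu_);
    if (active_) active_->insert(Pos(file, pos));
  }

  // Stand-in for the lookup done from the commit callback. Without the lock
  // and the null check, a call racing with disable() would dereference a
  // freed table, which is the kind of crash suspected above.
  bool is_tranx_end_pos(const std::string& file, unsigned long long pos) {
    std::lock_guard<std::mutex> guard(mu_);
    if (!active_) return false;  // table already torn down
    return active_->count(Pos(file, pos)) != 0;
  }

 private:
  std::mutex mu_;
  std::unique_ptr<std::set<Pos>> active_;
};
```

In this sketch a lookup issued after disable() simply returns false instead of touching freed memory. Whether the real plugin needs a guard of this shape depends on how its callbacks are deregistered, which this example does not model.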
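Looking at the semisync code again, the "funny" part is that the above stack trace should never be seen with a production (i.e. non-debug) build of mysqld.
The last two stack trace lines before the signal handler kicks in were:
/usr/lib64/mysql/plugin/semisync_master.so(ActiveTranx::is_tranx_end_pos(char const*, unsigned long long)+0x24)[0x7fe91a5f9fe4]
/usr/lib64/mysql/plugin/semisync_master.so(ReplSemiSyncMaster::commitTrx(char const*, unsigned long long)+0x19e)[0x7fe91a5fab3e]
However, the only place where commitTrx() calls is_tranx_end_pos() is the following assert():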
/* At this point, the binlog file and position of this transaction must have been removed from ActiveTranx. */
assert(thd_killed(NULL) || !active_tranxs_->is_tranx_end_pos(trx_wait_binlog_name, trx_wait_binlog_pos));
So is_tranx_end_pos() should never be called in a non-debug build, where assert() is compiled out and its argument is never evaluated?
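The reasoning about assert() can be checked with a small standalone sketch. The two macros below are hypothetical stand-ins for the two forms the standard assert() takes: with NDEBUG defined it expands to a no-op and never evaluates its argument, otherwise it evaluates the argument and aborts on failure. The stub function is likewise invented for the illustration and only counts how often it is called.

```cpp
#include <cassert>  // the real assert(), included for comparison and for callers
#include <cstdlib>

// Hypothetical stand-ins mirroring the two forms of the standard assert():
// the "release" form (NDEBUG defined) discards its argument unevaluated,
// the "debug" form evaluates it and aborts if it is false.
#define ASSERT_RELEASE(expr) ((void)0)
#define ASSERT_DEBUG(expr) ((expr) ? (void)0 : std::abort())

static int calls = 0;

// Hypothetical stub in place of ActiveTranx::is_tranx_end_pos();
// it only records that it was called.
static bool is_tranx_end_pos_stub() {
  ++calls;
  return false;
}

// Number of stub calls made by a compiled-out assertion.
int count_calls_release() {
  calls = 0;
  ASSERT_RELEASE(!is_tranx_end_pos_stub());
  return calls;
}

// Number of stub calls made by an active assertion.
int count_calls_debug() {
  calls = 0;
  ASSERT_DEBUG(!is_tranx_end_pos_stub());
  return calls;
}
```

Under these assumptions the release form makes no call to the stub and the debug form makes exactly one. That matches the argument above: a frame for is_tranx_end_pos() beneath commitTrx() suggests the crashing binary was built with assertions enabled.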