We're updating the issue view to help you get more done.Learn more

Deadlock in RESET MASTER

This failure is a race that is quite rare, but I can usually repeat it with
eg.

./mtr rpl.rpl_gtid_basic --repeat=100 --parallel=6

The problem is a deadlock between MYSQL_BIN_LOG::reset_logs() and
MYSQL_BIN_LOG::mark_xid_done(). The former takes LOCK_log and waits for the
latter to complete. But the latter also tries to take LOCK_log; this can lead
to a deadlock.

There is already code that tries to deal with this, with the flag
reset_master_pending. However, there was still a small opportunity for
deadlock, when an previous mark_xid_done() is still running when reset_logs()
is called and is at the precise point where it first releases LOCK_xid_list
and then re-aquires both LOCK_log and LOCK_xid_list.

Proposed solution: set reset_master_pending in reset_logs() before taking
LOCK_log. And also count how many invocations of LOCK_xid_list are in the
progress of releasing and re-aquiring locks, and in reset_logs() wait for that
number to drop to zero after setting reset_master_pending and before taking
LOCK_log.

Status

Assignee

Kristian Nielsen

Reporter

Kristian Nielsen