Re: Query "stop slave" hangs
Based on the data uploaded to FTP separately, I think it's a manifestation of the bug http://bugs.mysql.com/bug.php?id=45940 (part 4) and its duplicate http://bugs.mysql.com/bug.php?id=53985 which describe exactly the same problem, hanging STOP SLAVE.
The analysis in the latter bug says that it happens when the SQL thread is being stopped in the middle of a transaction while the IO thread has already exited, and relates to the situation when a mix of transactional and non-transactional engines is involved (so rolling back the started group is not safe).
In our case, we have all the same elements, just for different reasons.
According to the slave error log, on server start the IO thread exited immediately due to ER_MASTER_FATAL_ERROR_READING_BINLOG. The previous HW problem on the master can account for that.
The SQL thread started, but its position pointed at the beginning of an unfinished transaction (group). So, as the bugs above describe, it finished executing what it had and started waiting for the rest, which the IO thread of course could not provide. The error log does not even show any sign of the SQL thread attempting to exit when it presumably should have received the STOP command.
As for the mix of transactional and non-transactional engines, instead of that we have different table engines on master and slave. The transaction itself apparently consisted of only two DML statements (the first was written to the binlog, the second and the COMMIT weren't), so there was no mix. But the slave table is Aria, while the master table is most likely InnoDB (judging by the look of the binary log). So, since the binary log is transactional, the SQL thread treats the group as such, but when it applies the change to the Aria table it also raises the flag 'modified_non_transactional_table'.
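To make the combination of conditions easier to check, here is a toy model of the reasoning above in Python. It is only a sketch of my reading of the bug reports, not the server code: the class, the field names and the outcome strings are all made up for illustration, and the real logic in the slave SQL thread is more involved.

```python
from dataclasses import dataclass


@dataclass
class SqlThreadState:
    """Hypothetical snapshot of the slave SQL thread; not a server structure."""
    in_group: bool                  # position is inside an unfinished transaction (group)
    modified_non_trans_table: bool  # a non-transactional table (e.g. Aria) was changed in this group
    io_thread_running: bool         # IO thread can still fetch the rest of the group


def stop_slave_outcome(state: SqlThreadState) -> str:
    """Toy decision table for what STOP SLAVE leads to (assumed pre-5.5 behavior)."""
    if not state.in_group:
        # Nothing half-applied, so the thread can exit right away.
        return "stops immediately"
    if not state.modified_non_trans_table:
        # Only transactional changes so far, so the partial group can be rolled back.
        return "rolls back the group and stops"
    if state.io_thread_running:
        # Rollback is unsafe, but the rest of the group will eventually arrive.
        return "stops after the group completes"
    # Rollback is unsafe and nobody will deliver the rest of the group.
    return "hangs"


# Our case: unfinished group, Aria table modified on the slave, IO thread already dead.
print(stop_slave_outcome(SqlThreadState(True, True, False)))
```

In this model, our case is the only combination that ends in the last branch.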
I'm assigning it to Kristofer so he can confirm (or deny) this and, importantly, decide whether there is anything to be done about it in 5.2/5.3. The original bug was fixed in 5.5, according to the bug comments.