
[PATCH] Slave disconnects and fails to reconnect on Error_code: 1159

While replicating, the slave server randomly prints this error and disconnects from the master:

[ERROR] Slave I/O: The slave I/O thread stops because a fatal error is encountered when it try to get the value of SERVER_ID variable from master. Error: , Error_code: 1159
[Note] Slave I/O thread exiting, read up to log 'mysql-bin.xxxxxx', position xxxxxx

Error code 1159 is ER_NET_READ_INTERRUPTED: Got timeout reading communication packets

Executing STOP SLAVE; START SLAVE; on the slave server resumes replication without any problem. However, the slave server should reconnect automatically, and it does not.

I believe the issue is in mariadb-sources/sql/slave.cc

There is a function called is_network_error(), which checks whether a given error is network-related. It is missing a check for ER_NET_READ_INTERRUPTED. The patch is trivial:

--- sql/slave.cc	2013-07-17 09:51:31.000000000 -0500
+++ sql/slave.cc	2014-02-19 02:06:55.591593796 -0600
@@ -1215,6 +1215,7 @@ bool is_network_error(uint errorno)
       errorno == ER_CON_COUNT_ERROR ||
       errorno == ER_CONNECTION_KILLED ||
       errorno == ER_NEW_ABORTING_CONNECTION ||
+      errorno == ER_NET_READ_INTERRUPTED ||
       errorno == ER_SERVER_SHUTDOWN)
     return TRUE;

With this change, MariaDB will recognize the error as network-related and will try to reconnect automatically.
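For illustration, here is a minimal stand-alone sketch of the patched check. It is not the real sql/slave.cc: the function name is_network_error_sketch is hypothetical, it covers only the five codes visible in the diff context (the real function tests more codes above the hunk), and the numeric values other than 1159 are assumptions that should be checked against mysqld_error.h.

```cpp
// Sketch only: a stand-alone model of the patched check, not the code in sql/slave.cc.
// Assumption: numeric values mirror mysqld_error.h; only 1159 is confirmed by the log above.
enum : unsigned int {
  ER_CON_COUNT_ERROR         = 1040,
  ER_SERVER_SHUTDOWN         = 1053,
  ER_NET_READ_INTERRUPTED    = 1159,  // the code missing from the original check
  ER_NEW_ABORTING_CONNECTION = 1184,
  ER_CONNECTION_KILLED       = 1927
};

// Hypothetical name. Covers only the error codes shown in the diff context;
// the conditions above the hunk are not reproduced here.
constexpr bool is_network_error_sketch(unsigned int errorno)
{
  return errorno == ER_CON_COUNT_ERROR ||
         errorno == ER_CONNECTION_KILLED ||
         errorno == ER_NEW_ABORTING_CONNECTION ||
         errorno == ER_NET_READ_INTERRUPTED ||   // line added by the patch
         errorno == ER_SERVER_SHUTDOWN;
}
```

With the added line, the sketch returns true for 1159. Judging by the symptom reported above, that classification appears to be what decides whether the I/O thread retries the connection instead of exiting.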



Kristian Nielsen


Tomas Matejicek