MariaDB Server / MDEV-7111

Unable to detect network timeout in 10.x when using SSL (regression from 5.5)

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.0.14, 10.0
    • Fix Version/s: 10.0
    • Component/s: Replication, SSL
    • Labels:
    • Environment:
      Ubuntu Linux 14.04 x86_64

      Description

      Summary:
When simulating a loss of network connectivity between replicating servers, MariaDB 10.x does not detect the loss if the replication channel uses SSL, although 5.5 does. This happens regardless of the value of the "slave_net_timeout" variable.

Reproduced using the 10.0.14 GLIBC214 build and the 10.0.0 build, with both GTID (10.0.14) and binlog-based replication (10.0.14/10.0.0). Works as expected on 5.5.40 with binlog-based replication. I am testing with the binary .tar.gz MariaDB builds downloaded from the MariaDB servers (archive.mariadb.org).

      Steps to reproduce (functional case):

      • Set up MariaDB with two 5.5.40 servers in master-slave configuration and ensure replication is working and SSL-encrypted.
      • Start generating traffic on the master. Watch the slave status to see the traffic is being replicated successfully.
      • Simulate a network failure, e.g. "iptables -I INPUT -s <master_ip> -j DROP" on the slave. This drops all network packets from the master host.
      • Wait for slave_net_timeout seconds to pass. The slave will restart as documented, and the slave status will now state that it is attempting to reconnect to the master.

      Steps to reproduce (broken case):

      • Set up MariaDB with two 10.0.14 servers in master-slave configuration and ensure replication is working and SSL-encrypted.
      • Start generating traffic on the master. Watch the slave status to see the traffic is being replicated successfully.
      • Simulate a network failure, e.g. "iptables -I INPUT -s <master_ip> -j DROP" on the slave. This drops all network packets from the master host.
• Wait for slave_net_timeout seconds to pass. The slave status will continue to report "Waiting for master to send event", even though the log position counters are not advancing. The slave will remain in this state until it is stopped and restarted; it will not reconnect on its own, contrary to the documentation. This is a change in behavior from MariaDB 5.5 and appears to be incorrect (see the scripted check below).
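
      The check can be scripted on the slave as follows (a sketch only: <master_ip> is a placeholder, the sleep assumes a small slave_net_timeout such as the 6 seconds in the my.cnf quoted in the comments, and mysql is assumed to connect with sufficient privileges):

        # Simulate the network failure: drop all packets from the master host
        iptables -I INPUT -s <master_ip> -j DROP

        # Wait comfortably longer than slave_net_timeout
        sleep 30

        # Inspect the slave status: on 5.5.40, Slave_IO_State switches to a
        # reconnect message; on 10.0.14 with SSL it stays
        # "Waiting for master to send event" and Read_Master_Log_Pos stops moving
        mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_IO_State|Read_Master_Log_Pos'

        # Undo the simulated failure
        iptables -D INPUT -s <master_ip> -j DROP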

              Activity

              Paul Kreiner added a comment -

              I also tested this with 10.0.14 as the master, and 5.5.40 as the slave. This works as expected. 10.0.14 as the slave does not work, which seems to indicate that the slave code path is what changed. I also tested with 10.0.0 as the slave, and it also does not work.

              Here is the my.cnf on the slave:
              [client]
              port = 3306
              socket = /var/run/mysqld/mysqld.sock
              ssl-ca=/etc/mysql/ssl/ca-cert.pem
              [mysqld_safe]
              socket = /var/run/mysqld/mysqld.sock
              nice = 0
              [mysqld]
              user = mysql
              pid-file = /var/run/mysqld/mysqld.pid
              socket = /var/run/mysqld/mysqld.sock
              port = 3306
              basedir = /usr/local/mysql
              datadir = /usr/local/mysql/data
              tmpdir = /tmp
              skip-external-locking
              key_buffer_size = 8M
              myisam_sort_buffer_size = 2M
              aria_pagecache_buffer_size = 64M
              aria_sort_buffer_size = 32M
              max_allowed_packet = 16M ## Default: 1M
              max_connections = 3500 ## Default: 100
              max_connect_errors = 100000 ## Default: 10 Range: 1-4294967295
              table_cache = 200 ## Default: 32
              thread_stack = 256K
              thread_cache_size = 8
              query_cache_limit = 4M ## Default: 1M
              query_cache_size = 128M ## Default: 16M
              log_error = /var/log/mysql/error.log
              log_warnings = 0
              slow_query_log = 1
              slow_query_log_file = /var/log/mysql/mysql-slow.log
              long_query_time = 10
              server-id = 1 ## Should usually match gtid-domain-id.
              log_bin = mysql-logs/bin-log
              log-bin-index = mysql-logs/bin-log.index
              master-info-file = mysql-logs/master.info
              log-slave-updates
              expire_logs_days = 14
              max_binlog_size = 100M
              auto_increment_increment = 2
              auto_increment_offset = 1
              slave-net-timeout = 6
              slave_compressed_protocol = 1
              relay-log = mysql-logs/relay-log
              relay-log-index = mysql-logs/relay-log.index
              relay-log-info-file = mysql-logs/relay-log.info
              replicate-ignore-db = mysql
              default-storage-engine = InnoDB
              innodb_file_format = barracuda
              innodb_file_per_table
              innodb_log_group_home_dir = mysql-logs/log/
              innodb_data_file_path = mysql-logs/data/ibdata1:100M:autoextend
              innodb_flush_method = O_DIRECT
              large_pages
              innodb_buffer_pool_size = 7700M
              innodb_log_file_size = 125M
              innodb_log_files_in_group = 2
              innodb_log_buffer_size = 8M
              innodb_lock_wait_timeout = 50
              innodb_flush_log_at_trx_commit = 2
              innodb_thread_concurrency = 0
              innodb_io_capacity = 3700
              innodb_write_io_threads = 8
              innodb_read_io_threads = 8
              innodb_purge_threads = 1
              innodb_stats_method = nulls_ignored
              innodb_stats_sample_pages = 128
              ssl-ca=/etc/mysql/ssl/ca-cert.pem
              ssl-cert=/etc/mysql/ssl/server-cert.pem
              ssl-key=/etc/mysql/ssl/server-key.pem
              [mysqldump]
              quick
              quote-names
              max_allowed_packet = 16M
              [mysql]
              no-auto-rehash
              [isamchk]
              key_buffer = 32M
              sort_buffer_size = 32M
              read_buffer = 4M
              write_buffer = 4M

              Here is the replication setup command. Note that we are using SSL ("REQUIRE SSL" is set for the replicator user on the master). I'm not sure whether that affects this behavior or not.
              CHANGE MASTER TO master_host=<master>, master_port=3306, master_ssl=1, master_ssl_ca='/etc/mysql/ssl/ca-cert.pem', master_ssl_cert='/etc/mysql/ssl/server-cert.pem', master_ssl_key='/etc/mysql/ssl/server-key.pem', master_user='replicator', master_password=<hidden>;
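
              One way to confirm that the SSL settings took effect (a sketch; the fields are as reported by SHOW SLAVE STATUS):

              -- On the slave, after START SLAVE:
              SHOW SLAVE STATUS\G
              -- Expect Master_SSL_Allowed: Yes with the CA/cert/key paths above,
              -- and Slave_IO_Running: Yes to confirm the channel is up.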

              Paul Kreiner added a comment -

              This appears to affect the connection only when SSL is used. If I set up non-encrypted replication with 10.0.14 as the slave, the net timeout is detected correctly and the reconnection happens as expected (a non-SSL control setup is sketched below).
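
              Such a control setup is essentially the same CHANGE MASTER command with SSL disabled (a sketch; it assumes a replication user without REQUIRE SSL on the master):

              STOP SLAVE;
              CHANGE MASTER TO master_host=<master>, master_port=3306, master_ssl=0,
                  master_user='replicator', master_password=<hidden>;
              START SLAVE;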

              Elena Stepanova added a comment -

              Thanks for the report.

              It turns out we inherited this regression from MySQL 5.6, specifically 5.6.3, via this revision:

              revno: 3134
              revision-id: davi.arnaut@oracle.com-20110531135209-8kxz4np8c4gav6s2
              parent: jimmy.yang@oracle.com-20110531093059-3x1f93rnspltp3h6
              committer: Davi Arnaut <davi.arnaut@oracle.com>
              branch nick: 11762221-trunk
              timestamp: Tue 2011-05-31 10:52:09 -0300
              message:
                Bug#11762221 - 54790: Use of non-blocking mode for sockets limits performance
                Bug#11758972 - 51244: wait_timeout fails on OpenSolaris
                
                The problem was that a optimization for the case when the server
                uses alarms for timeouts could cause a slowdown when socket
                timeouts are used instead. In case alarms are used for timeouts,
                a non-blocking read is attempted first in order to avoid the
                cost of setting up a alarm and if this non-blocking read fails,
                the socket mode is changed to blocking and a alarm is armed.
                
                If socket timeout is used, there is no point in attempting a
                non-blocking read first as the timeout will be automatically
                armed by the OS. Yet the server would attempt a non-blocking
                read first and later switch the socket to blocking mode. This
                could inadvertently impact performance as switching the blocking
                mode of a socket requires at least two calls into the kernel
                on Linux, apart from problems inherited by the scalability
                of fcntl(2).
                
                The solution is to remove alarm based timeouts from the
                protocol layer and push timeout handling down to the virtual
                I/O layer. This approach allows the handling of socket timeouts
                on a platform-specific basis. The blocking mode of the socket
                is no longer exported and VIO read and write operations either
                complete or fail with a error or timeout.
                
                On Linux, the MSG_DONTWAIT flag is used to enable non-blocking
                send and receive operations. If the operation would block,
                poll() is used to wait for readiness or until a timeout occurs.
                This strategy avoids the need to set the socket timeout and
                blocking mode twice per query.
                
                On Windows, as before, the timeout is set on a per-socket
                fashion. In all remaining operating systems, the socket is set
                to non-blocking mode and poll() is used to wait for readiness
                or until a timeout occurs.
                
                In order to cleanup the code after the removal of alarm based
                timeouts, the low level packet reading loop is unrolled into
                two specific sequences: reading the packet header and the
                payload. This makes error handling easier down the road.
                
                In conclusion, benchmarks have shown that these changes do not
                introduce any performance hits and actually slightly improves
                the server throughput for higher numbers of threads.
                
                - Incompatible changes:
                
                A timeout is now always applied to a individual receive or
                send I/O operation. In contrast, a alarm based timeout was
                applied to an entire send or receive packet operation. That
                is, before this patch the timeout was really a time limit
                for sending or reading one packet.
                
                Building and running MySQL on POSIX systems now requires
                support for poll() and O_NONBLOCK. These should be available
                in any modern POSIX system. In other words, except for Windows,
                legacy (non-POSIX) systems which only support O_NDELAY and
                select() are no longer supported.
                
                On Windows, the default value for MYSQL_OPT_CONNECT_TIMEOUT
                is no longer 20 seconds. The default value now is no timeout
                (infinite), the same as in all other platforms.
                
                Packets bigger than the maximum allowed packet size are no
                longer skipped. Before this patch, if a application sent a
                packet bigger than the maximum allowed packet size, or if
                the server failed to allocate a buffer sufficiently large
                to hold the packet, the server would keep reading the packet
                until its end. Now the session is simply disconnected if the
                server cannot handle such large packets.
                
                The client socket buffer is no longer cleared (drained)
                before sending commands to the server. Before this patch,
                any data left in the socket buffer would be drained (removed)
                before a command was sent to the server, in order to work
                around bugs where the server would violate the protocol and
                send more data. The only check left is a debug-only assertion
                to ensure that the socket buffer is empty.
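
              For illustration, here is a minimal C sketch of the Linux strategy the commit message describes (a non-blocking recv() with MSG_DONTWAIT, falling back to poll() with a timeout). timed_read() is an illustrative name, not MariaDB's actual vio function:

              /* Sketch only, not MariaDB source: enforce a read timeout at the
               * socket layer without toggling the socket's blocking mode. */
              #include <errno.h>
              #include <poll.h>
              #include <sys/socket.h>
              #include <sys/types.h>

              /* Returns bytes read, 0 on EOF, -1 on error, -2 on timeout. */
              ssize_t timed_read(int fd, void *buf, size_t len, int timeout_ms)
              {
                  for (;;) {
                      ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);
                      if (n >= 0)
                          return n;              /* data, or 0 for orderly EOF */
                      if (errno != EAGAIN && errno != EWOULDBLOCK)
                          return -1;             /* hard error */

                      /* Would block: wait for readability, bounded by the
                       * timeout (this is where slave_net_timeout would bite). */
                      struct pollfd pfd = { .fd = fd, .events = POLLIN };
                      int r = poll(&pfd, 1, timeout_ms);
                      if (r == 0)
                          return -2;             /* timed out; caller may reconnect */
                      if (r < 0 && errno != EINTR)
                          return -1;
                      /* readable, or interrupted: retry recv() */
                  }
              }

              One plausible reading of the regression, not confirmed in this ticket, is that with master_ssl=1 the slave's reads go through the SSL layer rather than a timed primitive like this one, so the OS-level timeout never fires and a dead link is never noticed.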
              

              When a problem comes from upstream, we normally report it to them so that they are aware of it. If they decide to fix it, we merge the fix to avoid unnecessary divergence; if they decide not to fix it, we might do it on our own.
              Are you willing to report it at bugs.mysql.com? Otherwise I can do it on your behalf.

              Paul Kreiner added a comment -

              Thanks, Elena. I've created a bug report upstream. If you wish to chime in, you may do so here: http://bugs.mysql.com/74908

              Elena Stepanova added a comment -

              Thank you. I've subscribed to the report.


                People

                • Assignee: Unassigned
                • Reporter: Paul Kreiner
                • Votes: 0
                • Watchers: 4