  1. MariaDB Server
  2. MDEV-5691

Upgrade from 5.5.34 to 5.5.35 doesn't complete: wsrep connection timeout issue

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 5.5.35-galera
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Environment:
      Debian 7.

      Description

      Hi, I get an error when upgrading from the last stable release. Everything was working fine before the upgrade, so it's not firewall related. Here is my system:

      dpkg -l |grep maria
      ii  libmariadbclient18               5.5.35+maria-1~wheezy         amd64        MariaDB database client library
      ii  libmysqlclient18                 5.5.35+maria-1~wheezy         amd64        Virtual package to satisfy external depends
      ii  mariadb-client-5.5               5.5.35+maria-1~wheezy         amd64        MariaDB database client binaries
      ii  mariadb-client-core-5.5          5.5.35+maria-1~wheezy         amd64        MariaDB database core client binaries
      ii  mariadb-common                   5.5.35+maria-1~wheezy         all          MariaDB database common files (e.g. /etc/mysql/conf.d/mariadb.cnf)
      iF  mariadb-galera-server-5.5        5.5.35+maria-1~wheezy         amd64        MariaDB database server with Galera cluster binaries
      rc  mariadb-server-5.5               5.5.34+maria-1~wheezy         amd64        MariaDB database server binaries
      ii  mysql-common                     5.5.35+maria-1~wheezy         all          MariaDB database common files (e.g. /etc/mysql/my.cnf)
      

      and here is the error:

      Feb 17 11:50:57 mysql05 mysqld: 140217 11:50:57 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50212S), skipping check
      Feb 17 11:51:22 mysql05 /etc/init.d/mysql[2685]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in
      Feb 17 11:51:22 mysql05 /etc/init.d/mysql[2685]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed
      Feb 17 11:51:22 mysql05 /etc/init.d/mysql[2685]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)'
      Feb 17 11:51:22 mysql05 /etc/init.d/mysql[2685]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
      Feb 17 11:51:22 mysql05 /etc/init.d/mysql[2685]: 
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [Note] WSREP: view((empty))
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
      Feb 17 11:51:26 mysql05 mysqld: #011 at gcomm/src/pc.cpp:connect():141
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():196: Failed to open backend connection: -110 (Connection timed out)
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to open channel 'cluster' at 'gcomm://192.168.5.98': -110 (Connection timed out)
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [ERROR] WSREP: gcs connect failed: Connection timed out
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [ERROR] WSREP: wsrep::connect() failed: 7
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [ERROR] Aborting
      Feb 17 11:51:26 mysql05 mysqld: 
      Feb 17 11:51:26 mysql05 mysqld: 140217 11:51:26 [Note] WSREP: Service disconnected.
      Feb 17 11:51:27 mysql05 mysqld: 140217 11:51:27 [Note] WSREP: Some threads may fail to exit.
      Feb 17 11:51:27 mysql05 mysqld: 140217 11:51:27 [Note] /usr/sbin/mysqld: Shutdown complete
      Feb 17 11:51:27 mysql05 mysqld: 
      Feb 17 11:51:27 mysql05 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended
      

      And telnet:

      telnet 192.168.5.98 4567
      Trying 192.168.5.98...
      Connected to 192.168.5.98.
      Escape character is '^]'.
      $1C������u}d��A�Bq�� ^C
      Connection closed by foreign host.
      

      Downgrading back to 5.5.34 doesn't work either:

      apt-get install mariadb-galera-server-5.5=5.5.34+maria-1~wheezy mariadb-client-5.5=5.5.34+maria-1~wheezy libmysqlclient18=5.5.34+maria-1~wheezy mysql-common=5.5.34+maria-1~wheezy libmariadbclient18=5.5.34+maria-1~wheezy mariadb-client-core-5.5=5.5.34+maria-1~wheezy mariadb-common=5.5.34+maria-1~wheezy
      Reading package lists... Done
      Building dependency tree       
      Reading state information... Done
      E: Version '5.5.34+maria-1~wheezy' for 'mariadb-galera-server-5.5' was not found
      E: Version '5.5.34+maria-1~wheezy' for 'mariadb-client-5.5' was not found
      E: Version '5.5.34+maria-1~wheezy' for 'libmysqlclient18' was not found
      E: Version '5.5.34+maria-1~wheezy' for 'mysql-common' was not found
      E: Version '5.5.34+maria-1~wheezy' for 'libmariadbclient18' was not found
      E: Version '5.5.34+maria-1~wheezy' for 'mariadb-client-core-5.5' was not found
      E: Version '5.5.34+maria-1~wheezy' for 'mariadb-common' was not found
      

      Another issue from simply doing an upgrade. This is not good.

            Activity

            nirbhay_c Nirbhay Choubey added a comment -

            This might happen because of a mismatch in the checksum algorithms used on different nodes. 5.5.34 uses
            galera-2 (25.2.xx), which has socket.checksum=1, while 5.5.35 uses galera-3 (25.3.xx), where the default
            socket.checksum is 2. So, unless one has changed the default settings, there is a mismatch.

            Could you try appending "socket.checksum=1" to --wsrep-provider-options when starting the 5.5.35 node?

            pcrmk Paul Cormack added a comment -

            I added wsrep-provider-options = "socket.checksum=1" and am still seeing the same errors in syslog.

            nirbhay_c Nirbhay Choubey added a comment -

            Hi Paul,
            I assume you appended this option alongside the other existing ones, something like:
            wsrep_provider_options="socket.ssl_cert = /etc/mysql/ssl/cert.pem; socket.ssl_key = /etc/mysql/ssl/key.pem; socket.checksum=1"
            What happens when you try with Galera-2 (version 25.2.xx)?

            pcrmk Paul Cormack added a comment -

            Apologies, I had a typo on that appended line.

            Adding socket.checksum=1 did indeed allow the node to connect to 5.5.34 nodes.
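
            For readers applying the same workaround, the setting could look roughly like the fragment below. This is only a sketch: it assumes the option lives in the [mysqld] section of the server configuration (for example /etc/mysql/my.cnf on Debian) and that no other provider options are set. If other options already exist, socket.checksum=1 should be appended to the same quoted string, separated by a semicolon, as in the example quoted earlier in this thread.

            ```ini
            [mysqld]
            # Assumption: no other Galera provider options are configured on this node.
            # socket.checksum=1 matches the galera-2 (25.2.xx) setting used by the
            # 5.5.34 nodes, so the upgraded galera-3 node can join them.
            wsrep_provider_options="socket.checksum=1"
            ```

            All provider options go into a single wsrep_provider_options value, and a typo in that string was what initially kept the workaround from taking effect here.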

            stefane Stefan Eriksson added a comment -

            Nice, you seem to have found the issue. Incompatibilities between versions like this one should be listed in the changelog/announcement. I bet more or less everyone using Galera upgrades one node at a time, and if node 1 fails you're not so keen on upgrading the rest as well.


              People

              • Assignee:
                nirbhay_c Nirbhay Choubey
              • Reporter:
                stefane Stefan Eriksson
              • Votes:
                2
              • Watchers:
                4
