Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 5.5.29-galera
- Fix Version/s: 5.5.34-galera
- Component/s: None
- Labels:
- Environment: Ubuntu 12.04 (debs from http://ftp.heanet.ie/mirrors/mariadb/repo/5.5/ubuntu precise main)
Description
3-node Galera cluster using rsync SST. It runs fine for a few days, then does this in the middle of the night, with no load on the server, plenty of RAM, and no swapping:
May 1 01:54:38 site-db2 mysqld: 130501 1:54:38 [ERROR] mysqld got signal 11 ;
May 1 01:54:38 site-db2 mysqld: This could be because you hit a bug. It is also possible that this binary
May 1 01:54:38 site-db2 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
May 1 01:54:38 site-db2 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
May 1 01:54:38 site-db2 mysqld:
May 1 01:54:38 site-db2 mysqld: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
May 1 01:54:38 site-db2 mysqld:
May 1 01:54:38 site-db2 mysqld: We will try our best to scrape up some info that will hopefully help
May 1 01:54:38 site-db2 mysqld: diagnose the problem, but since we have already crashed,
May 1 01:54:38 site-db2 mysqld: something is definitely wrong and this may fail.
May 1 01:54:38 site-db2 mysqld:
May 1 01:54:38 site-db2 mysqld: Server version: 5.5.29-MariaDB-mariadb1~precise
May 1 01:54:38 site-db2 mysqld: key_buffer_size=268435456
May 1 01:54:38 site-db2 mysqld: read_buffer_size=131072
May 1 01:54:38 site-db2 mysqld: max_used_connections=6
May 1 01:54:38 site-db2 mysqld: max_threads=802
May 1 01:54:38 site-db2 mysqld: thread_count=3
May 1 01:54:38 site-db2 mysqld: It is possible that mysqld could use up to
May 1 01:54:38 site-db2 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2018916 K bytes of memory
May 1 01:54:38 site-db2 mysqld: Hope that's ok; if not, decrease some variables in the equation.
May 1 01:54:38 site-db2 mysqld:
May 1 01:54:38 site-db2 mysqld: Thread pointer: 0x0x0
May 1 01:54:38 site-db2 mysqld: Attempting backtrace. You can use the following information to find out
May 1 01:54:38 site-db2 mysqld: where mysqld died. If you see no messages after this, something went
May 1 01:54:38 site-db2 mysqld: terribly wrong...
May 1 01:54:38 site-db2 mysqld: stack_bottom = 0x0 thread_stack 0x40000
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4fd0682fb]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4fcc8daa1]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4fb501cb0]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4f856ec65]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4f856ee39]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4f8570438]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4f8627d2e]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4f862d8b7]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4f8633181]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4fb4f9e9a]
May 1 01:54:38 site-db2 mysqld: :0()[0x7fc4fac2accd]
May 1 01:54:38 site-db2 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
May 1 01:54:38 site-db2 mysqld: information that should help you find out what is causing the crash.
May 1 01:54:38 site-db2 mysqld_safe: Number of processes running now: 0
May 1 01:54:38 site-db2 mysqld_safe: WSREP: not restarting wsrep node automatically
May 1 01:54:38 site-db2 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended
Any help greatly appreciated!
Tim
Activity
Hi Tim,
As you can see yourself, there's not much here to look at so far, so we'll need some additional information.
Is it a production server? Does it have high load during the day?
Do all nodes crash, or is it always the same one?
Does the crashing node serve as a master, or is it a pure slave? In other words, does it receive queries from client connections (even if only a maintenance job), or only through replication?
If possible, please enable the general log (SET GLOBAL general_log=1;) on the crashing node(s) for a while, until the next crash, so that we can at least see what the node was doing if the crashing statement came from outside. You can check or configure the general log's location and name through the general_log_file variable.
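For example, a minimal sequence on the crashing node might look like this (the file path is only an illustration, adjust it to your layout):

SET GLOBAL log_output = 'FILE';
SET GLOBAL general_log_file = '/var/log/mysql/general.log';  -- example path
SET GLOBAL general_log = 1;

-- confirm the settings took effect:
SHOW GLOBAL VARIABLES LIKE 'general_log%';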
Please also preserve the binary logs of all nodes from around the time of the crash, so that we can later compare them with the general log.
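If it helps, you can check from the client which binary log files exist and cover the crash window; the file name below is just a placeholder for whatever SHOW BINARY LOGS returns on your nodes:

SHOW BINARY LOGS;
-- peek at the first events of one file (placeholder name):
SHOW BINLOG EVENTS IN 'mysql-bin.000042' LIMIT 10;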
When you get the crash again, please pack and upload the general log, the error log (or the relevant fragment of syslog, if that's what you're using), the binary logs, and my.cnf to ftp.askmonty.org/private (please include MDEV-4464 somewhere in the archive name so we can find it), and let us know through a comment here. Thanks!