Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.0.16
Fix Version/s: 10.0.17
Component/s: Replication
Labels:
Environment: NixOS on AWS
Description
I hope "Critical" is the right priority. It's a crash after all.
The crash only happens with one particular replication source (there are 8 in total). This is our production system, so it's quite a hassle to run a debug build, but let me know if the information here is not enough and I'll see what I can do.
There is no *.err file. Not in the database directory, not anywhere in the filesystem.
Here's the crash log:
Feb 10 03:53:13 nyancat mysqld[28898]: 150210 3:53:13 [ERROR] mysqld got signal 11 ;
Feb 10 03:53:13 nyancat mysqld[28898]: This could be because you hit a bug. It is also possible that this binary
Feb 10 03:53:13 nyancat mysqld[28898]: or one of the libraries it was linked against is corrupt, improperly built,
Feb 10 03:53:13 nyancat mysqld[28898]: or misconfigured. This error can also be caused by malfunctioning hardware.
Feb 10 03:53:13 nyancat mysqld[28898]: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
Feb 10 03:53:13 nyancat mysqld[28898]: We will try our best to scrape up some info that will hopefully help
Feb 10 03:53:13 nyancat mysqld[28898]: diagnose the problem, but since we have already crashed,
Feb 10 03:53:13 nyancat mysqld[28898]: something is definitely wrong and this may fail.
Feb 10 03:53:13 nyancat mysqld[28898]: Server version: 10.0.16-MariaDB-log
Feb 10 03:53:14 nyancat mysqld[28898]: key_buffer_size=134217728
Feb 10 03:53:14 nyancat mysqld[28898]: read_buffer_size=131072
Feb 10 03:53:14 nyancat mysqld[28898]: max_used_connections=12
Feb 10 03:53:14 nyancat mysqld[28898]: max_threads=1002
Feb 10 03:53:14 nyancat mysqld[28898]: thread_count=6
Feb 10 03:53:14 nyancat mysqld[28898]: It is possible that mysqld could use up to
Feb 10 03:53:14 nyancat mysqld[28898]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4384243 K bytes of memory
Feb 10 03:53:14 nyancat mysqld[28898]: Hope that's ok; if not, decrease some variables in the equation.
Feb 10 03:53:14 nyancat mysqld[28898]: Thread pointer: 0x0x7f5310001268
Feb 10 03:53:14 nyancat mysqld[28898]: Attempting backtrace. You can use the following information to find out
Feb 10 03:53:14 nyancat mysqld[28898]: where mysqld died. If you see no messages after this, something went
Feb 10 03:53:14 nyancat mysqld[28898]: terribly wrong...
Feb 10 03:53:14 nyancat mysqld[28898]: stack_bottom = 0x7f53600e6970 thread_stack 0x48000
Feb 10 03:53:14 nyancat mysqld[28898]: (my_addr_resolve failure: fork)
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gqs2rkr5p379577px4fhpm5q8dzs4s13-mariadb-10.0.16/bin/mysqld(my_print_stacktrace+0x29) [0xb75919]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gqs2rkr5p379577px4fhpm5q8dzs4s13-mariadb-10.0.16/bin/mysqld(handle_fatal_signal+0x398) [0x721868]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gpaawfk6qvb1wdj4vrwvdgp4yqfcqqgf-glibc-2.19/lib/libpthread.so.0(+0xf3a0) [0x7f593f0853a0]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gpaawfk6qvb1wdj4vrwvdgp4yqfcqqgf-glibc-2.19/lib/libpthread.so.0(pthread_cond_wait+0x8f) [0x7f593f081a8f]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gqs2rkr5p379577px4fhpm5q8dzs4s13-mariadb-10.0.16/bin/mysqld(rpl_parallel_thread_pool::get_thread(rpl_parallel_thread**, rpl_parallel_entry*)+0xfb) [0x6cdb9b]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gqs2rkr5p379577px4fhpm5q8dzs4s13-mariadb-10.0.16/bin/mysqld(rpl_parallel_entry::choose_thread(rpl_group_info*, bool*, PSI_stage_info_v1*, bool)+0x22d) [0x6cfb4d]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gqs2rkr5p379577px4fhpm5q8dzs4s13-mariadb-10.0.16/bin/mysqld(rpl_parallel::do_event(rpl_group_info*, Log_event*, unsigned long long)+0x15a) [0x6d081a]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gqs2rkr5p379577px4fhpm5q8dzs4s13-mariadb-10.0.16/bin/mysqld(handle_slave_sql+0x18ca) [0x55956a]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gpaawfk6qvb1wdj4vrwvdgp4yqfcqqgf-glibc-2.19/lib/libpthread.so.0(+0x6f8a) [0x7f593f07cf8a]
Feb 10 03:53:14 nyancat mysqld[28898]: /nix/store/gpaawfk6qvb1wdj4vrwvdgp4yqfcqqgf-glibc-2.19/lib/libc.so.6(clone+0x6d) [0x7f593d8f10bd]
Feb 10 03:53:14 nyancat mysqld[28898]: Trying to get some variables.
Feb 10 03:53:14 nyancat mysqld[28898]: Some pointers may be invalid and cause the dump to abort.
Feb 10 03:53:14 nyancat mysqld[28898]: Query (0x0): is an invalid pointer
Feb 10 03:53:14 nyancat mysqld[28898]: Connection ID (thread ID): 10
Feb 10 03:53:14 nyancat mysqld[28898]: Status: NOT_KILLED
Feb 10 03:53:14 nyancat mysqld[28898]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,
Feb 10 03:53:14 nyancat mysqld[28898]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Feb 10 03:53:14 nyancat mysqld[28898]: information that should help you find out what is causing the crash.
Feb 10 03:53:15 nyancat systemd[1]: mysql.service: main process exited, code=exited, status=1/FAILURE
Feb 10 03:53:15 nyancat systemd[1]: mysql.service: control process exited, code=exited status=1
Feb 10 03:53:15 nyancat systemd[1]: Unit mysql.service entered failed state.
Feb 10 03:53:15 nyancat systemd[1]: mysql.service failed.
Feb 10 03:53:15 nyancat mysqladmin[32455]: [116B blob data]
Feb 10 03:53:15 nyancat mysqladmin[32455]: error: 'Can't connect to local MySQL server through socket '/tmp/mysql.sock' (111 "Connection refused")'
Feb 10 03:53:15 nyancat mysqladmin[32455]: Check that mysqld is running and that the socket: '/tmp/mysql.sock' exists!
Activity
Hi,
Your error log apparently goes to syslog/messages or similar, which is where you copied the stack trace from. Can you copy and paste the whole mysqld output, from server startup until the crash?
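In case a plain *.err file is still wanted for the next occurrence, a log_error setting in the server section of my.cnf should redirect the error log from syslog to a file. This is just a sketch; the path below is a hypothetical example:

[mysqld]
# hypothetical path; any location writable by mysqld works
log_error = /var/lib/mysql/nyancat.err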
Can you describe your replication topology in more detail, and tell us which of its participants crashes? Does the crashing slave also get updated directly, or does it only receive updates from the master?
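For describing the topology, the output of SHOW ALL SLAVES STATUS on the crashing slave may be the easiest starting point, since it lists every configured connection with its master host and current positions. This is only a sketch; the connection name below is a placeholder:

-- run in the mysql client on the crashing slave; \G prints vertical output
SHOW ALL SLAVES STATUS\G
-- or, for a single named connection (placeholder name):
SHOW SLAVE 'source3' STATUS\G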
Please attach your cnf file(s) from the master and the slave involved in the crash.
When the server restarts after the crash and reconnects to the master, it reports the position from which it resumes replication. If you can provide the binary log covering this position and some time before it, that could also help.
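If extracting only the relevant part of the binlog is more convenient than uploading the whole file, a mysqlbinlog invocation along these lines should work; the file name and start position are placeholders to be replaced with the ones reported in the slave's restart message:

# placeholders: substitute the binlog file name and position from the restart message
mysqlbinlog --start-position=12345 mysql-bin.000123 > binlog_extract.txt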
Thanks.