Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-600

LP:982872 - Aria recovery crash, or assertion `!new_page' failure in _ma_apply_redo_insert_row_head_or_tail, or assertion `page_offset >= keypage_header && page_offset <= page_length' failure in _ma_apply_redo_index

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 5.2.14, 5.3.12, 5.5
    • Fix Version/s: 5.5
    • Component/s: Storage Engine - Aria
    • Labels:

      Description

      One of our servers that was recently upgraded to MariaDB 5.3 crashed at startup:

      Apr 16 10:12:04 server1 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
      Apr 16 10:12:04 server1 mysqld: 120416 10:12:04 [Warning] 'master-host' is deprecated and will be removed in a future release. Please use ''CHANGE MASTER'' instead.
      Apr 16 10:12:04 server1 mysqld: 120416 10:12:04 [Note] Plugin 'BLACKHOLE' is disabled.
      Apr 16 10:12:04 server1 mysqld: 120416 10:12:04 [Note] Plugin 'FEDERATED' is disabled.
      Apr 16 10:12:04 server1 mysqld: 120416 10:12:04 [Note] mysqld: Aria engine: starting recovery
      Apr 16 10:12:09 server1 mysqld: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
      Apr 16 10:12:09 server1 mysqld: This could be because you hit a bug. It is also possible that this binary
      Apr 16 10:12:09 server1 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
      Apr 16 10:12:09 server1 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
      Apr 16 10:12:09 server1 mysqld: 
      Apr 16 10:12:09 server1 mysqld: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
      Apr 16 10:12:09 server1 mysqld: 
      Apr 16 10:12:09 server1 mysqld: We will try our best to scrape up some info that will hopefully help
      Apr 16 10:12:09 server1 mysqld: diagnose the problem, but since we have already crashed, 
      Apr 16 10:12:09 server1 mysqld: something is definitely wrong and this may fail.
      Apr 16 10:12:09 server1 mysqld: 
      Apr 16 10:12:09 server1 mysqld: Server version: 5.3.5-MariaDB-ga-mariadb113~squeeze-log
      Apr 16 10:12:09 server1 mysqld: key_buffer_size=268435456
      Apr 16 10:12:09 server1 mysqld: read_buffer_size=131072
      Apr 16 10:12:09 server1 mysqld: max_used_connections=0
      Apr 16 10:12:09 server1 mysqld: max_threads=251
      Apr 16 10:12:09 server1 mysqld: thread_count=0
      Apr 16 10:12:09 server1 mysqld: connection_count=0
      Apr 16 10:12:09 server1 mysqld: It is possible that mysqld could use up to 
      Apr 16 10:12:09 server1 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 811573 K  bytes of memory
      Apr 16 10:12:09 server1 mysqld: Hope that's ok; if not, decrease some variables in the equation.
      Apr 16 10:12:09 server1 mysqld: 
      Apr 16 10:12:09 server1 mysqld: Thread pointer: 0x0
      Apr 16 10:12:09 server1 mysqld: Attempting backtrace. You can use the following information to find out
      Apr 16 10:12:09 server1 mysqld: where mysqld died. If you see no messages after this, something went
      Apr 16 10:12:09 server1 mysqld: terribly wrong...
      Apr 16 10:12:09 server1 mysqld: stack_bottom = 0 thread_stack 0x40000
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e) [0x7f5dc446cd4e]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x494) [0x7f5dc4197e14]
      Apr 16 10:12:09 server1 mysqld: /lib/libpthread.so.0(+0xeff0) [0x7f5dc25dcff0]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(bmove_upp+0x10) [0x7f5dc447cc00]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(_ma_apply_redo_index+0x5b9) [0x7f5dc42bf069]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x63fd9d) [0x7f5dc42ccd9d]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x63bde1) [0x7f5dc42c8de1]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x63f415) [0x7f5dc42cc415]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(maria_apply_log+0x1b4) [0x7f5dc42cdbf4]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(maria_recovery_from_log+0x4e) [0x7f5dc42ce5de]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x5f4b46) [0x7f5dc4281b46]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(ha_initialize_handlerton(st_plugin_int*)+0x38) [0x7f5dc418a1c8]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x58bbec) [0x7f5dc4218bec]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(plugin_init(int*, char**, int)+0x52d) [0x7f5dc421af0d]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x3e098a) [0x7f5dc406d98a]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(main+0x1ee) [0x7f5dc40706ae]
      Apr 16 10:12:09 server1 mysqld: /lib/libc.so.6(__libc_start_main+0xfd) [0x7f5dc1aecc8d]
      Apr 16 10:12:09 server1 mysqld: /usr/sbin/mysqld(+0x2fe669) [0x7f5dc3f8b669]
      Apr 16 10:12:09 server1 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      Apr 16 10:12:09 server1 mysqld: information that should help you find out what is causing the crash.
      Apr 16 10:12:09 server1 mysqld: Writing a core file
      Apr 16 10:12:09 server1 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended
      Apr 16 10:12:18 server1 /etc/init.d/mysql[3844]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in
      Apr 16 10:12:18 server1 /etc/init.d/mysql[3844]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed
      Apr 16 10:12:18 server1 /etc/init.d/mysql[3844]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)'
      Apr 16 10:12:18 server1 /etc/init.d/mysql[3844]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
      Apr 16 10:12:18 server1 /etc/init.d/mysql[3844]: 
      

      Removing /var/lib/mysql/aria* "fixed" this. I'll upload the aria* files to FTP so this can be analyzed.

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            Hi Dreas,

            I've tried to "recover" with 5.3.5 and your Aria logs on top of an empty database, and didn't get the crash so far. Could you please provide the schema (for starters, at least DDL, if not the data), or, if you already did it earlier in previous bug reports, point at one that we can use?

            Thanks.

            Show
            elenst Elena Stepanova added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; Hi Dreas, I've tried to "recover" with 5.3.5 and your Aria logs on top of an empty database, and didn't get the crash so far. Could you please provide the schema (for starters, at least DDL, if not the data), or, if you already did it earlier in previous bug reports, point at one that we can use? Thanks.
            Hide
            dreasvandonselaar Dreas van Donselaar added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            I uploaded the scheme (launchpad_982872_by_Dreas_scheme.sql.gz). Does that help?

            Show
            dreasvandonselaar Dreas van Donselaar added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; I uploaded the scheme (launchpad_982872_by_Dreas_scheme.sql.gz). Does that help?
            Hide
            elenst Elena Stepanova added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            Hi Dreas,

            The SQL you provided relies on a default engine for all tables. Do you have default engine Aria?

            When I load the scheme with default engine MyISAM and then initiate Aria recovery, it goes smoothly (not surprisingly as there is no single Aria table in the database). When I load it with default engine Aria and then attempt recovery, it fails, but in a different way – no crash, it just aborts right away on 0%, prints an error message, and the server shuts down.

            Show
            elenst Elena Stepanova added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; Hi Dreas, The SQL you provided relies on a default engine for all tables. Do you have default engine Aria? When I load the scheme with default engine MyISAM and then initiate Aria recovery, it goes smoothly (not surprisingly as there is no single Aria table in the database). When I load it with default engine Aria and then attempt recovery, it fails, but in a different way – no crash, it just aborts right away on 0%, prints an error message, and the server shuts down.
            Hide
            dreasvandonselaar Dreas van Donselaar added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            I've uploaded a complete database dump (including data) to FTP. Hopefully that allows to reproduce this.

            Show
            dreasvandonselaar Dreas van Donselaar added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; I've uploaded a complete database dump (including data) to FTP. Hopefully that allows to reproduce this.
            Hide
            elenst Elena Stepanova added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            Hi Dreas,

            I am getting the crash on the new data.
            I will try to create a test case or at least something of a reasonable size to reproduce it reliably, and will pass forward.

            Thank you.

            Show
            elenst Elena Stepanova added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; Hi Dreas, I am getting the crash on the new data. I will try to create a test case or at least something of a reasonable size to reproduce it reliably, and will pass forward. Thank you.
            Hide
            elenst Elena Stepanova added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            Reproducible on maria-5.2, 5.3, 5.5.

            I removed unnecessary tables, but that's about all simplification I was able to perform. The data and aria logs are still big.
            3 archive files are uploaded to FTP (in addition to what Dreas provided before):

            982872_datadir.tar.gz - already created and populated table and offending aria logs placed into the datadir. To reproduce the problem, decompress the archive and start MariaDB server on the extracted 'data' as a datadir (or run aria_read_log -a inside the datadir).

            982872_aria_logs.tar.gz - separately archived aria logs.
            982872_sql.tar.gz - the dump of the table (structure and data).
            To reproduce the problem using the dump, create a new empty datadir, start MariaDB server, load the dump, shutdown the server, place aria logs into the resulting datadir, then either start the server or run aria_read_log -a.

            On release server, it crashes as described above.
            On 5.2 debug server it aborts with the failing assertion:

            [Note] mysqld: Aria engine: starting recovery
            recovered pages: 0%mysqld: ma_blockrec.c:6355: _ma_apply_redo_insert_row_head_or_tail: Assertion `!new_page' failed.

            #6 0x00007fe752c3bd4d in _GI__assert_fail (assertion=0xdb8d72 "!new_page", file=<optimized out>, line=6355,
            function=<optimized out>) at assert.c:81
            #7 0x0000000000a0f72a in _ma_apply_redo_insert_row_head_or_tail (info=0x25f0708, lsn=42966603572, page_type=1,
            new_page=1 '\001', header=0x25e1cda "\236", <incomplete sequence \372>, data=0x25e1ce0 "", data_length=34)
            at ma_blockrec.c:6355
            #8 0x00000000009c9ba8 in exec_REDO_LOGREC_REDO_INSERT_ROW_HEAD (rec=0x7fff100855f0) at ma_recovery.c:1537
            #9 0x00000000009c7619 in display_and_apply_record (log_desc=0x1a662b8, rec=0x7fff100855f0) at ma_recovery.c:587
            #10 0x00000000009ccd30 in run_redo_phase (lsn=42966567214, lsn_end=0, apply=MARIA_LOG_APPLY) at ma_recovery.c:2695
            #11 0x00000000009c6ec6 in maria_apply_log (from_lsn=42966567214, end_lsn=0, apply=MARIA_LOG_APPLY, trace_file=0x25e1a00,
            should_run_undo_phase=1 '\001', skip_DDLs_arg=1 '\001', take_checkpoints=1 '\001', warnings_count=0x7fff10085b08)
            at ma_recovery.c:349
            #12 0x00000000009c6b44 in maria_recovery_from_log () at ma_recovery.c:239
            #13 0x0000000000956b2f in ha_maria_init (p=0x25c7718) at ha_maria.cc:3366
            #14 0x00000000007d2209 in ha_initialize_handlerton (plugin=0x25c6cf0) at handler.cc:432
            #15 0x00000000008a361a in plugin_initialize (plugin=0x25c6cf0) at sql_plugin.cc:1263
            #16 0x00000000008a3c88 in plugin_init (argc=0x1246a78, argv=0x25988e0, flags=0) at sql_plugin.cc:1451
            #17 0x0000000000680146 in init_server_components () at mysqld.cc:4089
            #18 0x0000000000680cb9 in main (argc=8, argv=0x7fff100863f8) at mysqld.cc:4560

            aria_read_log -a (debug version) also aborts with an assertion:

            aria_read_log: ma_key_recover.c:988: _ma_apply_redo_index: Assertion `page_offset >= keypage_header && page_offset <= page_length' failed.

            A release version of aria_read_log -a crashes with segmentation fault.

            Show
            elenst Elena Stepanova added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; Reproducible on maria-5.2, 5.3, 5.5. I removed unnecessary tables, but that's about all simplification I was able to perform. The data and aria logs are still big. 3 archive files are uploaded to FTP (in addition to what Dreas provided before): 982872_datadir.tar.gz - already created and populated table and offending aria logs placed into the datadir. To reproduce the problem, decompress the archive and start MariaDB server on the extracted 'data' as a datadir (or run aria_read_log -a inside the datadir). 982872_aria_logs.tar.gz - separately archived aria logs. 982872_sql.tar.gz - the dump of the table (structure and data). To reproduce the problem using the dump, create a new empty datadir, start MariaDB server, load the dump, shutdown the server, place aria logs into the resulting datadir, then either start the server or run aria_read_log -a. On release server, it crashes as described above. On 5.2 debug server it aborts with the failing assertion: [Note] mysqld: Aria engine: starting recovery recovered pages: 0%mysqld: ma_blockrec.c:6355: _ma_apply_redo_insert_row_head_or_tail: Assertion `!new_page' failed. #6 0x00007fe752c3bd4d in _ GI __assert_fail (assertion=0xdb8d72 "!new_page", file=<optimized out>, line=6355, function=<optimized out>) at assert.c:81 #7 0x0000000000a0f72a in _ma_apply_redo_insert_row_head_or_tail (info=0x25f0708, lsn=42966603572, page_type=1, new_page=1 '\001', header=0x25e1cda "\236", <incomplete sequence \372>, data=0x25e1ce0 "", data_length=34) at ma_blockrec.c:6355 #8 0x00000000009c9ba8 in exec_REDO_LOGREC_REDO_INSERT_ROW_HEAD (rec=0x7fff100855f0) at ma_recovery.c:1537 #9 0x00000000009c7619 in display_and_apply_record (log_desc=0x1a662b8, rec=0x7fff100855f0) at ma_recovery.c:587 #10 0x00000000009ccd30 in run_redo_phase (lsn=42966567214, lsn_end=0, apply=MARIA_LOG_APPLY) at ma_recovery.c:2695 #11 0x00000000009c6ec6 in maria_apply_log (from_lsn=42966567214, end_lsn=0, apply=MARIA_LOG_APPLY, trace_file=0x25e1a00, should_run_undo_phase=1 '\001', skip_DDLs_arg=1 '\001', take_checkpoints=1 '\001', warnings_count=0x7fff10085b08) at ma_recovery.c:349 #12 0x00000000009c6b44 in maria_recovery_from_log () at ma_recovery.c:239 #13 0x0000000000956b2f in ha_maria_init (p=0x25c7718) at ha_maria.cc:3366 #14 0x00000000007d2209 in ha_initialize_handlerton (plugin=0x25c6cf0) at handler.cc:432 #15 0x00000000008a361a in plugin_initialize (plugin=0x25c6cf0) at sql_plugin.cc:1263 #16 0x00000000008a3c88 in plugin_init (argc=0x1246a78, argv=0x25988e0, flags=0) at sql_plugin.cc:1451 #17 0x0000000000680146 in init_server_components () at mysqld.cc:4089 #18 0x0000000000680cb9 in main (argc=8, argv=0x7fff100863f8) at mysqld.cc:4560 aria_read_log -a (debug version) also aborts with an assertion: aria_read_log: ma_key_recover.c:988: _ma_apply_redo_index: Assertion `page_offset >= keypage_header && page_offset <= page_length' failed. A release version of aria_read_log -a crashes with segmentation fault.
            Hide
            elenst Elena Stepanova added a comment -

            Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ;
            Since the crash is reproducible on MariaDB 5.2.10, apparently it is not related to the upgrade, but either caused by a corrupted Aria log, or some tricky workflow which produced the unreadable log. Unfortunately, without the server logs it's hard to guess what kind of workflow it could be – if the contents of Aria logs gives any hints, please let me know, and I'll try to generate something.

            Show
            elenst Elena Stepanova added a comment - Re: recovered pages: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90%2d2d2d d:2d:2d [ERROR] mysqld got signal 11 ; Since the crash is reproducible on MariaDB 5.2.10, apparently it is not related to the upgrade, but either caused by a corrupted Aria log, or some tricky workflow which produced the unreadable log. Unfortunately, without the server logs it's hard to guess what kind of workflow it could be – if the contents of Aria logs gives any hints, please let me know, and I'll try to generate something.
            Hide
            ratzpo Rasmus Johansson added a comment -

            Launchpad bug id: 982872

            Show
            ratzpo Rasmus Johansson added a comment - Launchpad bug id: 982872

              People

              • Assignee:
                monty Michael Widenius
                Reporter:
                dreasvandonselaar Dreas van Donselaar
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated: