  1. MariaDB Server
  2. MDEV-6005

MariaDB 10.0.10 crashing within 10 minutes on CentOS 6.5 (with UDF from lib_mysqludf_preg)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 10.0.10
    • Fix Version/s: 10.0.11
    • Component/s: None
    • Labels:
    • Environment:
      - Dell R810 server
      - CentOS 6.5 (fully updated)
      - 2x Intel Xeon X7560 (2.26 GHz - 45 nm. Beckton) - 16 cores / 32 threads
      - 128 GB DDR3-1333 ECC
      - 5x Dell 146GB 15k rpm SAS (RAID 0+1 + hot spare) on Dell PERC H800

      Description

      We upgraded our production server to MariaDB 10.0.10 today after successful tests on our development server.

      On our production servers MariaDB crashes within 10 minutes. After crashing 7 times within 10 minutes we rolled back to 5.5. Since it's our production server, I can't do any more debugging.

      I didn't see any specific query or any other specific thing that caused the crashes.

      Please note that 5.5 runs rock solid on this server.

      The MySQL error log part for three crashes is attached.

      If you need more information, please let me know.

            Activity

            elenst Elena Stepanova added a comment (edited)

            Hi,

            Does your development server have lib_mysqludf_preg.so, and of the same version, and use its functions?
            Two of the three crashes that you attached happened in the library, and the third is likely to be related also.
            So, I've installed the library (current code from git), and I am also getting a number of crashes with it. I will dig more, but for now the question is whether you really need it – and if so, maybe your development server has a better one?

            koenc Koen Crijns added a comment (edited)

            Hi Elena,

            Our development server has the exact same plugin. I just found out that MariaDB has built-in PCRE functionality that does exactly what we need (and a LOT faster). I'll try again tonight to install 10.0.10 on the production server without the plugin.

            elenst Elena Stepanova added a comment (edited)

            With lib_mysqludf_preg from https://github.com/mysqludf/lib_mysqludf_preg.git, I'm getting different failures – crashes, OOM and valgrind errors – on a release build (not exactly the same stack as provided, but that might depend on the query); on the debug build, I'm mostly getting the one assertion failure shown below.
            All bad things only happen on a server built with jemalloc (WITH_JEMALLOC=yes; same was used for release bintar). If I force WITH_JEMALLOC=no, no problems so far.

            I'm not sure whether it's the problem of the udf, jemalloc, or MariaDB server.

            CREATE FUNCTION preg_capture RETURNS STRING SONAME 'lib_mysqludf_preg.so';
            select PREG_CAPTURE( '/(fox)/' , 'The brown fox' );
            
            <jemalloc>: extra/jemalloc/include/jemalloc/internal/arena.h:761: Failed assertion: "((uintptr_t)ptr - ((uintptr_t)run + (uintptr_t)bin_info->reg0_offset)) % bin_info->reg_interval == 0"
            
            #5  0x00007fb9d12966f0 in *__GI_abort () at abort.c:92
            #6  0x0000000000ebf7c5 in jemalloc_internal_arena_ptr_small_binind_get (ptr=0x7fb9a4c125f0, mapbits=4225) at /home/elenst/bzr/10.0/extra/jemalloc/include/jemalloc/internal/arena.h:759
            #7  0x0000000000ec040a in jemalloc_internal_arena_salloc (ptr=0x7fb9a4c125f0, demote=false) at /home/elenst/bzr/10.0/extra/jemalloc/include/jemalloc/internal/arena.h:990
            #8  0x0000000000eb7c6a in jemalloc_internal_isalloc (ptr=0x7fb9a4c125f0, demote=false) at include/jemalloc/internal/jemalloc_internal.h:865
            #9  0x0000000000ebc3e1 in free (ptr=0x7fb9a4c125f0) at /home/elenst/bzr/10.0/extra/jemalloc/src/jemalloc.c:1267
            #10 0x00007fb9d105d4a8 in pregMoveToReturnValues (initid=initid@entry=0x7fb9a4c13430, length=length@entry=0x7fb9d3150ec0, is_null=is_null@entry=0x7fb9d3150ecf "", error=error@entry=0x7fb9a4c13460 "", s=0x7fb9a4c125f0 "fox", s_len=<optimized out>) at preg.c:515
            #11 0x00007fb9d105e5c6 in preg_capture (initid=0x7fb9a4c13430, args=0x7fb9a4c133f0, result=0x7fb9d3151070 "", length=0x7fb9d3150ec0, is_null=0x7fb9d3150ecf "", error=0x7fb9a4c13460 "") at lib_mysqludf_preg_capture.c:284
            #12 0x00000000008cad57 in udf_handler::val_str (this=0x7fb9a4c133e0, str=0x7fb9d3151010, save_str=0x7fb9a4c13338) at /home/elenst/bzr/10.0/sql/item_func.cc:3719
            #13 0x00000000008cb80f in Item_func_udf_str::val_str (this=0x7fb9a4c13320, str=0x7fb9d3151010) at /home/elenst/bzr/10.0/sql/item_func.cc:3913
            #14 0x0000000000880329 in Item::send (this=0x7fb9a4c13320, protocol=0x7fb9b33bb5f8, buffer=0x7fb9d3151010) at /home/elenst/bzr/10.0/sql/item.cc:6595
            #15 0x00000000005cdd28 in Protocol::send_result_set_row (this=0x7fb9b33bb5f8, row_items=0x7fb9b33bf578) at /home/elenst/bzr/10.0/sql/protocol.cc:900
            #16 0x000000000063a60f in select_send::send_data (this=0x7fb9a4c13550, items=...) at /home/elenst/bzr/10.0/sql/sql_class.cc:2543
            #17 0x00000000006adac4 in JOIN::exec_inner (this=0x7fb9a4c13570) at /home/elenst/bzr/10.0/sql/sql_select.cc:2441
            #18 0x00000000006ad4e4 in JOIN::exec (this=0x7fb9a4c13570) at /home/elenst/bzr/10.0/sql/sql_select.cc:2355
            #19 0x00000000006b087b in mysql_select (thd=0x7fb9b33bb070, rref_pointer_array=0x7fb9b33bf6d8, tables=0x0, wild_num=0, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fb9a4c13550, unit=0x7fb9b33bed78, select_lex=0x7fb9b33bf460) at /home/elenst/bzr/10.0/sql/sql_select.cc:3293
            #20 0x00000000006a6f93 in handle_select (thd=0x7fb9b33bb070, lex=0x7fb9b33becb0, result=0x7fb9a4c13550, setup_tables_done_option=0) at /home/elenst/bzr/10.0/sql/sql_select.cc:372
            #21 0x000000000067bd51 in execute_sqlcom_select (thd=0x7fb9b33bb070, all_tables=0x0) at /home/elenst/bzr/10.0/sql/sql_parse.cc:5306
            #22 0x000000000067411c in mysql_execute_command (thd=0x7fb9b33bb070) at /home/elenst/bzr/10.0/sql/sql_parse.cc:2590
            #23 0x000000000067e4db in mysql_parse (thd=0x7fb9b33bb070, rawbuf=0x7fb9a4c13088 "select PREG_CAPTURE( '/(fox)/' , 'The brown fox' )", length=50, parser_state=0x7fb9d3152610) at /home/elenst/bzr/10.0/sql/sql_parse.cc:6452
            #24 0x0000000000671294 in dispatch_command (command=COM_QUERY, thd=0x7fb9b33bb070, packet=0x7fb9b36a0071 "", packet_length=50) at /home/elenst/bzr/10.0/sql/sql_parse.cc:1308
            #25 0x0000000000670636 in do_command (thd=0x7fb9b33bb070) at /home/elenst/bzr/10.0/sql/sql_parse.cc:1005
            #26 0x000000000078b46e in do_handle_one_connection (thd_arg=0x7fb9b33bb070) at /home/elenst/bzr/10.0/sql/sql_connect.cc:1379
            #27 0x000000000078b1c1 in handle_one_connection (arg=0x7fb9b33bb070) at /home/elenst/bzr/10.0/sql/sql_connect.cc:1293
            #28 0x0000000000a30f90 in pfs_spawn_thread (arg=0x7fb9b33e6170) at /home/elenst/bzr/10.0/storage/perfschema/pfs.cc:1853
            #29 0x00007fb9d2e30b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
            #30 0x00007fb9d133ba7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            

            No failures on 5.5, even with jemalloc.

            koenc Koen Crijns added a comment (edited)

            Update: I've disabled the lib_mysqludf_preg plugin and upgraded again from MariaDB 5.5.36 to MariaDB 10.0.10. So far running for 45 minutes without any crashes, so the problem seems to be solved for us.

            (Offtopic: the performance difference between PREG_REPLACE from the UDF plugin and REGEXP_REPLACE from MariaDB 10.0.10 is incredible, also on our production box!)

            serg Sergei Golubchik added a comment

            This is a bug in lib_mysqludf_preg.
            See the stack trace — everything is clear from there.
            The memory is freed using the free() function in pregMoveToReturnValues(). But the memory was allocated in preg_capture() by pcre_get_substring(). The manpage for the latter function says

            The memory in which the substring is placed is obtained by calling
            pcre_malloc(). The convenience function pcre_free_substring() can be
            used to free it when it is no longer needed.

            So, one must use pcre_free_substring() and not free() in this case.

            As for what really happens here: I suspect that the memory is allocated using the system malloc but passed to jemalloc for freeing, hence the crash. jemalloc has provided its own free function, but pcre was loaded before that, so pcre_malloc() still calls the real system malloc.

            A fix would be to use pcre_free_substring() as documented. Or set pcre_malloc() to use jemalloc. Or, better, both.
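
            To illustrate the pairing rule described above, here is a minimal, self-contained sketch in C. It does not use PCRE or the actual lib_mysqludf_preg code; lib_malloc, lib_free, lib_get_substring, lib_free_substring, capture_into and the counting allocator are all hypothetical stand-ins that only mimic the roles of pcre_malloc, pcre_free, pcre_get_substring and pcre_free_substring. The point it tries to show is that memory obtained through a library's allocator hook should be released through that library's matching hook, never through a bare free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for pcre_malloc / pcre_free: the "library"
   allocates through replaceable function pointers. */
void *(*lib_malloc)(size_t) = malloc;
void (*lib_free)(void *) = free;

/* A counting allocator, used here only to observe that every
   allocation is released through the matching hook. */
int live_allocations = 0;

void *counting_malloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL) live_allocations++;
    return p;
}

void counting_free(void *p) {
    if (p != NULL) {
        live_allocations--;
        free(p);
    }
}

/* Installs the counting allocator in both hooks, so allocation and
   release always go through the same allocator. */
void use_counting_allocator(void) {
    lib_malloc = counting_malloc;
    lib_free = counting_free;
}

/* Analogue of pcre_get_substring(): copies subject[start, end) into
   memory obtained from lib_malloc. Returns the length, or -1 on error. */
int lib_get_substring(const char *subject, size_t start, size_t end,
                      const char **out) {
    if (end < start || end > strlen(subject)) return -1;
    size_t len = end - start;
    char *buf = (char *)lib_malloc(len + 1);
    if (buf == NULL) return -1;
    memcpy(buf, subject + start, len);
    buf[len] = '\0';
    *out = buf;
    return (int)len;
}

/* Analogue of pcre_free_substring(): releases through lib_free. */
void lib_free_substring(const char *s) {
    lib_free((void *)s);
}

/* Copies a capture into the caller's buffer, then releases the
   temporary substring with the library's own deallocator. */
int capture_into(const char *subject, size_t start, size_t end,
                 char *result, size_t result_size) {
    const char *sub = NULL;
    int len = lib_get_substring(subject, start, end, &sub);
    if (len < 0) return -1;
    if ((size_t)len >= result_size) {
        lib_free_substring(sub);
        return -1;
    }
    memcpy(result, sub, (size_t)len + 1);
    lib_free_substring(sub); /* not free(sub) */
    return len;
}
```

            Under these assumptions, calling use_counting_allocator() and then capture_into("The brown fox", 10, 13, buf, sizeof buf) copies "fox" into buf and leaves live_allocations at 0. If capture_into called free(sub) directly, the counter would stay at 1, which is the bookkeeping analogue of handing a pointer to an allocator that never issued it.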


              People

              • Assignee:
                serg Sergei Golubchik
                Reporter:
                koenc Koen Crijns