Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5924

MariaDB could crash after changing the query_cache size with SET GLOBAL

    Details

    • Type: Bug
    • Status: Stalled
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.3.12, 5.5.36, 10.0.9
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Hi,

      I've just experienced a crash with mariadb 5.5.34 a few hours after changing the query cache size live.
      Initialy, my query cache size was set to 128M.
      I changed it several time to make some tests on a slave :
      8M, 1M, 512K, 0, then back to 16M

      The syntax used was : SET GLOBAL query_cache_size=8*1024*1024;

      2h after those changes Mariadb finally crashed in the query cache code :

      Server version: 5.5.34-MariaDB-1~squeeze-log
      key_buffer_size=3221225472
      read_buffer_size=4194304
      max_used_connections=206
      max_threads=402
      thread_count=39
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6446073 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
      
      Thread pointer: 0x0x7f9f5c22a6c0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f9f8d97fe58 thread_stack 0x48000
      (my_addr_resolve failure: fork)
      /usr/sbin/mysqld(my_print_stacktrace+0x2e) [0x7fbabbf6521e]
      /usr/sbin/mysqld(handle_fatal_signal+0x4ac) [0x7fbabbb6541c]
      /lib/libpthread.so.0(+0xeff0) [0x7fbabb25dff0]
      /usr/sbin/mysqld(Query_cache::register_all_tables(THD*, Query_cache_block*, TABLE_LIST*, unsigned int)+0x11f) [0x7fbabb9cd69f]
      /usr/sbin/mysqld(Query_cache::store_query(THD*, TABLE_LIST*)+0x42b) [0x7fbabb9cdbcb]
      /usr/sbin/mysqld(+0x379f8f) [0x7fbabba04f8f]
      /usr/sbin/mysqld(mysql_execute_command(THD*)+0x3427) [0x7fbabba0c0c7]
      /usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, Parser_state*)+0x29b) [0x7fbabba0ef4b]
      /usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x17f2) [0x7fbabba10802]
      /usr/sbin/mysqld(do_handle_one_connection(THD*)+0x419) [0x7fbabbad7219]
      /usr/sbin/mysqld(handle_one_connection+0x51) [0x7fbabbad72d1]
      /lib/libpthread.so.0(+0x68ca) [0x7fbabb2558ca]
      /lib/libc.so.6(clone+0x6d) [0x7fbab9bd4b6d]
      
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f9f74717898): is an invalid pointer
      Connection ID (thread ID): 1157940
      Status: NOT_KILLED
      
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off
      

      It seems the issue is quite difficult to reproduce, but hopefully with the stack you'll get some clues of what could have been corrupted.

      Thanks and regards,
      Jocelyn Fournier

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment - - edited

            Here is one possible way to get a crash by dynamic changing of query_cache_size. The assertion fails quite easily on debug builds, although I haven't got a crash on a release build, hence couldn't get the same stack trace as initially described; so it might be caused by either the same or a different problem.

            5.5/sql/sql_cache.cc:1692: int Query_cache::send_result_to_client(THD*, char*, uint): Assertion `query_cache_size != 0' failed.
            140324 10:30:53 [ERROR] mysqld got signal 6 ;
            
            #6  0x00007ffd09437621 in *__GI___assert_fail (assertion=0xd7b5b3 "query_cache_size != 0", file=<optimized out>, line=1692, function=0xd7dd20 "int Query_cache::send_result_to_client(THD*, char*, uint)") at assert.c:81
            #7  0x00000000005e94e9 in Query_cache::send_result_to_client (this=0x1585aa0, thd=0x7ffcef2a4060, org_sql=0x7ffce9027078 "SELECT   SQL_SMALL_RESULT table3 . `col_date_nokey` AS field1 , table2 . `col_int_key` AS field2 , table2 . `col_date_key` AS field3 , table1 . `col_date_nokey` AS field4 , table1 . `col_date_key` AS "..., query_length=1109) at /home/elenst/bzr/5.5/sql/sql_cache.cc:1692
            #8  0x000000000063aa0c in mysql_parse (thd=0x7ffcef2a4060, rawbuf=0x7ffce9027078 "SELECT   SQL_SMALL_RESULT table3 . `col_date_nokey` AS field1 , table2 . `col_int_key` AS field2 , table2 . `col_date_key` AS field3 , table1 . `col_date_nokey` AS field4 , table1 . `col_date_key` AS "..., length=1109, parser_state=0x7ffd0ac34680) at /home/elenst/bzr/5.5/sql/sql_parse.cc:5751
            #9  0x000000000062eba9 in dispatch_command (command=COM_QUERY, thd=0x7ffcef2a4060, packet=0x7ffcef2a9061 " SELECT   SQL_SMALL_RESULT table3 . `col_date_nokey` AS field1 , table2 . `col_int_key` AS field2 , table2 . `col_date_key` AS field3 , table1 . `col_date_nokey` AS field4 , table1 . `col_date_key` AS"..., packet_length=1111) at /home/elenst/bzr/5.5/sql/sql_parse.cc:1079
            #10 0x000000000062dd35 in do_command (thd=0x7ffcef2a4060) at /home/elenst/bzr/5.5/sql/sql_parse.cc:793
            #11 0x000000000072f132 in do_handle_one_connection (thd_arg=0x7ffcef2a4060) at /home/elenst/bzr/5.5/sql/sql_connect.cc:1266
            #12 0x000000000072ebf1 in handle_one_connection (arg=0x7ffcef2a4060) at /home/elenst/bzr/5.5/sql/sql_connect.cc:1181
            #13 0x00007ffd0a997b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
            #14 0x00007ffd094e6a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            

            Stack trace from:

            revision-id: sergii@pisem.net-20140318172907-ra8nysurfm3jip3h
            date: 2014-03-18 18:29:07 +0100
            build-date: 2014-03-24 11:51:12 +0400
            revno: 4114
            branch-nick: 5.5
            

            RQG redefining grammar (mdev5924.yy):

            thread1:
            	SET GLOBAL query_cache_size = qcache_size ;
            
            qcache_size:
            	8*1024*1024 | 1024*1024 | 512*1024 | 0 | 16*1024*1024 ;
            

            RQG command line:

            perl ./runall.pl --threads=2 --duration=400 --queries=100M --grammar=conf/optimizer/optimizer_access_exp.yy --redefine=mdev5924.yy --basedir=<your basedir>
            
            Show
            elenst Elena Stepanova added a comment - - edited Here is one possible way to get a crash by dynamic changing of query_cache_size. The assertion fails quite easily on debug builds, although I haven't got a crash on a release build, hence couldn't get the same stack trace as initially described; so it might be caused by either the same or a different problem. 5.5/sql/sql_cache.cc:1692: int Query_cache::send_result_to_client(THD*, char*, uint): Assertion `query_cache_size != 0' failed. 140324 10:30:53 [ERROR] mysqld got signal 6 ; #6 0x00007ffd09437621 in *__GI___assert_fail (assertion=0xd7b5b3 "query_cache_size != 0", file=<optimized out>, line=1692, function=0xd7dd20 "int Query_cache::send_result_to_client(THD*, char*, uint)") at assert.c:81 #7 0x00000000005e94e9 in Query_cache::send_result_to_client (this=0x1585aa0, thd=0x7ffcef2a4060, org_sql=0x7ffce9027078 "SELECT SQL_SMALL_RESULT table3 . `col_date_nokey` AS field1 , table2 . `col_int_key` AS field2 , table2 . `col_date_key` AS field3 , table1 . `col_date_nokey` AS field4 , table1 . `col_date_key` AS "..., query_length=1109) at /home/elenst/bzr/5.5/sql/sql_cache.cc:1692 #8 0x000000000063aa0c in mysql_parse (thd=0x7ffcef2a4060, rawbuf=0x7ffce9027078 "SELECT SQL_SMALL_RESULT table3 . `col_date_nokey` AS field1 , table2 . `col_int_key` AS field2 , table2 . `col_date_key` AS field3 , table1 . `col_date_nokey` AS field4 , table1 . `col_date_key` AS "..., length=1109, parser_state=0x7ffd0ac34680) at /home/elenst/bzr/5.5/sql/sql_parse.cc:5751 #9 0x000000000062eba9 in dispatch_command (command=COM_QUERY, thd=0x7ffcef2a4060, packet=0x7ffcef2a9061 " SELECT SQL_SMALL_RESULT table3 . `col_date_nokey` AS field1 , table2 . `col_int_key` AS field2 , table2 . `col_date_key` AS field3 , table1 . `col_date_nokey` AS field4 , table1 . `col_date_key` AS"..., packet_length=1111) at /home/elenst/bzr/5.5/sql/sql_parse.cc:1079 #10 0x000000000062dd35 in do_command (thd=0x7ffcef2a4060) at /home/elenst/bzr/5.5/sql/sql_parse.cc:793 #11 0x000000000072f132 in do_handle_one_connection (thd_arg=0x7ffcef2a4060) at /home/elenst/bzr/5.5/sql/sql_connect.cc:1266 #12 0x000000000072ebf1 in handle_one_connection (arg=0x7ffcef2a4060) at /home/elenst/bzr/5.5/sql/sql_connect.cc:1181 #13 0x00007ffd0a997b50 in start_thread (arg=<optimized out>) at pthread_create.c:304 #14 0x00007ffd094e6a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 Stack trace from: revision-id: sergii@pisem.net-20140318172907-ra8nysurfm3jip3h date: 2014-03-18 18:29:07 +0100 build-date: 2014-03-24 11:51:12 +0400 revno: 4114 branch-nick: 5.5 RQG redefining grammar (mdev5924.yy): thread1: SET GLOBAL query_cache_size = qcache_size ; qcache_size: 8*1024*1024 | 1024*1024 | 512*1024 | 0 | 16*1024*1024 ; RQG command line: perl ./runall.pl --threads=2 --duration=400 --queries=100M --grammar=conf/optimizer/optimizer_access_exp.yy --redefine=mdev5924.yy --basedir=<your basedir>
            Hide
            serg Sergei Golubchik added a comment -

            Elena Stepanova, I've fixed the assertion crash that you've reported in a comment. But it's not the original issue.

            jocelyn fournier got a crash in Query_cache::register_all_tables, to repeat that one needs a cacheable query that accesses tables.

            Your assertion crash looks completely innocent, it couldn't have caused problems in the optimized build (I believe).

            So, I've fixed the assertion, but I'm hesitant to close the bug report. I'm reassigning it back to you. You can try to repeat the original crash or close as "Can't repeat" — it's up to you now.

            Show
            serg Sergei Golubchik added a comment - Elena Stepanova , I've fixed the assertion crash that you've reported in a comment. But it's not the original issue. jocelyn fournier got a crash in Query_cache::register_all_tables , to repeat that one needs a cacheable query that accesses tables. Your assertion crash looks completely innocent, it couldn't have caused problems in the optimized build (I believe). So, I've fixed the assertion, but I'm hesitant to close the bug report. I'm reassigning it back to you. You can try to repeat the original crash or close as "Can't repeat" — it's up to you now.
            Hide
            elenst Elena Stepanova added a comment -

            I will try to reproduce it.

            Show
            elenst Elena Stepanova added a comment - I will try to reproduce it.

              People

              • Assignee:
                elenst Elena Stepanova
                Reporter:
                jocel1 jocelyn fournier
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 15 minutes
                  15m