Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.14
    • Fix Version/s: 10.0.15
    • Component/s: Optimizer
    • Labels:
      None
    • Environment:
      Redhat Linux 2.6.32-431.29.2.el6.x86_64
      8 vcpu with 16GB RAM on Xen Host

      Description

      Server version: 10.0.14-MariaDB-log
      key_buffer_size=44040192
      read_buffer_size=1048576
      max_used_connections=1850
      max_threads=503
      thread_count=203
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1671067 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
      
      Thread pointer: 0x0x7f737b4df008
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f7395c94c48 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb6b85b]
      /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x723f08]
      /lib64/libpthread.so.0(+0xf710)[0x7f7669c97710]
      /usr/sbin/mysqld[0x5f3588]
      /usr/sbin/mysqld[0x601364]
      /usr/sbin/mysqld[0x60198b]
      /usr/sbin/mysqld(_Z11choose_planP4JOINy+0xf5)[0x601ef5]
      /usr/sbin/mysqld[0x52a437]
      /usr/sbin/mysqld(_ZN4JOIN14optimize_innerEv+0x6de)[0x61d3de]
      /usr/sbin/mysqld(_ZN4JOIN8optimizeEv+0x28)[0x61f898]
      /usr/sbin/mysqld(_ZN18st_select_lex_unit8optimizeEv+0x1e6)[0x65eac6]
      /usr/sbin/mysqld(_ZN18st_select_lex_unit4execEv+0x82)[0x65ec32]
      /usr/sbin/mysqld(_Z11mysql_unionP3THDP3LEXP13select_resultP18st_select_lex_unitm+0x2e)[0x66057e]
      /usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x85)[0x623145]
      /usr/sbin/mysqld[0x5cac06]
      /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4c97)[0x5d58d7]
      /usr/sbin/mysqld[0x5d7402]
      /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1b20)[0x5d95c0]
      /usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xa7)[0x6ca037]
      /usr/sbin/mysqld[0x6f94fd]
      /usr/sbin/mysqld[0xa6137d]
      /lib64/libpthread.so.0(+0x79d1)[0x7f7669c8f9d1]
      /lib64/libc.so.6(clone+0x6d)[0x7f76683ac86d]
      
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f73b07f4020): is an invalid pointer
      Connection ID (thread ID): 132701
      Status: NOT_KILLED
      
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=on,mrr_cost_based=off,mrr_sort_keys=on,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on
      
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      141206 00:04:58 mysqld_safe Number of processes running now: 0
      141206 00:04:58 mysqld_safe mysqld restarted
      141206  0:05:00 [Warning] option 'table_cache': unsigned value 2097152 adjusted to 524288
      141206  0:05:00 [Warning] option 'table_definition_cache': unsigned value 2097152 adjusted to 524288
      141206  0:05:00 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      141206  0:05:00 [Note] InnoDB: The InnoDB memory heap is disabled
      141206  0:05:00 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      141206  0:05:00 [Note] InnoDB: Memory barrier is not used
      141206  0:05:00 [Note] InnoDB: Compressed tables use zlib 1.2.3
      141206  0:05:00 [Note] InnoDB: Using Linux native AIO
      141206  0:05:00 [Note] InnoDB: Using CPU crc32 instructions
      141206  0:05:00 [Note] InnoDB: Initializing buffer pool, size = 8.0G
      141206  0:05:01 [Note] InnoDB: Completed initialization of buffer pool
      141206  0:05:01 [Note] InnoDB: Highest supported file format is Barracuda.
      141206  0:05:01 [Note] InnoDB: Log scan progressed past the checkpoint lsn 484491799379
      141206  0:05:01 [Note] InnoDB: Database was not shutdown normally!
      141206  0:05:01 [Note] InnoDB: Starting crash recovery.
      141206  0:05:01 [Note] InnoDB: Reading tablespace information from the .ibd files...
      141206  0:05:02 [Note] InnoDB: Restoring possible half-written data pages
      141206  0:05:02 [Note] InnoDB: from the doublewrite buffer...
      InnoDB: Doing recovery: scanned up to log sequence number 484497041920
      InnoDB: Doing recovery: scanned up to log sequence number 484502284800
      InnoDB: Doing recovery: scanned up to log sequence number 484507527680
      InnoDB: Doing recovery: scanned up to log sequence number 484512770560
      InnoDB: Doing recovery: scanned up to log sequence number 484518013440
      InnoDB: Doing recovery: scanned up to log sequence number 484523256320
      InnoDB: Doing recovery: scanned up to log sequence number 484528499200
      InnoDB: Doing recovery: scanned up to log sequence number 484533742080
      InnoDB: Doing recovery: scanned up to log sequence number 484538984960
      InnoDB: Doing recovery: scanned up to log sequence number 484540297680
      InnoDB: Transaction 479796262 was in the XA prepared state.
      InnoDB: 1 transaction(s) which must be rolled back or cleaned up
      InnoDB: in total 0 row operations to undo
      InnoDB: Trx id counter is 479796736
      141206  0:05:03 [Note] InnoDB: Starting an apply batch of log records to the database...
      InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
      InnoDB: Apply batch completed
      InnoDB: In a MySQL replication slave the last master binlog file
      InnoDB: position 0 998653724, file name mysql-bin.000064
      InnoDB: Last MySQL binlog file position 0 666279708, file name ./mysql-bin.000004
      141206  0:05:05 [Note] InnoDB: 128 rollback segment(s) are active.
      InnoDB: Starting in background the rollback of uncommitted transactions
      2014-12-06 00:05:05 7fe33b7ff700  InnoDB: Rollback of non-prepared transactions completed
      141206  0:05:05 [Note] InnoDB: Waiting for purge to start
      141206  0:05:05 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.20-68.0 started; log sequence number 484540297680
      141206  0:05:05 [Note] Plugin 'FEEDBACK' is disabled.
      141206  0:05:06 [Note] CONNECT: Version 1.03.0003 Sep 25 2014 11:08:21
      141206  0:05:06 [Note] Recovering after a crash using mysql-bin
      141206  0:05:12 [Note] Starting crash recovery...
      2014-12-06 00:05:12 7fe5bfa70820  InnoDB: Starting recovery for XA transactions...
      2014-12-06 00:05:12 7fe5bfa70820  InnoDB: Transaction 479796262 in prepared state after recovery
      2014-12-06 00:05:12 7fe5bfa70820  InnoDB: Transaction contains changes to 1 rows
      2014-12-06 00:05:12 7fe5bfa70820  InnoDB: 1 transactions in prepared state after recovery
      141206  0:05:12 [Note] Found 1 prepared transaction(s) in InnoDB
      141206  0:05:12 [Note] Crash recovery finished.
      141206  0:05:12 [ERROR] Failed to create a socket for IPv6 '::': errno: 97.
      141206  0:05:12 [Note] Server socket created on IP: '0.0.0.0'.
      141206  0:05:12 [ERROR] Failed to create a socket for IPv6 '::': errno: 97.
      141206  0:05:12 [Note] Server socket created on IP: '0.0.0.0'.
      141206  0:05:12 [Warning] 'user' entry 'itroot@rhel-mysql-test1' ignored in --skip-name-resolve mode.
      141206  0:05:12 [ERROR] mysqld: Table './mysql/event' is marked as crashed and should be repaired
      141206  0:05:12 [Warning] Checking table:   './mysql/event'
      141206  0:05:12 [ERROR] mysql.event: 1 client is using or hasn't closed the table properly
      141206  0:05:12 [Note] Event Scheduler: Loaded 1 event
      141206  0:05:12 [Note] /usr/sbin/mysqld: ready for connections.
      Version: '10.0.14-MariaDB-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
      141206  0:05:34 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
      

      Couldn't restart with optimizer_switch parameters kept failing on that
      It then recycled in restarting the database until the optimizer_switch and optimizer_use_condition_selectivity parameters were removed from server.cnf

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

              Hide
              Meerkat63 Peter McLarty added a comment -

              Interesting event - the buffer pool shows all pages free from around midnight before the crash

              Show
              Meerkat63 Peter McLarty added a comment - Interesting event - the buffer pool shows all pages free from around midnight before the crash
              Hide
              Meerkat63 Peter McLarty added a comment -

              I have conducted some testing today to see if i can better isolate the cause of this problem
              I have a smaller test server to conducting recovery tests of databases. It has a copy of very recent production data.

              I was using Jmeter to run a series of real queries from our production database which don't always perform well and would likely benefit from histograms. I was only running 2 test threads in Jmeter

              Setting the optimizer_use_condition_selectivity to either 4 or 5 will crash this database, below is the optimizer switch settings from the crash entry in that log,

              Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

              Show
              Meerkat63 Peter McLarty added a comment - I have conducted some testing today to see if i can better isolate the cause of this problem I have a smaller test server to conducting recovery tests of databases. It has a copy of very recent production data. I was using Jmeter to run a series of real queries from our production database which don't always perform well and would likely benefit from histograms. I was only running 2 test threads in Jmeter Setting the optimizer_use_condition_selectivity to either 4 or 5 will crash this database, below is the optimizer switch settings from the crash entry in that log, Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
              Hide
              elenst Elena Stepanova added a comment -

              Hi Peter,

              Would you be able to provide the query and involved data structures (or better datadump from these tables, if possible)?
              If it includes sensitive information, you can upload it to our ftp.askmonty.org/private, so that only MariaDB developers will have access to it.

              Also, in the initial description you quoted the error log, but the quote ends 3 lines after 'signal 11'. Is it all yo uhad in the log? Wasn't there at least an attempt of printing a stack trace, followed by the connection id (and possibly the query) where the crash happened?

              Show
              elenst Elena Stepanova added a comment - Hi Peter, Would you be able to provide the query and involved data structures (or better datadump from these tables, if possible)? If it includes sensitive information, you can upload it to our ftp.askmonty.org/private, so that only MariaDB developers will have access to it. Also, in the initial description you quoted the error log, but the quote ends 3 lines after 'signal 11'. Is it all yo uhad in the log? Wasn't there at least an attempt of printing a stack trace, followed by the connection id (and possibly the query) where the crash happened?
              Hide
              Meerkat63 Peter McLarty added a comment -

              I have added the additional information to the description, missed it originally copying and pasting buffer contents didn't completely copy. Not sure what tables and query I am looking for? I will have to speak to my manager about what i can provide in the way of schema and data dump objects. Need to be aware of Australian privacy laws if the table contains sensitive data

              Show
              Meerkat63 Peter McLarty added a comment - I have added the additional information to the description, missed it originally copying and pasting buffer contents didn't completely copy. Not sure what tables and query I am looking for? I will have to speak to my manager about what i can provide in the way of schema and data dump objects. Need to be aware of Australian privacy laws if the table contains sensitive data
              Hide
              elenst Elena Stepanova added a comment -

              Hi Peter,

              Thank you.

              Even if you can't provide the data, please paste the table structures and of course the query. You can obfuscate the table and column names (call them col1, col2 etc. if you wish, just do it consistently through the tables and the query). I hope this way there will be no problem with the law.

              As I understand from your previous query, you are able to reproduce the crash. In this case, locating the exact query and involved tables is not difficult.
              Before the next time you reproduce it, please turn on general log (SET GLOBAL general_log=1 or put general_log to your cnf file). Optionally, also set general_log_file=<file location> to make sure it's written where you want.
              After the server crashes, you'll see in your error log the stack trace similar to the one that you pasted. Please note the following lines after the stack trace:

              Trying to get some variables.
              Some pointers may be invalid and cause the dump to abort.
              Query (0x7f73b07f4020): is an invalid pointer
              Connection ID (thread ID): 132701
              

              In the general log you created, you will see numerous records of the kind

                   123344 Query SELECT ...
                   133454 Query UPDATE ...
              

              etc.

              The first number is the connection ID, which corresponds the Connection ID from the error log above.
              Find the LAST query printed in the general log which has the same ID as the one printed in the error log. In all likelihood it will be the guilty query.
              Then just look at it to see which tables it uses, their structures along with the query is what we want to see.

              Don't forget to disable general log again, as it can affect performance.

              Thanks.

              Show
              elenst Elena Stepanova added a comment - Hi Peter, Thank you. Even if you can't provide the data, please paste the table structures and of course the query. You can obfuscate the table and column names (call them col1, col2 etc. if you wish, just do it consistently through the tables and the query). I hope this way there will be no problem with the law. As I understand from your previous query, you are able to reproduce the crash. In this case, locating the exact query and involved tables is not difficult. Before the next time you reproduce it, please turn on general log (SET GLOBAL general_log=1 or put general_log to your cnf file). Optionally, also set general_log_file=<file location> to make sure it's written where you want. After the server crashes, you'll see in your error log the stack trace similar to the one that you pasted. Please note the following lines after the stack trace: Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f73b07f4020): is an invalid pointer Connection ID (thread ID): 132701 In the general log you created, you will see numerous records of the kind 123344 Query SELECT ... 133454 Query UPDATE ... etc. The first number is the connection ID, which corresponds the Connection ID from the error log above. Find the LAST query printed in the general log which has the same ID as the one printed in the error log. In all likelihood it will be the guilty query. Then just look at it to see which tables it uses, their structures along with the query is what we want to see. Don't forget to disable general log again, as it can affect performance. Thanks.
              Hide
              Meerkat63 Peter McLarty added a comment -

              I have uploaded to ftp some information on this. I can now reproduce the crash. It seems the problem won't occur when there is no new data being added to the database as I test instance, restored copy would not break.
              I started to test against our devs main instance and as that routinely has test cases that generate data, and I can get the failure there.

              I have uploaded some gdb info, the schema of the tables which seem to have been involved and the query that was the last in the general.log at the time of the failure

              Show
              Meerkat63 Peter McLarty added a comment - I have uploaded to ftp some information on this. I can now reproduce the crash. It seems the problem won't occur when there is no new data being added to the database as I test instance, restored copy would not break. I started to test against our devs main instance and as that routinely has test cases that generate data, and I can get the failure there. I have uploaded some gdb info, the schema of the tables which seem to have been involved and the query that was the last in the general.log at the time of the failure
              Hide
              elenst Elena Stepanova added a comment -

              Thanks, I could now reproduce the crash on 10.0.14.
              It was fixed in 10.0.15 by the following revision:

              revno: 4420
              revision-id: psergey@askmonty.org-20140930171103-29ti2smqbo9wydj1
              parent: psergey@askmonty.org-20140926144840-cd94pd4bvdia8t19
              committer: Sergey Petrunya <psergey@askmonty.org>
              branch nick: 10.0-cp
              timestamp: Tue 2014-09-30 21:11:03 +0400
              message:
                MDEV-6808: MariaDB 10.0.13 crash with optimizer_use_condition_selectivity > 1
                
                Fix a trivial typo: in table_multi_eq_cond_selectivity(), reset the loop variable
                between loops.
              
              Show
              elenst Elena Stepanova added a comment - Thanks, I could now reproduce the crash on 10.0.14. It was fixed in 10.0.15 by the following revision: revno: 4420 revision-id: psergey@askmonty.org-20140930171103-29ti2smqbo9wydj1 parent: psergey@askmonty.org-20140926144840-cd94pd4bvdia8t19 committer: Sergey Petrunya <psergey@askmonty.org> branch nick: 10.0-cp timestamp: Tue 2014-09-30 21:11:03 +0400 message: MDEV-6808: MariaDB 10.0.13 crash with optimizer_use_condition_selectivity > 1 Fix a trivial typo: in table_multi_eq_cond_selectivity(), reset the loop variable between loops.

                People

                • Assignee:
                  Unassigned
                  Reporter:
                  Meerkat63 Peter McLarty
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  3 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: