MariaDB Server / MDEV-5167

Complex DELETE caused mariadb-galera-cluster node to abend with Signal 11

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 5.5.33a-galera
    • Fix Version/s: 10.0.17-galera
    • Component/s: wsrep
    • Labels:
      None
    • Environment:
      Debian Wheezy

      Description

      The following DELETE causes mariadb-galera-server to abend with signal 11 (and sometimes signal 6). The server crash logs follow the delete statement. Note that the DELETE statement works if broken into smaller pieces.

      DELETE FROM period WHERE process IN ('reconcile','statistics') AND (
        (item_period < '20130830' AND rec_id IN ('10002607', '300032001')) OR
        (item_period < '20130902' AND rec_id IN ('10002530', '10002598', '10003238', '290032001')) OR
        (item_period < '20130903' AND rec_id IN ('10000343', '10828288', '260032001')) OR
        (item_period < '20130905' AND rec_id IN ('10000854', '10002447', '10002472', '10002550', '10002561', '10003120', '100032001', '10003409', '10004172', '10004555', '10004858', '10004861', '10023903', '10032001', '10085903', '110032001', '110132001', '11579596', '11579769', '120032001', '130032001', '150032001', '160032001', '170032001', '190032001', '190132001', '20032001', '250032001', '320051001', '320080001', '320081001', '320082001', '40032001', '60032001', '70032001', '90032001')) OR
        (item_period < '20130905' AND item_period NOT IN ('20130801', '20130802', '20130805', '20130806', '20130807', '20130808', '20130809', '20130812', '20130813', '20130814', '20130815', '20130816', '20130819', '20130820', '20130821', '20130822', '20130823', '20130826', '20130827') AND
          rec_id IN ('11301288')) OR
        (item_period < '20130906' AND rec_id IN ('10001410')) OR
        (item_period < '20130926' AND rec_id IN ('10000346', '10000347', '10002716', '10003510')) OR
        (item_period < '20130930' AND rec_id IN ('10000374', '10000375', '10001810', '10001811', '10750288', '10836288', '10842288', '10966288', '11140288', '11141288', '11142288', '11144288', '11145288', '11146288', '11147288', '11148288', '11149288', '11150288', '11151288', '11152288', '11153288', '11169288', '11170288', '11231288', '11232288', '11237288', '11238288', '11239288', '11483288', '11484288', '11579609', '11579650', '11579700', '11579701', '11579739', '250032004', '250032005', '70032004', '70032005', '90032002', '90032004')) OR
        (item_period < '20131004' AND rec_id IN ('10001418', '10003519', '11579328')) OR
        (item_period < '20131007' AND rec_id IN ('11172288', '300032002')) OR
        (item_period < '20131008' AND rec_id IN ('10001839', '10002531', '10002600', '10003239', '10105903', '10523172', '10749288', '10784288', '10835288', '10837288', '11049288', '11426288', '11485288', '11486288', '11579302', '11579303', '11579387', '11579389', '11579604', '11579638', '11579740', '11579849', '290032002')) OR
        (item_period < '20131009' AND rec_id IN ('10000344', '10829288', '260032002')) OR
        (item_period < '20131010' AND rec_id IN ('10000855', '10002448', '10002474', '10002549', '10002560', '10003119', '10003408', '10004171', '10004556', '10004859', '10004862', '10022903', '10032002', '10084903', '110032002', '110132002', '11579625', '11579770', '120032002', '130032002', '150032002', '160032002', '170032002', '190032002', '190132002', '20032002', '250032002', '320051002', '320080002', '320081002', '320082002', '40032002', '60032002', '70032002', '90032005')) OR
        (item_period < '20131010' AND item_period NOT IN ('20130801', '20130802', '20130805', '20130806', '20130807', '20130808', '20130809', '20130812', '20130813', '20130814', '20130815', '20130816', '20130819', '20130820', '20130821', '20130822', '20130823', '20130826', '20130827') AND rec_id IN ('11300288'))
      )
      

      and the server logs...

      mysqld: 131021 21:53:14 [ERROR] mysqld got signal 11 ;
      mysqld: This could be because you hit a bug. It is also possible that this binary
      mysqld: or one of the libraries it was linked against is corrupt, improperly built,
      mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
      mysqld:
      mysqld: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
      mysqld:
      mysqld: We will try our best to scrape up some info that will hopefully help
      mysqld: diagnose the problem, but since we have already crashed,
      mysqld: something is definitely wrong and this may fail.
      mysqld:
      mysqld: Server version: 5.5.33a-MariaDB-1~wheezy-log
      mysqld: key_buffer_size=536870912
      mysqld: read_buffer_size=2097152
      mysqld: max_used_connections=501
      mysqld: max_threads=10002
      mysqld: thread_count=143
      mysqld: It is possible that mysqld could use up to
      mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 62159406 K  bytes of memory                                                  
      mysqld: Hope that's ok; if not, decrease some variables in the equation.                                                                                   
      mysqld:                                                                                                                                                    
      mysqld: Thread pointer: 0x0x7f0c92b45000                                                                                                                   
      mysqld: Attempting backtrace. You can use the following information to find out                                                                            
      mysqld: where mysqld died. If you see no messages after this, something went                                                                               
      mysqld: terribly wrong...                                                                                                                                  
      mysqld: stack_bottom = 0x7f27c7cccdb0 thread_stack 0x48000                                                                                                 
      mysqld: ??:0(??)[0x7f8bcd8a3e5b]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd4d58d2]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bccbb3030]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd5aef34]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd5b8720]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd5dfaa8]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd390e78]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd3943d7]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd3947cf]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd39652a]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd396c59]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd44994b]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcd4499f1]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bccbaab50]                                                                                                                           
      mysqld: ??:0(??)[0x7f8bcb4cea7d]
      mysqld:
      mysqld: Trying to get some variables.
      mysqld: Some pointers may be invalid and cause the dump to abort.
      mysqld: Query (0x7f00bddb2018): is an invalid pointer
      mysqld: Connection ID (thread ID): 1044736
      mysqld: Status: NOT_KILLED
      mysqld:
      mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engin
      mysqld:
      mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      mysqld: information that should help you find out what is causing the crash.
      

            Activity

            elenst Elena Stepanova added a comment -

            Could you please paste SHOW CREATE and SHOW INDEX for the table `period`?

            Thanks.

            mariadb@aquabolt.com Jeff Armstrong added a comment -

            MariaDB [sbld_testeda]> show create table period;
            CREATE TABLE `period` (
            `client_id` int(10) unsigned NOT NULL,
            `type` char(10) NOT NULL,
            `subtype` varchar(20) NOT NULL,
            `item_period` int(8) unsigned NOT NULL,
            `relation_id` int(10) unsigned NOT NULL,
            `keep_until` int(8) unsigned NOT NULL,
            `rec_id` int(10) unsigned NOT NULL,
            `process` varchar(30) NOT NULL,
            `started` datetime NOT NULL,
            `deleted_ind` char(1) NOT NULL,
            `updated` datetime NOT NULL,
            PRIMARY KEY (`client_id`,`type`,`item_period`,`relation_id`,`rec_id`,`process`,`subtype`),
            KEY `rec_id` (`rec_id`,`item_period`),
            KEY `process` (`process`,`item_period`,`type`,`subtype`),
            KEY `type` (`type`,`process`,`item_period`)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8 PACK_KEYS=1

            MariaDB [sbld_testeda]> show index in period;
            +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
            | Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
            +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
            | period |          0 | PRIMARY  |            1 | client_id   | A         |           8 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          0 | PRIMARY  |            2 | type        | A         |          48 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          0 | PRIMARY  |            3 | item_period | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          0 | PRIMARY  |            4 | relation_id | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          0 | PRIMARY  |            5 | rec_id      | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          0 | PRIMARY  |            6 | process     | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          0 | PRIMARY  |            7 | subtype     | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | rec_id   |            1 | rec_id      | A         |           2 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | rec_id   |            2 | item_period | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | process  |            1 | process     | A         |           5 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | process  |            2 | item_period | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | process  |            3 | type        | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | process  |            4 | subtype     | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | type     |            1 | type        | A         |          32 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | type     |            2 | process     | A         |          48 | NULL     | NULL   |      | BTREE      |         |               |
            | period |          1 | type     |            3 | item_period | A         |         293 | NULL     | NULL   |      | BTREE      |         |               |
            +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

            elenst Elena Stepanova added a comment -

            I tried to reproduce the problem on some artificial data, but no luck so far.
            Are you getting the crash in a single-node setup, or is it a cluster? Does it happen on the node where the DELETE is initially performed, or on some other nodes when it attempts to replicate it?

            mariadb@aquabolt.com Jeff Armstrong added a comment -

            The crash occurs in a two-node, one-garbd cluster. The crash seems to occur immediately on the node on which the SQL is executed - the other node continues to run. When the crashed node restarts, it performs IST/SST to recover, and the deletes have not occurred on any node.

            I will try and pin it down further for you - for example I will see if I can cause the crash with wsrep_on=off, but this may take me a few weeks as I will have to do this over the weekends.

            Regards
            Jeff

            nirbhay_c Nirbhay Choubey added a comment -

            Jeff Armstrong, did you notice this error on later versions?

            mariadb@aquabolt.com Jeff Armstrong added a comment -

            After running our trial for about 10 months, we decided that the Maria+Galera combo was not suitable for our specific requirement, so I no longer have a full-sized environment in which to repeat the test for you. The failure was consistent and repeatable, and testing showed that it was directly related to the number of WHERE clauses. We modified our SQL handler to split complex statements (i.e. >n clauses) into multiple statements, which fixed the issue in our application layer. Once we had made this change, we ran without a repeat of the issue for over three months. Regards, Jeff.
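
The statement-splitting workaround described in this comment could be sketched roughly as follows. This is only an illustration, not the reporter's actual handler, which is not shown in this thread. The function name `split_delete`, its parameters, and the threshold `max_branches` (standing in for the "n" mentioned above) are hypothetical. The sketch assumes the DELETE has the shape `common AND (branch1 OR branch2 OR ...)`, as in the statement from the description, and that the branches are trusted SQL fragments.

```python
from typing import List, Sequence


def split_delete(table: str, common: str, branches: Sequence[str],
                 max_branches: int) -> List[str]:
    """Build DELETE statements that each OR together at most max_branches branches.

    Hypothetical helper: `common` is the predicate shared by every statement,
    and each entry in `branches` is one OR'ed condition from the original WHERE.
    """
    if max_branches < 1:
        raise ValueError("max_branches must be at least 1")
    statements = []
    # Walk the branch list in fixed-size chunks; each chunk becomes one DELETE.
    for start in range(0, len(branches), max_branches):
        chunk = branches[start:start + max_branches]
        or_clause = " OR ".join(f"({branch})" for branch in chunk)
        statements.append(f"DELETE FROM {table} WHERE {common} AND ({or_clause})")
    return statements


if __name__ == "__main__":
    # Three branches taken from the DELETE in the description.
    example_branches = [
        "item_period < '20130830' AND rec_id IN ('10002607', '300032001')",
        "item_period < '20130903' AND rec_id IN ('10000343', '10828288', '260032001')",
        "item_period < '20130906' AND rec_id IN ('10001410')",
    ]
    for statement in split_delete(
        "period", "process IN ('reconcile','statistics')", example_branches, 2
    ):
        print(statement + ";")
```

Because a row matches the original WHERE clause exactly when it matches the common predicate and at least one branch, running the smaller statements one after another removes the same rows as the single large DELETE. The pieces are no longer one atomic statement, though, so they would need to run inside a single transaction if all-or-nothing behaviour matters.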

            nirbhay_c Nirbhay Choubey added a comment -

            Ok, thanks!

            nirbhay_c Nirbhay Choubey added a comment -

            BTW, do you have the error log by any chance? I would be interested in the log around signal 6.


              People

              • Assignee: nirbhay_c Nirbhay Choubey
              • Reporter: mariadb@aquabolt.com Jeff Armstrong
              • Votes: 0
              • Watchers: 4
