Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-5627

mariadb innodb indexes stops working in maria 5.5.29

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Environment:
      Linux , Maria DB 5.5.29 , Innodb Table, latin1 character Set

      Description

      I am experiencing weird behaviour from my MySQL/MariaDB InnoDB/XtraDB. Recently switched to MariaDB 5.5.29 recently The switch made the server to perform lot better but I still have this problem.

      One particular tables index seems to break every now and then. And after a while, it fixes it self or I have to run optimize table by myself. I am not able to figure out the exact reason why it happens and really tried of each bit I could to do a root cause analysis to fix the problem.

      The table schema is as below.

      Record counts : 51088827

        CREATE TABLE `commprices` (
         `brand_name` varchar(78) NOT NULL,
         `product_code` char(20) NOT NULL,
         `market_name` varchar(40) NOT NULL,
         `source_name` varchar(12) NOT NULL,
         `date` datetime NOT NULL,
         `frequency` varchar(2) NOT NULL,
         `average_price` decimal(20,4) DEFAULT NULL,
         `arrivals` decimal(20,4) DEFAULT NULL,
         `org_price` decimal(20,4) DEFAULT NULL,
         `mtime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE
        CURRENT_TIMESTAMP,
         PRIMARY KEY (`brand_name`,`product_code`,`market_name`,`source_name`,`date`,`frequency`),
         KEY `icommprices_pcode_dt_freq` (`product_code`,`source_name`,`date`,`frequency`),
         KEY `icommprices_pcode_dt_bname` (`brand_name`,`product_code`,`source_name`,`date`)
        ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      

      Actual Problem :
      =============

      We have a SQL query which I can share if needed. The database server at times ignores the indexes and does for a full table scan randomly. The no. of rows fetched should be 599 when using the index. 52798095 records are fetched while doing a table scan. The order of index also differs in both these cases. I have attached the explain of the query when it uses the indexes and when it goes for a full table scan.

      I am not able to generate this behavior at will thats why not able to find a solution for that. For time being we have addressed this issue by truncating the 80% of records. Kindly review this issue and let me know whats causing this strange behavior and what could be a possible solution for this.

      Thanks,
      Santhinesh

        Gliffy Diagrams

          Attachments

            Activity

            Hide
            elenst Elena Stepanova added a comment -

            Hi,

            Execution plan depends on statistics that the server currently has for tables. In case of InnoDB, statistics can be re-collected in several cases, e.g. if the table data gets updated considerably, or if you run ANALYZE/OPTIMIZE, or on SHOW TABLE STATUS.
            Since you are saying that the plan breaks and sometimes gets fixed on its own, most likely it is caused by the automatic statistics update.
            To make sure of that, and to exclude this factor, please run

            set global innodb_stats_auto_update = 0; /* only with XtraDB */
            set global innodb_stats_on_metadata = 0; /* both for XtraDB and InnoDB */
            

            As a result, statistics should only be updated when ANALYZE TABLE (or OPTIMIZE TABLE) is ran or during first open of the tables, so you should have more control over the execution plan.

            Please note that it might happen that without statistics auto update, you might observe a reverse problem with this or other queries: existing query plans can become non-optimal with time, since the optimizer will try to use outdated statistics. In this case you will need to run ANALYZE TABLE manually.

            Please let us know if it helps. If it does not and you keep having the problem, please also provide the query itself, and output of SHOW KEYS IN <tablename> when the plan is bad, and when it's good again after you ran ANALYZE.

            Show
            elenst Elena Stepanova added a comment - Hi, Execution plan depends on statistics that the server currently has for tables. In case of InnoDB, statistics can be re-collected in several cases, e.g. if the table data gets updated considerably, or if you run ANALYZE/ OPTIMIZE , or on SHOW TABLE STATUS . Since you are saying that the plan breaks and sometimes gets fixed on its own, most likely it is caused by the automatic statistics update. To make sure of that, and to exclude this factor, please run set global innodb_stats_auto_update = 0; /* only with XtraDB */ set global innodb_stats_on_metadata = 0; /* both for XtraDB and InnoDB */ As a result, statistics should only be updated when ANALYZE TABLE (or OPTIMIZE TABLE) is ran or during first open of the tables, so you should have more control over the execution plan. Please note that it might happen that without statistics auto update, you might observe a reverse problem with this or other queries: existing query plans can become non-optimal with time, since the optimizer will try to use outdated statistics. In this case you will need to run ANALYZE TABLE manually. Please let us know if it helps. If it does not and you keep having the problem, please also provide the query itself, and output of SHOW KEYS IN <tablename> when the plan is bad, and when it's good again after you ran ANALYZE .
            Hide
            santhineshkumar Santhinesh Nagendran added a comment -

            Thanks a Lot Elena for the reply. This would certainly help but still I am not able to reproduce this particular scenario by my self. This is very frequently used table with load data and massive selects. I will record the keys status going after. This issue is fixed for now at our end by purging almost 80% of data.

            So is it because of this innodb_stats_auto_update and innodb_stats_on_metadata variables my query is performing bad ? When rest of my other queries on other tables works perfectly fine I am wondering what would cause this particular table gets problem and we do a manual optimize table once in 6-7 days when the concerned query take more time than usual.

            Show
            santhineshkumar Santhinesh Nagendran added a comment - Thanks a Lot Elena for the reply. This would certainly help but still I am not able to reproduce this particular scenario by my self. This is very frequently used table with load data and massive selects. I will record the keys status going after. This issue is fixed for now at our end by purging almost 80% of data. So is it because of this innodb_stats_auto_update and innodb_stats_on_metadata variables my query is performing bad ? When rest of my other queries on other tables works perfectly fine I am wondering what would cause this particular table gets problem and we do a manual optimize table once in 6-7 days when the concerned query take more time than usual.
            Hide
            elenst Elena Stepanova added a comment -

            still I am not able to reproduce this particular scenario by my self.
            So is it because of this innodb_stats_auto_update and innodb_stats_on_metadata variables my query is performing bad ? When rest of my other queries on other tables works perfectly fine I am wondering what would cause this particular table gets problem and we do a manual optimize table once in 6-7 days when the concerned query take more time than usual.

            This is not particularly strange that you cannot reproduce the problem reliably. If the current working theory is correct, it is sporadic by its nature.
            The idea behind default values of innodb_stats_auto_update/innodb_stats_on_metadata is this (optimistic) scenario:

            • on first opening of a table, InnoDB collects statistics for the table which it further uses for generating execution plan;
            • if the table is updated frequently, it might happen that the chosen plan will become non-optimal; so, if the data is modified significantly, InnoDB will re-collect statistics, and possibly another execution plan will be chosen.

            However, in some cases it can work backwards.
            While collecting statistics, InnoDB does not inspect the whole table – it would be way too expensive – but only a small random portion of the data; and based on this data it makes approximations for the whole table. Random is important part here. So, if the data lacks uniformity, it might happen that even two subsequent statistics updates on the same table will cause quite different execution plans.
            That's what, as we suspect, is happening in your case. With some – rather low – probability InnoDB picks a portion of data that represents the table very badly, and the bad plan is generated based on that. When the statistics is recollected next time, another part of data is used, so the plan changes back (it either happens automatically, or when you run OPTIMIZE, which is in fact ANALYZE + rebuild).

            If the theory about your problem is right, disabling automatic statistics update should exclude spontaneous change in the plan. However, it might have a side-effect due to which the automatic recollection was invented in the first place: it might happen that for this or other queries, initially generated execution plan will become suboptimal with time. In this case you will need to run ANALYZE TABLE for involved tables.

            If this happens, we will come up with other possible solutions; what we need to know now is that, indeed, the source of the problem is in statistics updates.

            I understand that with the truncated data you won't be observing the issue for a while – even if it happens, the difference is probably not perceivable. Please let us know when it happens again, or when you are satisfied with your observations.

            Show
            elenst Elena Stepanova added a comment - still I am not able to reproduce this particular scenario by my self. So is it because of this innodb_stats_auto_update and innodb_stats_on_metadata variables my query is performing bad ? When rest of my other queries on other tables works perfectly fine I am wondering what would cause this particular table gets problem and we do a manual optimize table once in 6-7 days when the concerned query take more time than usual. This is not particularly strange that you cannot reproduce the problem reliably. If the current working theory is correct, it is sporadic by its nature. The idea behind default values of innodb_stats_auto_update/innodb_stats_on_metadata is this (optimistic) scenario: on first opening of a table, InnoDB collects statistics for the table which it further uses for generating execution plan; if the table is updated frequently, it might happen that the chosen plan will become non-optimal; so, if the data is modified significantly, InnoDB will re-collect statistics, and possibly another execution plan will be chosen. However, in some cases it can work backwards. While collecting statistics, InnoDB does not inspect the whole table – it would be way too expensive – but only a small random portion of the data; and based on this data it makes approximations for the whole table. Random is important part here. So, if the data lacks uniformity, it might happen that even two subsequent statistics updates on the same table will cause quite different execution plans. That's what, as we suspect, is happening in your case. With some – rather low – probability InnoDB picks a portion of data that represents the table very badly, and the bad plan is generated based on that. When the statistics is recollected next time, another part of data is used, so the plan changes back (it either happens automatically, or when you run OPTIMIZE, which is in fact ANALYZE + rebuild). If the theory about your problem is right, disabling automatic statistics update should exclude spontaneous change in the plan. However, it might have a side-effect due to which the automatic recollection was invented in the first place: it might happen that for this or other queries, initially generated execution plan will become suboptimal with time. In this case you will need to run ANALYZE TABLE for involved tables. If this happens, we will come up with other possible solutions; what we need to know now is that, indeed, the source of the problem is in statistics updates. I understand that with the truncated data you won't be observing the issue for a while – even if it happens, the difference is probably not perceivable. Please let us know when it happens again, or when you are satisfied with your observations.
            Hide
            elenst Elena Stepanova added a comment -

            Closing for now as lacking new information, please comment to re-open if needed.

            Show
            elenst Elena Stepanova added a comment - Closing for now as lacking new information, please comment to re-open if needed.

              People

              • Assignee:
                elenst Elena Stepanova
                Reporter:
                santhineshkumar Santhinesh Nagendran
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Due:
                  Created:
                  Updated:
                  Resolved: