MariaDB Server / MDEV-360

safe_mutex: Trying to destroy a mutex keycache->cache_lock that was locked

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.5.27
    • Component/s: None
    • Labels:
      None

      Description

      The RQG test crashes with an assertion in the safe_mutex code:

      http://buildbot.askmonty.org/buildbot/builders/rqg-perpush-bugfix-tests/builds/25/steps/rqg_bugfix_tests/logs/stdio

      The crash call stack points to the "repartition_key_cache" function:

      mysys/thr_mutex.c:608(safe_mutex_destroy)[0xc39f1d]
      psi/mysql_thread.h:597(inline_mysql_mutex_destroy)[0xc1071e]
      mysys/mf_keycache.c:1002(end_simple_key_cache)[0xc1066d]
      mysys/mf_keycache.c:5342(end_partitioned_key_cache)[0xc17814]
      mysys/mf_keycache.c:6109(end_key_cache_internal)[0xc184ae]
      mysys/mf_keycache.c:6476(repartition_key_cache_internal)[0xc188ea]
      mysys/mf_keycache.c:6527(repartition_key_cache)[0xc1898b]
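
The assertion in the title comes from a mutex wrapper that refuses to destroy a mutex still marked as locked. The Python sketch below illustrates only that idea. `SafeMutex`, `SafeMutexError`, and their methods are hypothetical names chosen for this example; they are not the code in mysys/thr_mutex.c, and the error text is modelled on the message quoted in this report.

```python
import threading


class SafeMutexError(RuntimeError):
    """Raised when the wrapper detects misuse (hypothetical name)."""


class SafeMutex:
    """Hypothetical debug wrapper that tracks whether the mutex is held."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()    # the real mutex
        self._guard = threading.Lock()   # protects the bookkeeping fields
        self._owner = None               # ident of the thread holding _lock
        self._destroyed = False

    def lock(self):
        with self._guard:
            if self._destroyed:
                raise SafeMutexError(f"Trying to lock destroyed mutex {self.name}")
        self._lock.acquire()
        with self._guard:
            self._owner = threading.get_ident()

    def unlock(self):
        with self._guard:
            if self._owner != threading.get_ident():
                raise SafeMutexError(
                    f"Trying to unlock mutex {self.name} not held by this thread"
                )
            self._owner = None
        self._lock.release()

    def destroy(self):
        # The check of interest: destroying a held mutex is reported as an error.
        with self._guard:
            if self._owner is not None:
                raise SafeMutexError(
                    f"Trying to destroy a mutex {self.name} that was locked"
                )
            self._destroyed = True


if __name__ == "__main__":
    m = SafeMutex("keycache->cache_lock")
    m.lock()
    try:
        m.destroy()          # rejected: the mutex is still held
    except SafeMutexError as err:
        print(err)
    m.unlock()
    m.destroy()              # accepted once the mutex has been released
```

The wrapper keeps its own bookkeeping under a separate guard lock, so the destroy-time check does not depend on the state of the mutex it is inspecting.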

            Activity

            elenst Elena Stepanova added a comment - - edited

            FYI, the RQG test in question was added as a regression test for LP:1008293. It runs the same 2 grammars that were provided in the bug report.

            igor Igor Babaev added a comment -

            Elena,
            It's not clear from the above where the test was added.
            The original fix for LP:1008293 was pushed into 5.2. I don't see any failures in 5.2.

            elenst Elena Stepanova added a comment - - edited

            Igor,
            The test was added to 5.2, 5.3 and 5.5. It passed on 5.2 and 5.3 after the fix was pushed/merged in the corresponding tree, but failed on 5.5 with the failure Wlad mentioned above (safe_mutex: Trying to destroy a mutex keycache->cache_lock) – it's different from the initial crash.

            Please note, however, that the new failure is sporadic. Unless you can guess its source just by looking at the stack trace, you'll probably want to assign it to me and wait until I come up with a test case for it (which might take time because, in my experience, race conditions around mutex destruction are not easy to catch).

            igor Igor Babaev added a comment -

            I would prefer to have a test case to start working on this bug.

            elenst Elena Stepanova added a comment -

            Igor,

            Please try the MTR test case below. It crashes on two of the three machines I tried (the third is a slow 32-bit box; I'm not sure whether it is the slowness or the 32-bit architecture that stops it from crashing).
            Please run the test with --repeat=100. (It usually fails for me within the first 10 repetitions.)

            # MTR test case

            CREATE TABLE t1 (a INT, b DATE, KEY(a), KEY(b)) ENGINE=MyISAM;
            INSERT INTO t1 VALUES (8, '2008-10-02');
            --send SET GLOBAL key_cache_segments = 1
            --connect (con8,127.0.0.1,root,,test)
            SET GLOBAL keycache1.key_buffer_size = 1024*1024;
            --send CACHE INDEX t1 IN keycache1
            --connection default
            --reap
            SET GLOBAL key_cache_segments = 7;
            --connection con8
            --reap

            # End of MTR test case
            # If it does not work, please try to use the following RQG grammar
            # (it's one of the grammars from lp:1008293).
            # cat 3.yy

            query_init:
            SET GLOBAL keycache1.key_buffer_size = 1024*1024;

            thread1:
            SET GLOBAL key_cache_segments = _digit;

            query:
            CACHE INDEX _table IN keycache1;

            # end of RQG grammar 3.yy
            # Run it as

            perl runall.pl \
            --no-mask \
            --queries=100M \
            --duration=300 \
            --threads=2 \
            --engine=MyISAM \
            --grammar=3.yy \
            --basedir=<your basedir> --vardir=<your vardir>

            # Or, on an already started server, as

            perl gentest.pl \
            --gendata= \
            --engine=MyISAM \
            --threads=2 \
            --queries=100M \
            --duration=300 \
            --grammar=3.yy \
            --dsn=dbi:mysql:host=127.0.0.1:port=19300:user=root:database=test

            (replace 19300 with your port).

            Again, normally it fails within seconds after start, but sometimes it does not.

            If neither of these works for you, please let me know.
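
The MTR test case above interleaves two connections: one changes key_cache_segments while the other runs CACHE INDEX. The Python sketch below shows, deterministically, the kind of interleaving in which teardown is reached while another thread still holds the lock. It rests on an assumption that this thread does not confirm: that the CACHE INDEX connection is the lock holder. All names (`run_interleaving`, `cache_index_worker`, `teardown_check`) are hypothetical and stand in for server code paths that are not reproduced here.

```python
import threading


def run_interleaving():
    """Force a 'teardown while locked' interleaving and record what is seen."""
    cache_lock = threading.Lock()        # stand-in for keycache->cache_lock
    holder_ready = threading.Event()     # set once the worker holds the lock
    may_release = threading.Event()      # set when the worker may let go
    seen_locked = []                     # what the teardown path observed

    def cache_index_worker():
        # Stand-in for the connection running CACHE INDEX (assumed holder).
        with cache_lock:
            holder_ready.set()
            may_release.wait(timeout=5)

    def teardown_check():
        # Stand-in for the point where the lock would be destroyed.
        seen_locked.append(cache_lock.locked())

    worker = threading.Thread(target=cache_index_worker)
    worker.start()
    holder_ready.wait(timeout=5)
    teardown_check()     # teardown reached while the worker still holds the lock
    may_release.set()
    worker.join()
    teardown_check()     # after the holder has finished, the lock is free
    return seen_locked


if __name__ == "__main__":
    print(run_interleaving())
```

In the real test the ordering is left to the scheduler, which would be consistent with the failure being sporadic and needing --repeat to show up; the events in this sketch exist only to make the ordering reproducible.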

            elenst Elena Stepanova added a comment -

            Steps to run the MTR test above:

            • copy the test case into t/t1.test
            • run
              perl ./mtr --repeat=100 t1
            igor Igor Babaev added a comment -

            The fix was applied to 5.2, then merged into 5.3 and 5.5.
            The problem has not been observed since.


              People

              • Assignee: igor Igor Babaev
              • Reporter: wlad Vladislav Vaintroub
              • Votes: 0
              • Watchers: 4
