MariaDB Server / MDEV-7174

perfschema.global_read_lock fails in buildbot

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.1
    • Fix Version/s: 10.1.3
    • Component/s: Tests
    • Labels:

      Description

      http://buildbot.askmonty.org/buildbot/builders/kvm-deb-precise-amd64/builds/3551/steps/test_4/logs/stdio

      perfschema.global_read_lock              w3 [ fail ]
              Test ended at 2014-11-23 03:31:21
      
      CURRENT_TEST: perfschema.global_read_lock
      --- /usr/share/mysql/mysql-test/suite/perfschema/r/global_read_lock.result	2014-11-22 23:33:32.000000000 +0200
      +++ /run/shm/var/3/log/global_read_lock.reject	2014-11-23 03:31:21.547093061 +0200
      @@ -18,13 +18,16 @@
       unlock tables;
       lock tables performance_schema.setup_instruments write;
       connection default;
      +Timeout in wait_condition.inc for select 1 from performance_schema.events_waits_current where event_name like "wait/synch/cond/sql/MDL_context::COND_wait_status"
      +Id	User	Host	db	Command	Time	State	Info	Progress
      +12470	root	localhost	performance_schema	Query	0	init	show full processlist	0.000
      +12471	pfsuser	localhost	test	Query	30	Waiting for global read lock	lock tables performance_schema.setup_instruments write	0.000
       select event_name,
       left(source, locate(":", source)) as short_source,
       timer_end, timer_wait, operation
       from performance_schema.events_waits_current
       where event_name like "wait/synch/cond/sql/MDL_context::COND_wait_status";
       event_name	short_source	timer_end	timer_wait	operation
      -wait/synch/cond/sql/MDL_context::COND_wait_status	mdl.cc:	NULL	NULL	timed_wait
       unlock tables;
       update performance_schema.setup_instruments set enabled='NO';
       update performance_schema.setup_instruments set enabled='YES';
      
      mysqltest: Result length mismatch
      

              Activity

              elenst Elena Stepanova added a comment -

              The sequence that triggers the failure is perl ./mtr --noreorder main.mdev-504 perfschema.global_read_lock

              elenst Elena Stepanova added a comment - - edited

              Here are the essential parts of the test, put together to reproduce the same failure:

              --let $trial = 10000
              
              while ($trial)
              {
                --connect (con3,localhost,root,,)
                --disconnect con3
                --dec $trial
              }
              
              --connection default
              
              use performance_schema;
              
              update performance_schema.setup_instruments set enabled='YES';
              
              connect (con1, localhost, root, , test);
              
              lock tables performance_schema.setup_instruments read;
              --disable_result_log
              select * from performance_schema.setup_instruments;
              --enable_result_log
              unlock tables;
              
              lock tables performance_schema.setup_instruments write;
              update performance_schema.setup_instruments set enabled='NO';
              update performance_schema.setup_instruments set enabled='YES';
              unlock tables;
              
              --echo connection default;
              connection default;
              
              flush tables with read lock;
              
              connection con1;
              
              --send
              lock tables performance_schema.setup_instruments write;
              
              connection default;
              
              let $wait_condition= select 1 from performance_schema.events_waits_current where event_name like "wait/synch/cond/sql/MDL_context::COND_wait_status";
              let $wait_timeout= 5;
              
              --source include/wait_condition.inc
              
              unlock tables;
              

              It passes on 10.0, but on 10.1 it causes a timeout in wait_condition.inc.
              That's because instead of the expected wait/synch/cond/sql/MDL_context::COND_wait_status we are now getting wait/synch/mutex/sql/MDL_wait::LOCK_wait_status, so the wait condition never matches.
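
              The timeout mechanism can be modelled outside the server. The following is only a minimal sketch in Python, not the real wait_condition.inc (which is a mysqltest include): the function names, the 0.1 s poll interval, and the simplified LIKE matching (case-sensitive, no escape handling) are assumptions made for illustration. It shows that a poll for the cond event name can never succeed while the only reported event is the mutex one.

```python
import re
import time


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a regex (% = any run, _ = any one char).

    Simplification: case-sensitive and without escape-character support.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wait_condition(fetch_event_names, pattern, timeout_s=5.0, poll_s=0.1,
                   sleep=time.sleep):
    """Poll until some current wait event matches `pattern`.

    Returns True on a match and False once the attempts are used up.
    `fetch_event_names` is a hypothetical stand-in for querying
    performance_schema.events_waits_current.
    """
    regex = like_to_regex(pattern)
    attempts = max(1, round(timeout_s / poll_s))
    for _ in range(attempts):
        if any(regex.fullmatch(name) for name in fetch_event_names()):
            return True
        sleep(poll_s)
    return False


EXPECTED = "wait/synch/cond/sql/MDL_context::COND_wait_status"
OBSERVED_10_1 = "wait/synch/mutex/sql/MDL_wait::LOCK_wait_status"


def no_sleep(_seconds):
    """Skip real sleeping so the demonstration finishes immediately."""


# 10.0 behaviour as described in the comment: the expected event is reported.
ok_10_0 = wait_condition(lambda: [EXPECTED], EXPECTED, sleep=no_sleep)
# 10.1 behaviour as described in the comment: only the mutex event is reported.
ok_10_1 = wait_condition(lambda: [OBSERVED_10_1], EXPECTED, sleep=no_sleep)
print("10.0 condition met:", ok_10_0)
print("10.1 condition met:", ok_10_1)
```

              In this model the second call uses up all of its attempts and returns False, which corresponds to the "Timeout in wait_condition.inc" line in the rejected result file.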

              For me, it starts happening on the 10.1 tree with revision 3f2d9a902ec93327515ae94ae0c8c0c2c485d15f. On the previous revision, f1afc003eefe0aafd3e070c7453d9e029d8445a8, there is no timeout.

              I am not sure whether it's expected or not, because my further attempts to investigate got lost in git magic. If I look at revision 3f2d9a902ec93327515ae94ae0c8c0c2c485d15f, it appears to be just a tiny change in an unrelated test case, which couldn't possibly make a difference.
              But if I do a git diff between f1afc003eefe0aafd3e070c7453d9e029d8445a8 and 3f2d9a902ec93327515ae94ae0c8c0c2c485d15f, I get a huge diff (around 400K lines).

              elenst Elena Stepanova added a comment -

              After resolving the git mystery with serg's help, I found another suspect for breaking the test:

              commit ab150128ce78fd363f6041277862686a61730b2b
              Merge: 9534fd8 20e20f6
              Author: Jan Lindström <jan.lindstrom@skysql.com>
              Date:   Wed Aug 27 13:15:37 2014 +0300
              
                  MDEV-6247: Merge 10.0-galera to 10.1.
                  
                      Merged lp:maria/maria-10.0-galera up to revision 3880.
                  
                      Added a new functions to handler API to forcefully abort_transaction,
                      producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
                      were added for future possiblity to add more storage engines that
                      could use galera replication.
              

                People

                • Assignee: serg Sergei Golubchik
                • Reporter: elenst Elena Stepanova

                    Time Tracking

                    • Estimated: Not Specified
                    • Remaining: 0m
                    • Logged: 3h