We're updating the issue view to help you get more done.Learn more

Incorrect barrier logic in mutex_exit path causes data corruption on AArch64

Hi,
We've run into an issue with MariaDB when running Sysbench "oltp.lua" test with 8 threads. The server daemon crashed mostly with an assertion failure at storage/xtradb/fil/fil0fil.c:5288:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 fil_node_complete_io( /*=================*/ fil_node_t* node, /*!< in: file node */ fil_system_t* system, /*!< in: tablespace memory cache */ ulint type) /*!< in: OS_FILE_WRITE or OS_FILE_READ; marks the node as modified if type == OS_FILE_WRITE */ { ut_ad(node); ut_ad(system); ut_ad(mutex_own(&(system->mutex))); ut_a(node->n_pending > 0); <-- failure point node->n_pending--;

An attached debugger gave the following backtrace:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 (gdb) bt full #0 0x0000007fb1d44d18 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 _sys_result = 0 pd = 0x7fa2fff1a0 pid = <optimised out> selftid = 5661 #1 0x0000007fb1d4818c in __GI_abort () at abort.c:89 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x7f00000000, sa_sigaction = 0x7f00000000}, sa_mask = {__val = {548445445976, 404, 1, 404, 0, 366924161824, 548195526816, 366921716804, 366927454208, 3, 0, 548434850424, 366927869208, 366936283408, 548195527344, 548434850424}}, sa_flags = 5288, sa_restorer = 0xa2fff1a0} sigs = {__val = {32, 0 <repeats 15 times>}} #2 0x000000556e3d1448 in fil_node_complete_io (node=<optimised out>, system=<optimised out>, type=<optimised out>) at /build/buildd/mariadb-5.5-5.5.36/storage/xtradb/fil/fil0fil.c:5288 No locals. #3 0x000000556e3db800 in fil_aio_wait (segment=segment@entry=3) at /build/buildd/mariadb-5.5-5.5.36/storage/xtradb/fil/fil0fil.c:5705 ret = <optimised out> fil_node = 0x7fb14a0e78 message = 0x7fa54d4350 type = 10 space_id = 0 #4 0x000000556e3592a4 in io_handler_thread (arg=<optimised out>) at /build/buildd/mariadb-5.5-5.5.36/storage/xtradb/srv/srv0start.c:486 segment = 3 #5 0x0000007fb220ae2c in start_thread (arg=0x7fa2fff1a0) at pthread_create.c:314 pd = 0x7fa2fff1a0 unwind_buf = {cancel_jmp_buf = {{jmp_buf = {548195529120, 548981639624, 548449456128, 0, 548449452032, 548195529312, 548195527344, 548443965168, 8388608, 548449472512, 548195527056, 13770210553321828185, 0, 13770210553602140361, 0, 0, 0, 0, 0, 0, 0, 0}, mask_was_saved = 0}}, priv = {pad = { 0x0, 0x0, 0x7fb220ad7c <start_thread>, 0x7fa2fff1a0}, data = { prev = 0x0, cleanup = 0x0, canceltype = -1306481284}}} not_first_call = 0 pagesize_m1 = <optimised out> sp = <optimised out> freesize = <optimised out> __PRETTY_FUNCTION__ = "start_thread" #6 0x0000007fb1dd9c40 in clone () at ../ports/sysdeps/unix/sysv/linux/aarch64/nptl/../clone.S:96 No locals.

Once the daemon crashed we've sometimes been unable to start it again without wiping out the database and re-installing it.

Having done some digging it is apparent that there is a problem in the mutex_exit code path; in particular at:
http://bazaar.launchpad.net/~maria-captains/maria/5.5/view/head:/storage/xtradb/include/sync0sync.ic#L106

A load-acquire is used to exit the mutex rather than a store-release. This leads to unpredictable results for architectures with a weak memory model.

We have the following in program order:

  1. mutex_enter -> load-acquire lock, loop until it is 0, then set to 1 relaxed

  2. protected work

  3. mutex_exit -> load-acquire lock, set it to 0 regardless.

However, the following sequence of events can be observed by another core:

  1. mutex_enter -> load-acquire lock, loop until it is 0, then set to 1 relaxed

  2. some of the protected work

  3. mutex_exit -> load-acquire lock, set it to 0 regardless.

  4. some more of the protected work (not protected).

The above can (and has for our test system) lead to severe data corruption; that prevents the daemon from even re-starting.

I've attached an emergency patch that re-introduces __ sync_lock_release to release the mutex. This fixes the crash and data corruption issues for me, but I understand from comments in the code that there were issues with this function in the past? Could the gcc intrinsics be moved over to the __ atomic_* functions? Ideally:

To acquire the lock:

1 __atomic_exchange_n(ptr, (byte) new_val, __ATOMIC_ACQUIRE)

To release the lock:

1 __atomic_store_n(ptr, (byte) new_val, __ATOMIC_RELEASE)

(which also worked on my test system).

I believe this issue may affect other versions of MariaDB, but I've only tested 5.5.36.

Cheers,

Steve Capper

Status

Assignee

Sergey Vojtovich

Reporter

Steve Capper

Fix versions

Affects versions

5.5.36

Priority

Critical