Description
There was an attempt to fix this in MDEV-7448, but it didn't work.
When the bb slave loses its connection to the master, it seems to send SIGKILL to the process group:
2015-03-02 19:03:15-0500 [Broker,client] lost remote
2015-03-02 19:03:15-0500 [Broker,client] lost remote step
2015-03-02 19:03:15-0500 [Broker,client] stopCommand: halting current command <buildslave.commands.shell.SlaveShellCommand instance at 0x10004609998>
2015-03-02 19:03:15-0500 [Broker,client] command interrupted, attempting to kill
2015-03-02 19:03:15-0500 [Broker,client] trying to kill process group 17399
2015-03-02 19:03:15-0500 [Broker,client] signal 9 sent successfully
2015-03-02 19:03:15-0500 [Broker,client] Lost connection to buildbot.askmonty.org:9989
mysqld is most probably detached from that process group: at least mysql-test/lib/My/SafeProcess/safe_process.cc calls setpgid(). As a result, the signal sent to the group won't reach mysqld.
This means that the intermediate processes would have to catch SIGKILL and forward it to mysqld. But SIGKILL cannot be caught.
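The point about SIGKILL can be checked directly. The snippet below is a minimal sketch in Python (not code from mtr or buildbot): the hypothetical helper can_install_handler() tries to register a no-op handler for a signal and reports whether the OS accepted it. It assumes a POSIX system and that it is called from the main thread.

```python
import signal


def can_install_handler(signum):
    """Return True if the OS lets this process install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda s, f: None)
    except OSError:
        # The kernel refuses handlers for SIGKILL (and SIGSTOP).
        return False
    # Restore whatever was installed before so the check has no side effects.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True
```

On such a system the helper should accept SIGTERM, SIGINT and SIGHUP and reject SIGKILL, which is why an intermediate process has no chance to forward a SIGKILL to a detached child.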
I can think of 3 ways of fixing this problem:
- let bb use a different signal when killing the process group (SIGINT, SIGHUP)
- don't detach mysqld from the bb process group
- somehow let bb record and kill the mysqld process
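As an illustration of the third option, here is a rough sketch of "record and kill" using a pid file. It is only an assumption about how such bookkeeping could look, not how bb or mtr work: spawn_recorded(), kill_recorded() and the pid file are hypothetical names, and the child is detached with setpgid(0, 0) to mimic what safe_process does.

```python
import os
import signal
import subprocess


def spawn_recorded(argv, pidfile):
    """Start argv in its own process group and append its pid to pidfile."""
    child = subprocess.Popen(argv, preexec_fn=lambda: os.setpgid(0, 0))
    with open(pidfile, "a") as f:
        f.write("%d\n" % child.pid)
    return child


def kill_recorded(pidfile, signum=signal.SIGKILL):
    """Signal every recorded process group; return the pids that were signalled."""
    signalled = []
    with open(pidfile) as f:
        for line in f:
            pid = int(line)
            try:
                # The child is the leader of its own group, so pgid == pid.
                os.killpg(pid, signum)
                signalled.append(pid)
            except ProcessLookupError:
                pass  # already gone
    return signalled
```

A cleanup step could then call kill_recorded() on the file after the connection is lost. The sketch ignores pid reuse: a recorded pid may belong to an unrelated process by the time the file is read, so a real implementation would need an extra check.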
Issue Links
- relates to MDEV-7448 mtr may leave stale mysqld (Closed)
Just for the record: there seems to be a bb slave interruptSignal option. It should likely be possible to specify it on a per-step basis on the master.
OTOH we could probably use SIGTERM: it should be fine for the build step too.
There is no stale mysqld after "killall -g -s SIGTERM mtr", but there is one if I use SIGKILL.
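The difference between the two signals can be reproduced in miniature. The sketch below is not mtr: WRAPPER is a hypothetical stand-in for an intermediate process that starts a detached child (via setpgid(0, 0)) and forwards SIGTERM to it, and a sleeping Python process stands in for mysqld. child_survives() signals the wrapper's process group, as bb does, and reports whether the detached child is still around afterwards.

```python
import os
import signal
import subprocess
import sys

# Hypothetical intermediate process: detaches its child, forwards SIGTERM.
WRAPPER = r"""
import os, signal, subprocess, sys
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"],
                         preexec_fn=lambda: os.setpgid(0, 0))
signal.signal(signal.SIGTERM, lambda s, f: os.kill(child.pid, s))
print(child.pid, flush=True)
child.wait()
"""


def child_survives(signum):
    """Signal the wrapper's process group; return True if the child outlives it."""
    wrapper = subprocess.Popen([sys.executable, "-c", WRAPPER],
                               stdout=subprocess.PIPE, text=True,
                               preexec_fn=lambda: os.setpgid(0, 0))
    child_pid = int(wrapper.stdout.readline())
    wrapper.stdout.close()
    os.killpg(wrapper.pid, signum)  # reaches the wrapper, not the detached child
    wrapper.wait()
    try:
        os.kill(child_pid, 0)  # signal 0 only tests for existence
    except ProcessLookupError:
        return False
    os.kill(child_pid, signal.SIGKILL)  # clean up the stale child
    return True
```

Under these assumptions, child_survives(signal.SIGTERM) should report no stale child, because the wrapper gets a chance to forward the signal, while child_survives(signal.SIGKILL) should report one, because the wrapper dies before it can do anything.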