[ClusterLabs] [IMPORTANT] Fatal, yet rare issue verging on libqb's design flaw and/or its use in corosync around daemon-forking

Mon Jan 22 03:05:21 EST 2018

It was discovered that corosync exposes itself to a self-inflicted
crash under a rare circumstance: the corosync executable is run while
there is already a daemon instance around (this does not apply to
corosync serving without any backgrounding, i.e. launched with the
"-f" switch).

Such a circumstance can be provoked unattended by a third party,
incl. the "corosync -v" probe triggered internally by pcs (since
9e19af58 ~ 0.9.145), which is what makes the root cause of such an
inflicted crash difficult to guess and analyze (the other reason may
be a core dump that is elusive, if produced at all, due to fencing
kicking in, based on the few observed cases).

The problem comes from the fact that corosync is arranged such that
logging is set up very early, even before the main control flow
of the program starts.  Part of this early setup is also starting
the "blackbox" recording, which uses an mmap'd file stored in
/dev/shm whose name, moreover, only varies by the PID embedded in it.
When corosync then forks so as to detach itself from the environment
that started it, the original process exits and its PID is free to be
reused.  And when, against all odds, that PID gets reused by a fresh
new corosync process, this process happily mangles the file underneath
the former daemon, leading to crashes indicated by SIGBUS, rarely also
SIGFPE.
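
To see the situation described above on a live node, one can list the
PIDs embedded in the blackbox file names and check whether a process
with that PID currently exists.  This is only a rough sketch: the
"qb-corosync-<PID>-blackbox-*" naming pattern and the function name
blackbox_pids are assumptions, not something confirmed in this report,
so adjust the pattern to what /dev/shm actually holds.

```shell
#!/bin/sh
# Print each PID found in blackbox file names together with "in-use"
# (a process with that PID exists) or "free" (the PID can be reused).
# ASSUMPTION: files are named qb-corosync-<PID>-blackbox-<something>.
blackbox_pids() {
    dir=${1:-/dev/shm}
    for f in "$dir"/qb-corosync-*-blackbox-*; do
        [ -e "$f" ] || continue       # the glob matched nothing
        pid=${f##*/qb-corosync-}      # strip directory and prefix
        pid=${pid%%-blackbox-*}       # strip suffix, the PID remains
        echo "$pid"
    done | sort -un | while read -r pid; do
        if [ -d "/proc/$pid" ]; then  # Linux-specific liveness check
            echo "$pid in-use"
        else
            echo "$pid free"
        fi
    done
}
```

A PID reported as "free" while the daemon is running would match the
mechanism above: the file carries the pre-fork PID, which any later
process, including another corosync run, may receive.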

* * *

There are two quick mitigation techniques that can be readily applied:

1. turn the on-PATH corosync executable into a "careful" wrapper:

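   # keep the real binary around under a new name
   cp -a /sbin/corosync /sbin/corosync.orig
   # EOF is unquoted on purpose: $(...) is expanded right now, baking
   # the version text into the wrapper, so that a later "-v" never
   # runs the real binary; \$1 and \$@ stay literal in the wrapper
   > /sbin/corosync cat <<EOF
   #!/bin/sh
   test "\$1" != -v || { echo "$(/sbin/corosync.orig -v)"; exit 0; }
   exec /sbin/corosync.orig "\$@"
   EOF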

   (when using SELinux, check that the wrapper works and fix the
   contexts on these files if needed)

2. extend the PID space so as to push its wrap-around (a precondition
   for reproducing the issue) further into the future (hence making
   the critical moments less frequent and lowering the overall
   probability), for instance with the Linux kernel:

   echo 4194303 > /proc/sys/kernel/pid_max
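
Writing to /proc only lasts until the next reboot.  The following
is a small sketch of how the current limit could be inspected and a
persistent setting prepared; the helper names and the
/etc/sysctl.d/90-pid-max.conf file name are made up for illustration.
Note also that, per proc(5), 32-bit platforms cap pid_max at 32768,
so the value above only applies to 64-bit systems.

```shell
#!/bin/sh
# Print the current PID limit.  The optional path argument is just a
# (hypothetical) convenience for pointing the helper at another file.
current_pid_max() {
    cat "${1:-/proc/sys/kernel/pid_max}"
}

# Emit a sysctl.d-style line for the wanted limit; no side effects.
pid_max_conf_line() {
    echo "kernel.pid_max = ${1:-4194303}"
}

# As root, one might then persist and apply the setting like this
# (the file name is an assumption, pick one fitting local policy):
#   pid_max_conf_line > /etc/sysctl.d/90-pid-max.conf
#   sysctl --system
```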

* * *

To be able to claim this problem is fixed, each of the three mentioned
components will have to do its part to limit the problem in the future:

- corosync (do something new after fork?)

- libqb (be more careful about the crashing condition?)

- pcs (either find a different way to check "is-old-stack", or double
  check if the probe's PID doesn't happen to hit the one baked in
  existing files in /dev/shm?)

so as to cover the cases where the counterpart is not up to date; this
will also likely lead to augmenting and/or overloading the semantics
of libqb's API.  All of this is being worked on, stay tuned.
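
For illustration only, the double check suggested for pcs above might
be sketched in shell as follows.  Nothing here is what pcs actually
does or will do: the function names, the exit code 75 and the
"qb-corosync-<PID>-blackbox-*" file pattern are all assumptions.  The
trick relied upon is that exec keeps the PID of the shell it replaces,
so the shell can inspect its own PID before becoming the probe.

```shell
#!/bin/sh
# Succeed (return 0) when a blackbox file in DIR embeds PID in its name.
# ASSUMPTION: files are named qb-corosync-<PID>-blackbox-<something>.
blackbox_pid_taken() {
    dir=$1 pid=$2
    for f in "$dir"/qb-corosync-"$pid"-blackbox-*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Hypothetical probe: the inner shell checks its own PID ($$) against
# the existing files, then exec replaces it with the probed binary,
# which therefore runs under that very same, already checked PID.
safe_version_probe() {
    bin=${1:-corosync} dir=${2:-/dev/shm}
    sh -c '
        for f in "$2"/qb-corosync-$$-blackbox-*; do
            [ -e "$f" ] && exit 75    # collision: caller should retry
        done
        exec "$1" -v
    ' sh "$bin" "$dir"
}
```

A caller seeing exit code 75 would simply run the probe again, since
each new attempt gets a different PID.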

-- 
Jan (Poki)