[ClusterLabs] Pacemaker trades restart for migration on resource parameter change

Fri Mar 13 03:36:11 EDT 2020

Hi,

A user noticed that after changing a non-reloadable (unique) parameter
of resource A in our cluster, A wasn't restarted, contrary to
expectations.  On closer inspection it turned out that the parameter
change was coupled with a utilization change, which necessitated
shuffling resources around.  All but a few resources have allow-migrate
set to true.  Pacemaker decided to migrate resource B to make room for
the grown A, then migrate A to B's previous node.  And that was all: it
didn't restart A, which kept running with the old parameter value until
I manually restarted it later.

It happened with Pacemaker 1.1.16.  A further detail which might play a
role in the above: resource parameter modification in this setup is a
multi-step process to provide mutual exclusion.  First we create a dummy
unmanaged "lock" resource, then a shadow CIB, in which the parameter and
utilization changes are made and a simulation is run, then we commit the
shadow CIB and finally delete the "lock" resource.  This means that the
first transition triggered by the shadow commit is immediately aborted
by the "lock" removal on its heels, but surprisingly, the logs don't
separate these cleanly (A=vm-eduid-node5, B=vm-pws-web5, vm-rad* are
irrelevant, the 8 cluster nodes are vhbl0[1-8]):

16:25:42 crmd[11822]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
16:25:43 pengine[11821]:  warning: Processing failed op monitor for vm-rad02-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)
16:25:43 pengine[11821]:  warning: Processing failed op monitor for vm-rad03-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)
16:25:43 pengine[11821]:   notice: Migrate vm-eduid-node5#011(Started vhbl07 -> vhbl08)
16:25:43 pengine[11821]:   notice: Migrate vm-pws-web5#011(Started vhbl08 -> vhbl04)
16:25:43 pengine[11821]:   notice: Calculated transition 4376, saving inputs in /var/lib/pacemaker/pengine/pe-input-1584.bz2
16:25:43 pengine[11821]:  warning: Processing failed op monitor for vm-rad02-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)
16:25:43 pengine[11821]:  warning: Processing failed op monitor for vm-rad03-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl01
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl02
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl03
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl04
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl06
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl05
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl07
16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl08
16:25:44 pengine[11821]:   notice: Migrate vm-eduid-node5#011(Started vhbl07 -> vhbl08)
16:25:44 pengine[11821]:   notice: Migrate vm-pws-web5#011(Started vhbl08 -> vhbl04)
16:25:44 pengine[11821]:   notice: Calculated transition 4377, saving inputs in /var/lib/pacemaker/pengine/pe-input-1585.bz2
16:25:44 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 locally on vhbl08
16:25:44 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl07
16:25:44 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl05
16:25:44 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl06
16:25:44 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl04
16:25:44 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl03
16:25:44 crmd[11822]:   notice: Transition aborted by deletion of lrm_resource[@id='CIB_LOCK']: Resource state removal
16:25:45 crmd[11822]:   notice: Transition 4377 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=15, Source=/var/lib/pacemaker/pengine/pe-input-1585.bz2): Stopped
16:25:46 pengine[11821]:  warning: Processing failed op monitor for vm-rad02-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)
16:25:46 pengine[11821]:  warning: Processing failed op monitor for vm-rad03-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)
16:25:46 pengine[11821]:   notice: Removing CIB_LOCK from vhbl01
16:25:46 pengine[11821]:   notice: Removing CIB_LOCK from vhbl02
16:25:46 pengine[11821]:   notice: Migrate vm-eduid-node5#011(Started vhbl07 -> vhbl08)
16:25:46 pengine[11821]:   notice: Migrate vm-pws-web5#011(Started vhbl08 -> vhbl04)
16:25:46 pengine[11821]:   notice: Calculated transition 4378, saving inputs in /var/lib/pacemaker/pengine/pe-input-1586.bz2
16:25:46 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl02
16:25:46 crmd[11822]:   notice: Initiating delete operation CIB_LOCK_delete_0 on vhbl01
16:25:46 crmd[11822]:   notice: Initiating migrate_to operation vm-pws-web5_migrate_to_0 locally on vhbl08
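
To make the interleaving easier to follow, the scheduler decisions can
be grouped by the transition they were calculated for.  The following is
only a minimal Python sketch, assuming the syslog format of the excerpt
above (including the "#011" escaped tab after the resource name); the
name group_by_transition and the regular expressions are illustrative
and not part of Pacemaker.

```python
import re

# Assumed line shapes, taken from the excerpt above:
#   "... pengine[PID]:   notice: Calculated transition N, saving inputs in ..."
#   "... pengine[PID]:   notice: <Action> <resource>..."
CALC_RE = re.compile(r"pengine\[\d+\]:\s+notice:\s+Calculated transition (\d+)")
ACTION_RE = re.compile(r"pengine\[\d+\]:\s+notice:\s+(\w+)\s+([^\s#]+)")


def group_by_transition(lines):
    """Map each transition number to the (action, resource) pairs that
    pengine logged before the matching 'Calculated transition' line."""
    transitions = {}
    pending = []
    for line in lines:
        calc = CALC_RE.search(line)
        if calc:
            # The actions collected so far belong to this transition.
            transitions[int(calc.group(1))] = pending
            pending = []
            continue
        action = ACTION_RE.search(line)
        if action:
            pending.append((action.group(1), action.group(2)))
    return transitions


# A few lines copied from the excerpt above.
SAMPLE = [
    "16:25:43 pengine[11821]:  warning: Processing failed op monitor for vm-rad02-vh-dmz-sulinet-hu-eduroam on vhbl03: not running (7)",
    "16:25:43 pengine[11821]:   notice: Migrate vm-eduid-node5#011(Started vhbl07 -> vhbl08)",
    "16:25:43 pengine[11821]:   notice: Migrate vm-pws-web5#011(Started vhbl08 -> vhbl04)",
    "16:25:43 pengine[11821]:   notice: Calculated transition 4376, saving inputs in /var/lib/pacemaker/pengine/pe-input-1584.bz2",
    "16:25:44 pengine[11821]:   notice: Removing CIB_LOCK from vhbl01",
    "16:25:44 pengine[11821]:   notice: Migrate vm-eduid-node5#011(Started vhbl07 -> vhbl08)",
    "16:25:44 pengine[11821]:   notice: Calculated transition 4377, saving inputs in /var/lib/pacemaker/pengine/pe-input-1585.bz2",
]

for number, actions in sorted(group_by_transition(SAMPLE).items()):
    print(number, actions)
```

For the excerpt above, the only pengine actions such a grouping collects
for transitions 4376 to 4378 are Migrate and Removing entries.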

This looks like a resource management bug to me, but maybe we're doing
something wrong (certainly not optimally, please forgive that part).
Detailed logs, pe-input and CIB files are still around, but I need
advice about where to dig, so I'll be grateful for your comments.
-- 
Thanks,
Feri