[ClusterLabs] Pacemaker trades restart for migration on resource parameter change

Fri Mar 13 04:23:53 EDT 2020

"Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de> writes:

> <wferi at niif.hu> wrote on 13.03.2020 at 08:36 in message <19666_1584085000_5E6B3808_19666_1044_1_87imj8vh90.fsf at lant.ki.iif.hu>:
> 
>> A user noticed that after changing a non-reloadable (unique) parameter
>> of resource A in our cluster, A wasn't restarted as expected.  On closer
>> inspection it turned out that the parameter change was coupled with a
>> utilization change as well, which necessitated shuffling resources
>> around.  All but a few resources have allow-migrate set to true.
>
> It won't help you, but I also found it rather difficult to debug issues
> related to unresolvable utilization constraints, i.e. there is no good message
> when a resource cannot start due to utilization constraints.

No message at all. :)  But crm_simulate -LRU shows the remaining
capacities per node, and if all of them are less than the corresponding
utilization value of the resource, you've got one reason for it being
stopped.  But this wasn't the problem in my case: Pacemaker actually did
better than I expected by migrating another resource to make room for
the changed one; it just wrongly optimized the final move into a migration.
I'd be interested to learn what effort Pacemaker makes in such situations
(acknowledging that the knapsack problem is NP-hard in general).
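
To illustrate the capacity check described above, here is a minimal sketch
in Python.  It does not parse crm_simulate output; the node names,
attribute names and numbers are made up, and nodes_with_room is a
hypothetical helper, not part of Pacemaker.  It assumes that a node
qualifies only if its remaining capacity covers every utilization
attribute of the resource, and that an attribute missing on a node counts
as zero.

```python
def nodes_with_room(remaining, required):
    """Return the sorted names of nodes that can still hold the resource.

    remaining: {node name: {attribute: free capacity}}
    required:  {attribute: utilization value of the resource}
    """
    return sorted(
        node
        for node, free in remaining.items()
        # Assumption: an attribute the node does not define counts as 0.
        if all(free.get(attr, 0) >= need for attr, need in required.items())
    )


# Hypothetical remaining capacities, as one might read them off by hand.
remaining = {
    "node1": {"cpu": 2, "memory": 1024},
    "node2": {"cpu": 1, "memory": 4096},
}
# Hypothetical utilization values of the resource that stays stopped.
required = {"cpu": 2, "memory": 2048}

print(nodes_with_room(remaining, required))
```

An empty list here means no single node has enough room in every
attribute, which is one possible reason for the resource being stopped.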
-- 
Feri