[ClusterLabs] Antw: [EXT] Re: Stopping all nodes causes servers to migrate

Mon Jan 25 04:22:20 EST 2021

>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on 25.01.2021 at 09:51 in
message <20210125095132.575f55aa at firost>:
> Hi Digimer,
> 
> On Sun, 24 Jan 2021 15:31:22 -0500
> Digimer <lists at alteeve.ca> wrote:
> [...]
>>  I had a test server (srv01-test) running on node 1 (el8-a01n01), and on
>> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'.
>> 
>>   It appears like pacemaker asked the VM to migrate to node 2 instead of
>> stopping it. Once the server was on node 2, I couldn't use 'pcs resource
>> disable <vm>' as it returned that that resource was unmanaged, and the
>> cluster shut down was hung. When I directly stopped the VM and then did
>> a 'pcs resource cleanup', the cluster shutdown completed.
> 
> As actions during a cluster shutdown cannot be handled in the same transition
> for each node, I usually add a step to disable all resources using the property
> "stop-all-resources" before shutting down the cluster:
> 
>   pcs property set stop-all-resources=true
>   pcs cluster stop --all
> 
> But it seems there's a very new cluster property to handle that (IIRC, one or
> two releases ago). Look at the "shutdown-lock" doc:
> 
>   [...]
>   some users prefer to make resources highly available only for failures, with
>   no recovery for clean shutdowns. If this option is true, resources active on a
>   node when it is cleanly shut down are kept "locked" to that node (not allowed
>   to run elsewhere) until they start again on that node after it rejoins (or
>   for at most shutdown-lock-limit, if set).
>   [...]
> 
> [...]
>>   So as best as I can tell, pacemaker really did ask for a migration. Is
>> this the case?
> 
> AFAIK, yes, because each cluster shutdown request is handled independently at
> node level. There's a large door open for all kinds of race conditions if
> requests are handled with some random lag on each node.

Maybe it's time for <configuration><cluster
target-role="stopped">...</configuration> in the CIB ;-)
So many dubious new features are being implemented, but such essential things
aren't...
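
Until something like that exists, the two-step workaround quoted above could be
wrapped roughly as follows. This is only a sketch: the function names and the
DRY_RUN switch are made up for illustration, and the `crm_resource --wait` step
(wait for the cluster to settle before stopping it) is an extra precaution that
was not part of the quoted procedure.

```shell
#!/bin/sh
# Sketch only: stop every resource first, then stop the cluster on all nodes.
# "run", "stop_cluster_cleanly" and DRY_RUN are hypothetical names.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

stop_cluster_cleanly() {
    # 1. Ask Pacemaker to stop all resources in place (no migration).
    run pcs property set stop-all-resources=true &&
    # 2. Assumption: wait until the cluster has settled before going on.
    run crm_resource --wait &&
    # 3. Only now stop the cluster stack on every node.
    run pcs cluster stop --all
}
```

Setting DRY_RUN=1 before calling stop_cluster_cleanly only prints the three
commands. Since stop-all-resources is a cluster property stored in the CIB, it
has to be set back to false after the next cluster start, otherwise the
resources stay stopped.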

Regards,
Ulrich