[ClusterLabs] Antw: [EXT] Re: resource management of standby node

Mon Nov 30 09:05:03 EST 2020

>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 30.11.2020 at 14:18 in
message
<CAA91j0XLfztbSmCkDGM0Ofb2FBKCquwyhiEU8LV5WgiUU3H=iA at mail.gmail.com>:
> On Mon, Nov 30, 2020 at 3:11 PM Ulrich Windl
> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>
>> Hi!
>>
>> In SLES15 I'm surprised by what a standby node does: my guess was that a
>> standby node would stop all resources and then just "shut up", but it seems
>> it still tries to place resources and calls monitor operations.
>>
> 
> Standby nodes are ineligible for running resources. That does not stop
> Pacemaker from trying to place resources somewhere in the cluster.

But it's somewhat ridiculous if all nodes are in standby.

> 
>> Like this after a configuration change:
>> pacemaker-controld[49413]:  notice: Result of probe operation for prm_test_raid_md1 on h18: not running
>>
> 
> A probe is not a monitor. Normally it happens once when Pacemaker is
> started. It should not really be affected by putting a node in standby.

A configuration change triggered it. Again: It makes little sense if all nodes
are in standby: What action would be performed depending on the result of the
probe? None, I guess; so why probe?

> 
>> Or this (on the DC node):
>> pacemaker-schedulerd[69599]:  notice: Cannot pair prm_test_raid_md1:0 with instance of cln_DLM
>>
> 
> So? As mentioned, pacemaker still attempts to manage resources, it
> just excludes standby nodes from the list of possible candidates. If

But there are no (zero) candidates!

> all nodes are in standby mode, no resource can run anywhere, but
> pacemaker still needs to try placing resources to see it. Maybe you
> really want cluster maintenance mode instead.

I thought about that:
First put all nodes in standby to stop resources, then put all nodes in
maintenance mode, then edit configuration.
Then turn off maintenance mode for all nodes, then put them online again.

Sounds somewhat complicated.
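
Spelled out as crmsh commands, that sequence might look roughly like the sketch below. It is a dry run that only prints the commands and executes nothing. It assumes the crmsh subcommands `crm node standby`, `crm node maintenance`, `crm node ready`, `crm node online` and `crm configure rename` as I would expect them on SLES; the node name h16 and the new resource ID are made-up placeholders, only h18 and prm_test_raid_md1 come from the logs above.

```shell
#!/bin/sh
# Dry-run sketch: prints the crmsh commands for the sequence described
# above instead of running them. Nothing here touches the cluster.
# Usage: plan_rename OLD_ID NEW_ID NODE...

plan_rename() {
    old_id=$1
    new_id=$2
    shift 2

    # 1. Put every node in standby so that the resources get stopped.
    #    (Assumption: one would wait here until everything is stopped.)
    for node in "$@"; do
        echo "crm node standby $node"
    done

    # 2. Put every node in maintenance mode.
    for node in "$@"; do
        echo "crm node maintenance $node"
    done

    # 3. Edit the configuration, here a single rename.
    echo "crm configure rename $old_id $new_id"

    # 4. Leave maintenance mode, then bring the nodes online again.
    for node in "$@"; do
        echo "crm node ready $node"
    done
    for node in "$@"; do
        echo "crm node online $node"
    done
}

# Hypothetical example: h16 and the new ID are placeholders.
plan_rename prm_test_raid_md1 prm_raid_md1_test h16 h18
```

For two nodes that is already nine commands for one rename, which is what I mean by complicated.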

> 
>> Maybe I should have done it differently, but after a test setup I noticed that
>> I had named my primitives inconsistently, and wanted to mass-rename the
>> resources.
>> As renaming running resources had caused issues in the past, I wanted to stop
>> all resources before changing the configuration.
>> So I was expecting the cluster to be silent until I put at least one node
>> online again.
>>
>> Expectation failed. Is there a better way to do it?
>>
>> Regards,
>> Ulrich
>>
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>
>> ClusterLabs home: https://www.clusterlabs.org/ 