[ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

Vladislav Bogdanov bubble at hoster-ok.com
Tue Dec 19 13:51:36 EST 2023


What if a node (especially a VM) freezes for several minutes and then continues 
to write to a shared disk where other nodes have already put their data?
In my opinion, fencing, preferably two-level, is mandatory for Lustre; 
trust me, I developed the whole HA stack for both Exascaler and PangeaFS. 
We've seen so many points where data loss may occur...
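
For illustration, a two-level fencing topology with pcs could look roughly 
like this - a sketch only, with placeholder device names, agents, addresses 
and credentials, none of them taken from this thread:

    # level 1: BMC/IPMI fencing per node
    pcs stonith create fence_ipmi_lustre4 fence_ipmilan \
        ip=10.0.0.14 username=admin password=secret \
        pcmk_host_list=lustre4
    # level 2: fallback power-switch fencing for the same node
    pcs stonith create fence_pdu_lustre4 fence_apc_snmp \
        ip=10.0.0.100 pcmk_host_map="lustre4:4"
    pcs stonith level add 1 lustre4 fence_ipmi_lustre4
    pcs stonith level add 2 lustre4 fence_pdu_lustre4
    pcs property set stonith-enabled=true

If level 1 fails to confirm the kill, Pacemaker falls through to level 2 
before giving up.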

On December 19, 2023 19:42:56 Artem <tyomikh at gmail.com> wrote:
> Andrei and Klaus, thanks for the prompt reply and clarification!
> As I understand it, the design and behavior of Pacemaker are tightly coupled 
> with the STONITH concept. But isn't that too rigid?
>
> Is there a way to leverage self-monitoring or pingd rules to trigger an 
> isolated node to unmount its FS? Like the vSphere High Availability host 
> isolation response.
> Can resource-stickiness=0 (auto-failback, sketched below) decrease the risk 
> of corruption by an unresponsive node coming back online?
> Is there a quorum feature not for the cluster but for resource start/stop? 
> Got the lock - welcome to mount; unable to refresh the lease - force unmount.
> Can on-fail=ignore break manual failover logic (a stopped resource will be 
> considered failed and thus ignored)?
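>
> (A minimal sketch of that stickiness default with pcs - exact syntax 
> varies by pcs version, and the value 0 simply means "no stickiness":
>
>     pcs resource defaults update resource-stickiness=0
>
> so resources migrate back automatically once their preferred node 
> recovers.)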
>
> best regards,
> Artem
>
> On Tue, 19 Dec 2023 at 17:03, Klaus Wenninger <kwenning at redhat.com> wrote:
>
>
> On Tue, Dec 19, 2023 at 10:00 AM Andrei Borzenkov <arvidjaar at gmail.com> wrote:
> On Tue, Dec 19, 2023 at 10:41 AM Artem <tyomikh at gmail.com> wrote:
> ...
>> Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] 
>> (update_resource_action_runnable)    warning: OST4_stop_0 on lustre4 is 
>> unrunnable (node is offline)
>> Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] 
>> (recurring_op_for_active)    info: Start 20s-interval monitor for OST4 on 
>> lustre3
>> Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] 
>> (log_list_item)      notice: Actions: Stop       OST4        (     lustre4 
>> )  blocked
>
> This is the default for a failed stop operation. The only way
> Pacemaker can resolve a failure to stop a resource is to fence the node
> where the resource was active. If that is not possible (and IIRC you
> refuse to use stonith), Pacemaker has no choice but to block it.
> If you insist, you can of course set on-fail=ignore, but this means an
> unreachable node will continue to run resources. Whether that can lead
> to some corruption in your case I cannot guess.
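>
> (For illustration only: on-fail is an operation option, so with pcs it
> would be set on the stop operation roughly like
>
>     pcs resource update OST4 op stop on-fail=ignore
>
> reusing the OST4 resource name from your logs; I am not recommending
> this for shared storage.)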
>
> I don't know if I'm reading this correctly, but I understand from what you
> wrote above that you try to trigger the failover by stopping the VM (lustre4)
> without an ordered shutdown.
> With fencing disabled, what we are seeing is exactly what we would expect:
> the state of the resource is unknown - Pacemaker tries to stop it - that
> doesn't work as the node is offline - no fencing is configured - so all it
> can do is wait until there is info on whether the resource is up or not.
> I guess the strange output below is because fencing is disabled - quite an
> unusual, and not recommended, configuration - so this might not have
> shown up too often in that way.
>
> Klaus
>
>> Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] 
>> (pcmk__create_graph)         crit: Cannot fence lustre4 because of OST4: 
>> blocked (OST4_stop_0)
>
> That is a rather strange phrase. The resource is blocked because
> Pacemaker could not fence the node, not the other way round.
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
