[ClusterLabs] Correctly stop pacemaker on 2-node cluster with SBD and failed devices?

Tue Jun 15 13:48:04 EDT 2021

I'm using 'pcs cluster stop' (or its crm alternative), yet I'm not sure it will help in this case.

Most probably the safest way is to wait for the storage to be recovered, as without the pacemaker<->SBD communication, sbd will stop and the watchdog will be triggered.
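
A minimal sketch of that "wait for the storage" check, under stated assumptions: it assumes 'sbd -d <device> dump' exits non-zero when the SBD header cannot be read, the device path is a placeholder, and the function names and the SBD_BIN variable are made up for illustration. It only reports whether stopping looks safe; it does not stop anything itself.

```shell
#!/bin/sh
# Hypothetical device path; substitute the SBD device configured on your nodes.
SBD_DEVICE="${SBD_DEVICE:-/dev/disk/by-id/example-sbd}"
# Binary used for the check; overridable so the logic can be exercised without sbd.
SBD_BIN="${SBD_BIN:-sbd}"

# Succeeds only if the SBD header on the given device can be read.
sbd_device_ok() {
    "$SBD_BIN" -d "$1" dump >/dev/null 2>&1
}

# Prints "ok" if the device is readable, "wait" otherwise.
safe_to_stop() {
    if sbd_device_ok "$1"; then
        echo "ok"
    else
        echo "wait"
    fi
}
```

If 'safe_to_stop "$SBD_DEVICE"' prints "ok", the device is readable again and 'pcs cluster stop' is less likely to end in a watchdog reset; if it prints "wait", the storage is still not usable by sbd.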

Best Regards,
Strahil Nikolov

On Tuesday, 15 June 2021, 18:47:06 GMT+3, Andrei Borzenkov <arvidjaar at gmail.com> wrote:

On Tue, Jun 15, 2021 at 6:43 PM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>
> How did you stop pacemaker ?

systemctl stop pacemaker

surprise :)

> Usually I use 'pcs cluster stop' or its crm alternative.
>
> Best Regards,
> Strahil Nikolov
>
> On Tue, Jun 15, 2021 at 18:21, Andrei Borzenkov
> <arvidjaar at gmail.com> wrote:
> We had the following situation
>
> 2-node cluster with single device (just single external storage
> available). Storage failed. So SBD lost access to the device. Cluster
> was still up, both nodes were running.
>
> We thought that access to storage was restored, but one step was
> missing so devices appeared empty.
>
> At this point I tried to restart the pacemaker. But as soon as I
> stopped pacemaker SBD rebooted nodes - which is logical, as quorum was
> now lost.
>
> How to cleanly stop pacemaker in this case and keep nodes up?