[ClusterLabs] issue during Pacemaker failover testing

Andrei Borzenkov arvidjaar at gmail.com
Wed Aug 30 08:43:17 EDT 2023


On Wed, Aug 30, 2023 at 3:34 PM David Dolan <daithidolan at gmail.com> wrote:
>
> Hi All,
>
> I'm running Pacemaker on CentOS 7
> Name        : pcs
> Version     : 0.9.169
> Release     : 3.el7.centos.3
> Architecture: x86_64
>
>
> I'm performing some cluster failover tests in a 3-node cluster. We have 3 resources in the cluster.
> I was trying to see if I could get it working if 2 nodes fail at different times. I'd like the 3 resources to then run on one node.
>
> The quorum options I've configured are as follows
> [root at node1 ~]# pcs quorum config
> Options:
>   auto_tie_breaker: 1
>   last_man_standing: 1
>   last_man_standing_window: 10000
>   wait_for_all: 1
>
> [root at node1 ~]# pcs quorum status
> Quorum information
> ------------------
> Date:             Wed Aug 30 11:20:04 2023
> Quorum provider:  corosync_votequorum
> Nodes:            3
> Node ID:          1
> Ring ID:          1/1538
> Quorate:          Yes
>
> Votequorum information
> ----------------------
> Expected votes:   3
> Highest expected: 3
> Total votes:      3
> Quorum:           2
> Flags:            Quorate WaitForAll LastManStanding AutoTieBreaker
>
> Membership information
> ----------------------
>     Nodeid      Votes    Qdevice Name
>          1          1         NR node1 (local)
>          2          1         NR node2
>          3          1         NR node3
>
> If I stop the cluster services on node 2 and node 3, the groups all fail over to node 1, since it is the node with the lowest ID.
> But if I stop them on node 1 and node 2, or on node 1 and node 3, the cluster fails.
>
> I tried adding this line to corosync.conf. I could then bring down the services on node 1 and node 2, or on node 2 and node 3, but if I left node 2 until last, the cluster failed.
> auto_tie_breaker_node: 1  3
>
> This line had the same outcome as using 1 3
> auto_tie_breaker_node: 1  2 3
>
> So I'd like it to fail over when any combination of two nodes fails, but I've only had success when the middle node isn't last.
>

Use fencing. Quorum is not a replacement for fencing. With (reliable)
fencing you can simply run pacemaker with no-quorum-policy=ignore.
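
As a minimal sketch of that setting, assuming fencing (STONITH) devices are
already configured and tested in the cluster, the properties could be set
with pcs roughly as follows. These commands are an illustration and were not
part of the original exchange.

```shell
# Assumption: working, tested fencing devices already exist in the cluster.
# Make sure fencing is enabled (it is enabled by default).
pcs property set stonith-enabled=true

# Let Pacemaker keep managing resources even when quorum is lost.
# This is only safe when fencing is reliable.
pcs property set no-quorum-policy=ignore

# Check the resulting value.
pcs property show no-quorum-policy
```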

The practical problem is that the last resort that will work in all
cases is usually SBD + suicide, and SBD cannot work without quorum.

