[ClusterLabs] issue during Pacemaker failover testing

David Dolan daithidolan at gmail.com
Wed Aug 30 12:23:21 EDT 2023


> > Hi All,
> >
> > I'm running Pacemaker on CentOS 7
> > Name        : pcs
> > Version     : 0.9.169
> > Release     : 3.el7.centos.3
> > Architecture: x86_64
> >
> >
> > I'm performing some cluster failover tests in a 3-node cluster with
> > 3 resources.
> > I'm trying to see whether the cluster can keep running if 2 nodes fail
> > at different times; I'd like the 3 resources to then run on one node.
> >
> > The quorum options I've configured are as follows
> > [root@node1 ~]# pcs quorum config
> > Options:
> >   auto_tie_breaker: 1
> >   last_man_standing: 1
> >   last_man_standing_window: 10000
> >   wait_for_all: 1
> >
> > [root@node1 ~]# pcs quorum status
> > Quorum information
> > ------------------
> > Date:             Wed Aug 30 11:20:04 2023
> > Quorum provider:  corosync_votequorum
> > Nodes:            3
> > Node ID:          1
> > Ring ID:          1/1538
> > Quorate:          Yes
> >
> > Votequorum information
> > ----------------------
> > Expected votes:   3
> > Highest expected: 3
> > Total votes:      3
> > Quorum:           2
> > Flags:            Quorate WaitForAll LastManStanding AutoTieBreaker
> >
> > Membership information
> > ----------------------
> >     Nodeid      Votes    Qdevice Name
> >          1          1         NR node1 (local)
> >          2          1         NR node2
> >          3          1         NR node3
> >
> > If I stop the cluster services on node2 and node3, the groups all fail
> > over to node1, since it is the node with the lowest ID.
> > But if I stop them on node1 and node2, or on node1 and node3, the
> > cluster fails.
> >
> > I tried adding the line below to corosync.conf. I could then bring down
> > the services on node1 and node2, or on node2 and node3, but if node2 was
> > the last node standing, the cluster failed:
> > auto_tie_breaker_node: 1 3
> >
> > This line had the same outcome as using "1 3":
> > auto_tie_breaker_node: 1 2 3
> >
> > So I'd like the cluster to fail over when any combination of two nodes
> > fails, but I've only had success when the middle node isn't last.
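For what it's worth, the votequorum(5) man page documents auto_tie_breaker_node as taking either "lowest", "highest", or an ordered, space-separated list of node IDs, so a quorum section along these lines should be syntactically valid (a sketch; whether it gives the desired behaviour in combination with last_man_standing is a separate question):

```
quorum {
    provider: corosync_votequorum
    auto_tie_breaker: 1
    # preference order for the tie-breaker node
    auto_tie_breaker_node: 1 2 3
    last_man_standing: 1
    last_man_standing_window: 10000
}
```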
> >
>
> Use fencing. Quorum is not a replacement for fencing. With (reliable)
> fencing you can simply run Pacemaker with no-quorum-policy=ignore.
>
> The practical problem is that usually the last resort that will work
> in all cases is SBD + suicide, and SBD cannot work without quorum.
>
Ah, I forgot to mention that I do have fencing set up; it connects to
VMware vCenter.
Do you think it's safe to set no-quorum-policy=ignore?
Thanks
David
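If the VMware fencing is confirmed to be reliable, the property suggested above can be set and inspected with pcs, e.g. (commands assumed to be run on any cluster node):

```
pcs property set no-quorum-policy=ignore
pcs property show no-quorum-policy
```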


