[ClusterLabs] Quorum when reducing cluster from 3 nodes to 2 nodes

Hayden,Robert RHAYDEN at CERNER.COM
Tue Jun 8 17:49:02 EDT 2021



> -----Original Message-----
> From: Users <users-bounces at clusterlabs.org> On Behalf Of Tomas Jelinek
> Sent: Tuesday, June 1, 2021 3:40 AM
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] Quorum when reducing cluster from 3 nodes to 2
> nodes
>
> Hi Robert,
>
> Your corosync.conf looks fine to me, node app3 has been removed and
> two_node flag has been set properly. Try running 'pcs cluster reload
> corosync' and then check the quorum status. If it doesn't get fixed,
> take a look at /var/log/cluster/corosync.log to see if there are any
> issues reported.
>

Wanted to report back that "pcs cluster reload corosync" got the quorum votes back to the proper values.   It could be run while the cluster was up, with no impact to the resources.
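
For anyone who hits this later, a minimal sketch of the reload-and-verify sequence (assuming root on one of the remaining nodes; the two verification commands are optional cross-checks rather than part of the fix):

    pcs cluster reload corosync      # have corosync re-read corosync.conf without a restart
    pcs quorum status                # Expected votes / Quorum / Flags should now reflect two_node
    corosync-quorumtool -s           # same quorum view straight from corosync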
Thanks for the help!
Robert


> Regards,
> Tomas
>
>
> On 31. 05. 21 at 20:55, Hayden, Robert wrote:
> >
> >
> >> -----Original Message-----
> >> From: Users <users-bounces at clusterlabs.org> On Behalf Of Tomas Jelinek
> >> Sent: Monday, May 31, 2021 6:29 AM
> >> To: users at clusterlabs.org
> >> Subject: Re: [ClusterLabs] Quorum when reducing cluster from 3 nodes to 2 nodes
> >>
> >> Hi Robert,
> >>
> >> Can you share your /etc/corosync/corosync.conf file? Also check if it's
> >> the same on all nodes.
> >>
> >
> > I verified that the corosync.conf file is the same across the nodes.   As part
> > of the troubleshooting, I manually ran "crm_node --remove=app3 --force" to
> > remove the third node from the corosync configuration.   My concern is why
> > the quorum number did not automatically drop to a value of "1", especially
> > since we run with the "last_man_standing" flag.   I suspect the issue is in the
> > two-node special case.  That is, if I were removing a node from a 4+ node
> > cluster, I would not have had an issue.
> >
> > Here is the information you requested, slightly redacted for security.
> >
> > root:@app1:/root
> > #20:45:02 # cat /etc/corosync/corosync.conf
> > totem {
> >      version: 2
> >      cluster_name: XXXX_app_2
> >      secauth: off
> >      transport: udpu
> >      token: 61000
> > }
> >
> > nodelist {
> >      node {
> >          ring0_addr: app1
> >          nodeid: 1
> >      }
> >
> >      node {
> >          ring0_addr: app2
> >          nodeid: 3
> >      }
> > }
> >
> > quorum {
> >      provider: corosync_votequorum
> >      wait_for_all: 1
> >      last_man_standing: 1
> >      two_node: 1
> > }
> >
> > logging {
> >      to_logfile: yes
> >      logfile: /var/log/cluster/corosync.log
> >      to_syslog: yes
> > }
> > root:@app1:/root
> > #20:45:12 # ssh app2 md5sum /etc/corosync/corosync.conf
> > d69b80cd821ff44224b56ae71c5d731c  /etc/corosync/corosync.conf
> > root:@app1:/root
> > #20:45:30 # md5sum /etc/corosync/corosync.conf
> > d69b80cd821ff44224b56ae71c5d731c  /etc/corosync/corosync.conf
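
If those checksums ever diverged, one way to bring the nodes back in line without hand-editing each copy would be the following sketch (assuming the local corosync.conf is the one to keep):

    pcs cluster sync                 # push the local corosync.conf to all cluster nodes
    pcs cluster reload corosync      # then have corosync re-read it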
> >
> > Thanks
> > Robert
> >
> >> On 26. 05. 21 at 17:48, Hayden, Robert wrote:
> >>> I had a SysAdmin reduce the number of nodes in an OL 7.9 cluster from
> >>> three nodes to two nodes.
> >>>
> >>>   From internal testing, I found the following commands would work and
> >>> the 2Node attribute would be automatically added.  The other cluster
> >>> parameters we use are WaitForAll and LastManStanding.
> >>>
> >>> pcs resource disable res_app03
> >>>
> >>> pcs resource delete res_app03
> >>>
> >>> pcs cluster node remove app03
> >>>
> >>> pcs stonith delete fence_app03
> >>>
> >>> Unfortunately, the real world didn't go as planned.   I am unsure if the
> >>> commands were run out of order or something else was going on (e.g.
> >>> unexpected location constraints).   When I got involved, I noticed that
> >>> pcs status had the app3 node in an OFFLINE state, but the pcs cluster
> >>> node remove app03 command was successful.   I noticed some leftover
> >>> location constraints from past "moves" of resources.  I manually removed
> >>> those constraints, and I ended up removing the app03 node from the
> >>> corosync configuration with the "crm_node --remove=app3 --force" command.
> >>
> >> This removes the node from pacemaker config, not from corosync config.
> >>
> >> Regards,
> >> Tomas
> >>
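
To make that distinction concrete, a quick way to compare what pacemaker and corosync each believe about the membership (a sketch using standard pacemaker/corosync/pcs tooling):

    crm_node -l                        # node list as known to pacemaker
    pcs cluster corosync               # corosync.conf as distributed by pcs
    corosync-cmapctl | grep nodelist   # nodelist the running corosync is actually using

If app3 still appears in the last two, it was only removed on the pacemaker side.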
> >>>      Now pcs status no longer shows any information for app3 and
> >>> crm_node -l does not show app3.
> >>>
> >>> My concern is with Quorum.   From the pcs quorum status output below,
> >>> I still see Quorum set at 2 (expected to be 1) and the 2Node attribute was
> >>> not added.   Am I stuck in this state until the next full cluster
> >>> downtime?  Or is there a way to manipulate the expected quorum votes
> >>> in the running cluster?
> >>>
> >>> #17:25:08 # pcs quorum status
> >>> Quorum information
> >>> ------------------
> >>> Date:             Wed May 26 17:25:16 2021
> >>> Quorum provider:  corosync_votequorum
> >>> Nodes:            2
> >>> Node ID:          3
> >>> Ring ID:          1/85
> >>> Quorate:          Yes
> >>>
> >>> Votequorum information
> >>> ----------------------
> >>> Expected votes:   2
> >>> Highest expected: 2
> >>> Total votes:      2
> >>> Quorum:           2
> >>> Flags:            Quorate WaitForAll LastManStanding
> >>>
> >>> Membership information
> >>> ----------------------
> >>>       Nodeid      Votes    Qdevice Name
> >>>            1          1         NR app1
> >>>            3          1         NR app2 (local)
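
On the narrower question of adjusting expected votes in a running cluster: corosync does expose that knob (sketched below), though as far as I know votequorum will not accept an expected-votes value lower than the votes currently present, which is why reloading the config so that two_node takes effect is the cleaner fix here.

    pcs quorum expected-votes <count>    # set a new expected-votes value via pcs
    corosync-quorumtool -e <count>       # same thing, talking to corosync directly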
> >
> >
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/




More information about the Users mailing list