[ClusterLabs] Merging partitioned two_node cluster?
Andrei Borzenkov
arvidjaar at gmail.com
Tue May 5 01:50:17 EDT 2020
05.05.2020 06:39, Nickle, Richard wrote:
> I have a two node cluster managing a VIP. The service is an SMTP service.
> This could be active/active, it doesn't matter which node accepts the SMTP
> connection, but I wanted to make sure that a VIP was in place so that there
> was a well-known address.
>
> This service has been running for quite awhile with no problems. All of a
> sudden, it partitioned, and now I can't work out a good way to get them to
> merge the clusters back again. Right now one partition takes the resource
> and starts the VIP, but doesn't see the other node. The other node doesn't
> create a resource, and can't seem to see the other node.
>
> At this point, I am perfectly willing to create another node and make an
> odd-numbered cluster, the arguments for this being fairly persuasive. But
> I'm not sure why they are blocking.
>
> Surely there must be some manual way to get a partitioned cluster to
> merge?
It does so automatically if the nodes can communicate with each other. You
seem to have network connectivity issues that you need to
investigate and resolve.
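A minimal sketch of where that investigation might start, assuming the
standard Corosync 2.x command-line tools are installed. The helper name
run_if_present is hypothetical; mail2 and mail3 are the node names from the
nodelist quoted below. Run it on both nodes and compare the output.

```shell
#!/bin/sh
# Hypothetical helper: print the command, run it only if the tool exists,
# and report a non-zero exit status instead of aborting the script.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" || echo "== $1 exited with status $?"
    else
        echo "== $1: not installed, skipped"
    fi
}

# Ring status and the local address corosync is bound to.
run_if_present corosync-cfgtool -s

# The votequorum view on this node: expected votes, flags, member list.
run_if_present corosync-quorumtool -s

# What the nodelist names resolve to on this node.
run_if_present getent hosts mail2 mail3
```

If each node lists only itself as a member while both report quorum, that
would be consistent with the two pcs status outputs quoted below, and it
points at the network path between the nodes rather than at Pacemaker.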
> Some trick? I also had a scenario several weeks ago where an
> odd-numbered cluster configured in a similar way partitioned into a 3 and 2
> node cluster, and I was unable to work out how to get them to merge, until
> all of a sudden they seemed to fix themselves after doing a 'pcs node
> remove/pcs node add' which had failed many times before. I have tried that
> here but with no success so far.
>
> I ruled out some common cases I've seen in discussions and threads, such as
> having my host name defined in host as localhost, etc.
>
> Corosync 2.4.3, Pacemaker 0.9.164. (Ubuntu 18.04.).
>
> Output from pcs status for both nodes:
>
> Cluster name: mail
> Stack: corosync
> Current DC: mail2 (version 1.1.18-2b07d5c5a9) - partition with quorum
> Last updated: Mon May 4 23:28:53 2020
> Last change: Mon May 4 21:50:04 2020 by hacluster via crmd on mail2
>
> 2 nodes configured
> 1 resource configured
>
> Online: [ mail2 ]
> OFFLINE: [ mail3 ]
>
> Full list of resources:
>
> mail_vip (ocf::heartbeat:IPaddr2): Started mail2
>
> Daemon Status:
> corosync: active/enabled
> pacemaker: active/enabled
> pcsd: active/enabled
>
> Cluster name: mail
> Stack: corosync
> Current DC: mail3 (version 1.1.18-2b07d5c5a9) - partition with quorum
> Last updated: Mon May 4 22:13:10 2020
> Last change: Mon May 4 22:10:34 2020 by root via cibadmin on mail3
>
> 2 nodes configured
> 0 resources configured
>
> Online: [ mail3 ]
> OFFLINE: [ mail2 ]
>
> No resources
>
> Daemon Status:
> corosync: active/enabled
> pacemaker: active/enabled
> pcsd: active/enabled
>
> /etc/corosync/corosync.conf:
>
> totem {
> version: 2
> cluster_name: mail
> clear_node_high_bit: yes
> crypto_cipher: none
> crypto_hash: none
>
> interface {
> ringnumber: 0
> bindnetaddr: 192.168.80.128
> mcastport: 5405
> }
> }
>
Is the interconnect attached to LAN switches, or is it a direct cable between
the two hosts?
> logging {
> fileline: off
> to_stderr: no
> to_logfile: no
> to_syslog: yes
> syslog_facility: daemon
> debug: off
> timestamp: on
> }
>
> quorum {
> provider: corosync_votequorum
> wait_for_all: 0
> two_node: 1
> }
>
> nodelist {
> node {
> ring0_addr: mail2
> name: mail2
> nodeid: 1
> }
>
> node {
> ring0_addr: mail3
> name: mail3
> nodeid: 2
> }
> }
>
> Thanks!
>
> Rick