[ClusterLabs] Merging partitioned two_node cluster?

Tue May 5 03:26:03 EDT 2020

> On May 5, 2020 6:39:54 AM GMT+03:00, "Nickle, Richard" <rnickle at holycross.edu> wrote:
>> I have a two node cluster managing a VIP.  The service is an SMTP service.
>> This could be active/active, it doesn't matter which node accepts the SMTP
>> connection, but I wanted to make sure that a VIP was in place so that there
>> was a well-known address.
>>
>> This service has been running for quite a while with no problems.  All of a
>> sudden, it partitioned, and now I can't work out a good way to get them to
>> merge the clusters back again.  Right now one partition takes the resource
>> and starts the VIP, but doesn't see the other node.  The other node doesn't
>> create a resource, and can't seem to see the other node.
>>
>> At this point, I am perfectly willing to create another node and make an
>> odd-numbered cluster, the arguments for this being fairly persuasive.  But
>> I'm not sure why they are blocking.
>>
>> Surely there must be some manual way to get a partitioned cluster to
>> merge?  Some trick?  I also had a scenario several weeks ago where an
>> odd-numbered cluster configured in a similar way partitioned into a 3 and 2
>> node cluster, and I was unable to work out how to get them to merge, until
>> all of a sudden they seemed to fix themselves after doing a 'pcs node
>> remove/pcs node add' which had failed many times before.  I have tried that
>> here but with no success so far.
>>
>> I ruled out some common cases I've seen in discussions and threads, such as
>> having my host name defined in the hosts file as localhost, etc.
>>
>> Corosync 2.4.3, Pacemaker 1.1.18, pcs 0.9.164 (Ubuntu 18.04).
>>
>> Output from pcs status for both nodes:
>>
>> Cluster name: mail
>> Stack: corosync
>> Current DC: mail2 (version 1.1.18-2b07d5c5a9) - partition with quorum
>> Last updated: Mon May  4 23:28:53 2020
>> Last change: Mon May  4 21:50:04 2020 by hacluster via crmd on mail2
>>
>> 2 nodes configured
>> 1 resource configured
>>
>> Online: [ mail2 ]
>> OFFLINE: [ mail3 ]
>>
>> Full list of resources:
>>
>> mail_vip (ocf::heartbeat:IPaddr2): Started mail2
>>
>> Daemon Status:
>>   corosync: active/enabled
>>   pacemaker: active/enabled
>>   pcsd: active/enabled
>>
>> Cluster name: mail
>> Stack: corosync
>> Current DC: mail3 (version 1.1.18-2b07d5c5a9) - partition with quorum
>> Last updated: Mon May  4 22:13:10 2020
>> Last change: Mon May  4 22:10:34 2020 by root via cibadmin on mail3
>>
>> 2 nodes configured
>> 0 resources configured
>>
>> Online: [ mail3 ]
>> OFFLINE: [ mail2 ]
>>
>> No resources
>>
>> Daemon Status:
>>   corosync: active/enabled
>>   pacemaker: active/enabled
>>   pcsd: active/enabled
>>
>> /etc/corosync/corosync.conf:
>>
>> totem {
>>     version: 2
>>     cluster_name: mail
>>     clear_node_high_bit: yes
>>     crypto_cipher: none
>>     crypto_hash: none
>>
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 192.168.80.128
>>         mcastport: 5405
>>     }
>> }
>>
>> logging {
>>     fileline: off
>>     to_stderr: no
>>     to_logfile: no
>>     to_syslog: yes
>>     syslog_facility: daemon
>>     debug: off
>>     timestamp: on
>> }
>>
>> quorum {
>>     provider: corosync_votequorum
>>     wait_for_all: 0
>>     two_node: 1
>> }
>>
>> nodelist {
>>     node {
>>         ring0_addr: mail2
>>         name: mail2
>>         nodeid: 1
>>     }
>>
>>     node {
>>         ring0_addr: mail3
>>         name: mail3
>>         nodeid: 2
>>     }
>> }
>>
>> Thanks!
>>
>> Rick
> 
> Ah Rick, All,
> 
> Just ignore the previous one - I guess I'm too sleepy.

Honestly, I think your advice was good. The current config uses the
default transport, which for 2.4.3 means multicast, so trying unicast
(udpu) may solve the problem.
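
A minimal sketch of that change, assuming the rest of the posted
corosync.conf stays as it is: adding "transport: udpu" to the totem
section makes corosync 2.x use UDP unicast between the addresses already
listed in the nodelist. The comment line is only an annotation, and this
is untested against this particular cluster.

```
totem {
    version: 2
    cluster_name: mail
    # assumption: the only change, switching from multicast to UDP unicast
    transport: udpu
    clear_node_high_bit: yes
    crypto_cipher: none
    crypto_hash: none

    interface {
        ringnumber: 0
        bindnetaddr: 192.168.80.128
        mcastport: 5405
    }
}
```

The file has to be identical on mail2 and mail3, and corosync needs a
restart on both nodes to pick up the new transport.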

If not, I would take a look at the classic things like the firewall, ...

Regards,
   Honza

> 
> 
> Best Regards,
> Strahil Nikolov