[ClusterLabs] Antw: Re: Antw: After reboot each node thinks the other is offline.

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Aug 1 05:33:14 EDT 2017


>>> "Stephen Carville (HA List)" <62d2a7ca at opayq.com> wrote on 01.08.2017 at
10:05 in message <dbd30668-a50a-f910-dd13-69274831f19f at opayq.com>:
> On 07/31/2017 11:13 PM, Ulrich Windl [Masked] wrote:
> 
>>> I am experimenting with pacemaker for high availability for some load
>>> balancers.  I was able to successfully get two CentOS (6.9) machines
>>> (scahadev01da and scahadev01db) to form a cluster and the shared IP was
>>> assigned to scahadev01da.  I simulated a failure by halting the primary
>>> and the secondary eventually noticed bringing up the shared IP on its
>>> eth0.  So far, so good.
>>>
>>> A problem arises when the primary comes back up and, for some reason,
>>> each node thinks the other is offline.  This leads to both nodes adding
>> 
>> If a node thinks the other is unexpectedly offline, it will fence it, and
>> then it will be offline! Thus the IP can't run there. I guess you have no
>> fencing configured, right?
> 
> No. I didn't realize it was necessary unless there was shared storage
> involved.  I guess it is time to go back to the drawing board.  Can
> clustering even be done reliably on CentOS 6?  I have no objection to
> moving to 7 but I was hoping I could get this up quicker than building
> out a bunch of new balancers.
> 
> On a related note: I tried rebooting both nodes and each node still
> thinks the other is offline.  For future reference is there a way to
> clear that?

If you start both nodes and wait a while, each node should show both
nodes as online. If that does not happen, there is probably a
communication or configuration problem between them. Before investing
much time in the old version, I'd move forward to the current OS
(personal preference)...
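
A minimal sketch of checking for that symptom, assuming the `pcs status`
output of each node has been saved as text. The function names
`parse_membership` and `mutually_offline` are hypothetical, and the parser
only looks at the `Online:` / `OFFLINE:` lines as they appear in the status
output quoted in this thread; other pcs versions may format them differently.

```python
import re


def parse_membership(status_text):
    """Return (online, offline) sets of node names found in `pcs status` text."""
    online, offline = set(), set()
    for line in status_text.splitlines():
        # Drop mail quote markers such as ">>> " and surrounding whitespace.
        line = line.lstrip("> ").strip()
        match = re.match(r"(Online|OFFLINE):\s*\[(.*)\]", line)
        if match:
            target = online if match.group(1) == "Online" else offline
            target.update(match.group(2).split())
    return online, offline


def mutually_offline(status_a, status_b):
    """True when each report lists as offline a node the other lists as online."""
    online_a, offline_a = parse_membership(status_a)
    online_b, offline_b = parse_membership(status_b)
    return bool(online_a & offline_b) and bool(online_b & offline_a)
```

This only flags the symptom (each node online in its own view and offline in
the other's); it says nothing about whether the cause is the network, the
firewall, or the cluster configuration.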

Regards,
Ulrich

> 
>> Regards,
>> Ulrich
>> 
>>> the duplicate IP to its own eth0.  I probably do not need to tell you
>>> the mischief that can cause if these were production servers.
>>>
>>> I tried restarting cman, pcsd and pacemaker on both machines with no
>>> effect on the situation.
>>>
>>> I've found several mentions of it in the search engines but I've been
>>> unable to find how to fix it.  Any help is appreciated.
>>>
>>> Both nodes have quorum disabled in /etc/sysconfig/cman
>>>
>>> CMAN_QUORUM_TIMEOUT=0
>>>
>>> #------------------------------------------------
>>> Node 1
>>>
>>> scahadev01da# sudo pcs status
>>> Cluster name: scahadev01d
>>> Stack: cman
>>> Current DC: scahadev01da (version 1.1.15-5.el6-e174ec8) - partition
>>> WITHOUT quorum
>>> Last updated: Mon Jul 31 10:43:54 2017		Last change: Mon Jul 31 10:30:46
>>> 2017 by root via cibadmin on scahadev01da
>>>
>>> 2 nodes and 1 resource configured
>>>
>>> Online: [ scahadev01da ]
>>> OFFLINE: [ scahadev01db ]
>>>
>>> Full list of resources:
>>>
>>>  VirtualIP	(ocf::heartbeat:IPaddr2):	Started scahadev01da
>>>
>>> Daemon Status:
>>>   cman: active/enabled
>>>   corosync: active/disabled
>>>   pacemaker: active/enabled
>>>   pcsd: active/enabled
>>>
>>> #------------------------------------------------
>>> Node 2
>>>
>>> scahadev01db ~]$ sudo pcs status
>>> Cluster name: scahadev01d
>>> Stack: cman
>>> Current DC: scahadev01db (version 1.1.15-5.el6-e174ec8) - partition
>>> WITHOUT quorum
>>> Last updated: Mon Jul 31 10:43:47 2017		Last change: Sat Jul 29 13:45:15
>>> 2017 by root via cibadmin on scahadev01da
>>>
>>> 2 nodes and 1 resource configured
>>>
>>> Online: [ scahadev01db ]
>>> OFFLINE: [ scahadev01da ]
>>>
>>> Full list of resources:
>>>
>>>  VirtualIP	(ocf::heartbeat:IPaddr2):	Started scahadev01db
>>>
>>> Daemon Status:
>>>   cman: active/enabled
>>>   corosync: active/disabled
>>>   pacemaker: active/enabled
>>>   pcsd: active/enabled
>>>
>>> --
>>> Stephen Carville
>>>
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org 
>>> http://lists.clusterlabs.org/mailman/listinfo/users 
>>>
>>> Project Home: http://www.clusterlabs.org 
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>> Bugs: http://bugs.clusterlabs.org 