[ClusterLabs] Re: After reboot each node thinks the other is offline.
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Tue Aug 1 02:13:58 EDT 2017
>>> "Stephen Carville (HA List)" <62d2a7ca at opayq.com> wrote on 31.07.2017 at
20:17 in message <d08c264a-6a84-b32b-049c-82d5ea929f3a at opayq.com>:
> I am experimenting with pacemaker for high availability for some load
> balancers. I was able to successfully get two CentOS (6.9) machines
> (scahadev01da and scahadev01db) to form a cluster and the shared IP was
> assigned to scahadev01da. I simulated a failure by halting the primary
> and the secondary eventually noticed, bringing up the shared IP on its
> eth0. So far, so good.
>
> A problem arises when the primary comes back up and, for some reason,
> each node thinks the other is offline. This leads to both nodes adding
If a node thinks the other is unexpectedly offline, it will fence that node, and then the fenced node really will be offline! Thus the IP can't run there. I guess you have no fencing configured, right?
Regards,
Ulrich
> the duplicate IP to its own eth0. I probably do not need to tell you
> the mischief that can cause if these were production servers.
>
> I tried restarting cman, pcsd and pacemaker on both machines with no
> effect on the situation.
>
> I've found several mentions of it in the search engines but I've been
> unable to find how to fix it. Any help is appreciated.
>
> Both nodes have quorum disabled in /etc/sysconfig/cman
>
> CMAN_QUORUM_TIMEOUT=0
>
> #------------------------------------------------
> Node 1
>
> scahadev01da# sudo pcs status
> Cluster name: scahadev01d
> Stack: cman
> Current DC: scahadev01da (version 1.1.15-5.el6-e174ec8) - partition
> WITHOUT quorum
> Last updated: Mon Jul 31 10:43:54 2017 Last change: Mon Jul 31 10:30:46
> 2017 by root via cibadmin on scahadev01da
>
> 2 nodes and 1 resource configured
>
> Online: [ scahadev01da ]
> OFFLINE: [ scahadev01db ]
>
> Full list of resources:
>
> VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01da
>
> Daemon Status:
> cman: active/enabled
> corosync: active/disabled
> pacemaker: active/enabled
> pcsd: active/enabled
>
> #------------------------------------------------
> Node 2
>
> scahadev01db ~]$ sudo pcs status
> Cluster name: scahadev01d
> Stack: cman
> Current DC: scahadev01db (version 1.1.15-5.el6-e174ec8) - partition
> WITHOUT quorum
> Last updated: Mon Jul 31 10:43:47 2017 Last change: Sat Jul 29 13:45:15
> 2017 by root via cibadmin on scahadev01da
>
> 2 nodes and 1 resource configured
>
> Online: [ scahadev01db ]
> OFFLINE: [ scahadev01da ]
>
> Full list of resources:
>
> VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01db
>
> Daemon Status:
> cman: active/enabled
> corosync: active/disabled
> pacemaker: active/enabled
> pcsd: active/enabled
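The two status outputs above show the symptom directly: each node lists itself as online and its peer as offline, and each reports VirtualIP as started locally. As a rough illustration only, here is a minimal Python sketch that compares two pasted `pcs status` texts and flags that pattern. The function names parse_pcs_status and split_brain_symptoms are hypothetical, and the parsing assumes the line layout shown in the output above (it is not a pcs API). It only reads text; it does not replace fencing.

```python
import re

def parse_pcs_status(text):
    """Pull node lists and resource placement out of pasted `pcs status` text.

    Assumes the line layout seen in the quoted output (hypothetical helper).
    """
    online, offline, started = set(), set(), {}
    for raw in text.splitlines():
        # Drop mail quoting ("> ") so quoted output can be pasted unchanged.
        line = re.sub(r"^\s*>\s?", "", raw)
        m = re.match(r"\s*(Online|OFFLINE):\s*\[(.*?)\]", line)
        if m:
            nodes = set(m.group(2).split())
            (online if m.group(1) == "Online" else offline).update(nodes)
            continue
        # e.g. "VirtualIP (ocf::heartbeat:IPaddr2): Started scahadev01da"
        m = re.match(r"\s*(\S+)\s+\((\S+)\):\s+Started\s+(\S+)", line)
        if m:
            started[m.group(1)] = m.group(3)
    return {"online": online, "offline": offline, "started": started}

def split_brain_symptoms(status_a, status_b):
    """Return a list of symptom labels found when comparing two nodes' views."""
    a, b = parse_pcs_status(status_a), parse_pcs_status(status_b)
    symptoms = []
    # Each side marks as offline a node that the other side sees as online.
    if (a["offline"] & b["online"]) and (b["offline"] & a["online"]):
        symptoms.append("mutual-offline")
    # The same resource is reported as started on different nodes.
    for res in sorted(set(a["started"]) & set(b["started"])):
        if a["started"][res] != b["started"][res]:
            symptoms.append("duplicate:" + res)
    return symptoms
```

Calling split_brain_symptoms() with the Node 1 and Node 2 outputs as its two arguments should report both the mutual-offline view and the duplicated VirtualIP.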
>
> --
> Stephen Carville