[ClusterLabs] data loss of network would cause Pacemaker exit abnormally
kgaillot at redhat.com
Wed Aug 31 11:39:52 EDT 2016
On 08/30/2016 01:58 PM, chenhj wrote:
> This is a continuation of the email below (I did not subscribe to this mailing list)
>>From the above, I suspect that the node with the network loss was the
>>DC, and from its point of view, it was the other node that went away.
> Yes, the node with the network loss was the DC (node2).
> Could someone explain what the following messages mean, and
> why the pacemakerd process exits instead of rejoining the CPG group?
>> Aug 27 12:33:59  node3 pacemakerd: error: pcmk_cpg_membership:
>> We're not part of CPG group 'pacemakerd' anymore!
This means the node was kicked out of the membership. I don't remember
exactly what that implies; I'm guessing the node exits because the cluster
will most likely fence it after kicking it out.
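The reaction described above can be sketched as follows. This is an
illustrative model of a CPG configuration-change handler, not Pacemaker's
actual code; the function and parameter names (`on_confchg`,
`local_nodeid`) are invented for the example. The real corosync CPG API
delivers the current member list plus join/leave lists to a C callback,
and the key check is simply whether the local node ID is still a member:

```python
# Illustrative sketch (not Pacemaker source): how a CPG configuration-change
# handler can detect that the local node was kicked out of the group.

def on_confchg(local_nodeid, member_list, left_list):
    """Return an action for the local daemon after a membership change.

    local_nodeid: this node's corosync node ID (hypothetical parameter)
    member_list:  node IDs currently in the CPG group
    left_list:    node IDs that just left the group
    """
    if local_nodeid not in member_list:
        # This is the situation behind the log message:
        #   "We're not part of CPG group 'pacemakerd' anymore!"
        # Exiting is the safe reaction, since the remaining partition
        # will most likely fence this node anyway.
        return "exit"
    if left_list:
        return "peers-left"   # other nodes departed; update the peer cache
    return "no-change"

# Example: after the change, node 3 sees a membership of {1, 2},
# so from its point of view it has been removed from the group.
print(on_confchg(3, member_list={1, 2}, left_list={3}))  # exit
print(on_confchg(1, member_list={1, 2}, left_list={3}))  # peers-left
```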
>>> [root@node3 ~]# rpm -q corosync
>>That is quite old ...
>>> [root@node3 ~]# cat /etc/redhat-release
>>> CentOS release 6.3 (Final)
>>> [root@node3 ~]# pacemakerd -F
>> Pacemaker 1.1.14-1.el6 (Build: 70404b0)
>>and I doubt that many people have tested Pacemaker 1.1.14 against
>>corosync 1.4.1 ... quite far away from
>>each other release-wise ...
> pacemaker 1.1.14 + corosync 1.4.7 can also reproduce this problem, but
> seemingly with a lower probability.
The corosync 2 series is a major improvement, but some config changes
are needed when upgrading.