[Pacemaker] Reinstall Pacemaker/Corosync.

Mon Nov 30 17:30:47 EST 2015

On 11/24/2015 04:28 AM, emmanuel segura wrote:
> I don't remember exactly, but I think on Red Hat 6.5 you need to use
> cman+pacemaker. Please share your config, and make sure you have
> fencing configured.

Yes, the versions in 6.5 are quite old; 6.7 has recent versions, so if
you can upgrade, that would help. Even 6.6 is significantly newer and
has important bugfixes.

RHEL 6 does use corosync 1, but via CMAN rather than directly.

You can use the pcs command to configure and deconfigure the cluster
(pcs cluster node add/remove for one node, or pcs cluster setup/destroy
for the entire cluster).
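
As a rough sketch of the wipe-and-recreate path, something like the
following could be used. The cluster name "mycluster" and the node names
"node1"/"node2" are placeholders, and the "--name" form of "pcs cluster
setup" assumes the pcs 0.9 syntax; check "pcs cluster help" on your
version before relying on it. The run() helper is only there so the
commands are printed rather than executed by default.

```shell
#!/bin/sh
# Sketch only: the cluster and node names are hypothetical placeholders.
CLUSTER_NAME="mycluster"
NODES="node1 node2"

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Intended to be run on EACH node: stops the cluster services on this
# node and removes its local cluster configuration.
wipe_local_node() {
    run pcs cluster destroy
}

# Intended to be run on ONE node after every node has been wiped.
# Assumes pcs 0.9 syntax; depending on the pcs version, the nodes may
# first need to be authenticated to each other.
create_cluster() {
    # $NODES is deliberately unquoted so each node is a separate argument.
    run pcs cluster setup --name "$CLUSTER_NAME" $NODES
    run pcs cluster start --all
}
```

With the default DRY_RUN=1, calling wipe_local_node or create_cluster
only prints the commands, so the sequence can be reviewed before
anything is actually destroyed.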

> 2015-11-24 11:18 GMT+01:00 Cayab, Jefrey E. <jcayab at gmail.com>:
>> Hi all,
>>
>> I searched online but couldn't find a detailed answer. OS is RHEL 6.5.
>>
>> Problem:
>> I have 2 servers which were set up fine (a MySQL cluster runs on them, with
>> DRBD for the data disk on local disk), and these 2 servers needed to be
>> migrated to another location. During the migration, DRBD had to change from
>> local disk to a SAN LUN. That migration went OK, but the cluster began
>> showing weird behavior. Then the 2 nodes were shut down and booted together.
>> Each server can see the other as online via "crm_mon -1", but when the
>> pacemaker process on one node is restarted, that node's status as seen from
>> the other node stays offline/stopped. Even if I reboot that node, it
>> doesn't rejoin the cluster.
>>
>> Other observation: if these 2 servers boot up together, both show as online
>> as above, and when I stop the pacemaker process on the Active node, the other
>> node takes over the resources, which is good. But even if I start the
>> pacemaker process again on the other node, it is not able to take the
>> resources back. It seems that only one failover can happen, with no failback.
>>
>>
>> What I did:
>> I removed Pacemaker and Corosync via YUM
>> Rebooted the OS
>> Verified no more Pacemaker/Corosync packages
>> Reinstalled Pacemaker and Corosync via YUM
>> When I ran "crm_mon -1", I was surprised to see that the configuration was
>> still there.
>>
>> After the reinstallation, I still see the same behavior, and I noticed that
>> DRBD reports a Failed disk; only a reboot of the node brings it back to
>> UpToDate.
>>
>> Please advise on the correct procedure to wipe out the configuration and
>> reinstall.
>>
>> I will share the logs shortly.
>>
>> Thanks,
>> Jef