[Pacemaker] corosync restarts service when slave node joins the cluster

Andrew Beekhof andrew at beekhof.net
Wed May 1 18:44:25 EDT 2013


On 01/05/2013, at 11:55 PM, Babu Challa <Babu.Challa at ipaccess.com> wrote:

> Hi Andrew, 
> 
> Thanks for the reply. I have now managed to reproduce the issue. I am enclosing the details here for the Pacemaker team and would appreciate their advice on resolving it.

Update your software.

> 
> Hello Pacemaker team, 
> 
> This is a 4-node HA cluster where each pair of nodes is configured for DB and file system replication. We have a very tricky situation. We have configured two clusters with exactly the same configuration on each. But on one cluster, corosync restarts the services when the slave node is rebooted and rejoins the cluster.
> 
> I have managed to reproduce the issue on another cluster with the following steps:
> 
> 1.	Reboot the master and confirm the switchover.
> 2.	Wait until the old master rejoins the cluster as slave.
> 3.	Now, on the old master (current slave) ...
> 4.	Stop corosync.
> 5.	Bring down bond2 (bond2 is configured for communication between the nodes).
> 6.	Bring up bond2.
> 7.	Start corosync.
> 8.	When the slave joins the cluster, I can see services stop and start on the master.
> 9.	The issue doesn't appear if we reboot the other node when it is slave.
> 10.	So the bottom line is that the service manager is restarting only when the configured master has become the slave and we perform the above steps on that server.
> 
> Can you please advise whether this issue can be resolved by upgrading to a newer version of pacemaker/corosync? If possible, can you send me the change log of the particular version in which the issue was fixed?
> 
> Versions we are using:
> 
> Pacemaker version - pacemaker-1.1.5 
> Corosync version - corosync-1.2.7
> heartbeat-3.0.3-2.3
> 
> R
> Babu Challa 
> T: +44 (0) 1954 717972 | M: +44 (0) 7912 859958| E: babu.challa at ipaccess.com | W: www.ipaccess.com
> ip.access Ltd, Building 2020, Cambourne Business Park, Cambourne, Cambridge, CB23 6DW
> 
> The desire to excel is exclusive of the fact whether someone else appreciates it or not. "Excellence" is a drive from inside, not outside. Excellence is not for someone else to notice but for your own satisfaction and efficiency...
> 
> 
> -----Original Message-----
> From: Andrew Beekhof [mailto:andrew at beekhof.net] 
> Sent: 01 May 2013 01:20
> To: Babu Challa
> Cc: The Pacemaker cluster resource manager
> Subject: Re: corosync restarts service when slave node joins the cluster
> 
> Please ask questions on the mailing lists.
> 
> On 01/05/2013, at 12:30 AM, Babu Challa <Babu.Challa at ipaccess.com> wrote:
> 
>> Hi Andrew,
>> 
>> Greetings,
>> 
>> We are using corosync/pacemaker for high availability.
>> 
>> This is a 4-node HA cluster where each pair of nodes is configured 
>> for DB and file system replication. We have a very tricky situation. We have configured two clusters with exactly the same configuration on each. But on one cluster, corosync restarts the services when the slave node is rebooted and rejoins the cluster.
>> 
>> We have tried to reproduce the issue on the other cluster with multiple HA 
>> scenarios, but with no luck.
>> 
>> A few questions:
>> 
>> 1.       If the rebooted slave is the DC (Designated Controller), is there any possibility of this issue occurring?
>> 2.       Is there any known issue in the pacemaker version we are currently using (1.1.5) that would be resolved if we upgraded to the latest (1.8)?
> 
> I believe there was one; check the ChangeLog.
> 
>> 3.       Is there any chance that pacemaker/corosync behaves differently even though the configuration is the same on each cluster?
> 
> Timing issues do occur. How identical is the hardware?
> 
>> 4.       Can you please let us know if there is any possible reason for this issue? That would really help us reproduce the issue and fix it.
> 
> More than likely it has been fixed in a later version.
> 
>> 
>> Versions we are using:
>> 
>> Pacemaker version - pacemaker-1.1.5
>> Corosync version - corosync-1.2.7
>> heartbeat-3.0.3-2.3
>> 
>> R
>> Babu Challa
>> 
>> This message contains confidential information and may be privileged. If you are not the intended recipient, please notify the sender and delete the message immediately.
>> 
>> ip.access ltd, registration number 3400157, Building 2020, Cambourne 
>> Business Park, Cambourne, Cambridge CB23 6DW, United Kingdom
>> 
>> 
>> 