[ClusterLabs] Why Do All The Services Go Down When Just One Fails?

Eric Robinson eric.robinson at psmnv.com
Sat Feb 16 21:33:42 UTC 2019


> -----Original Message-----
> From: Users <users-bounces at clusterlabs.org> On Behalf Of Valentin Vidic
> Sent: Saturday, February 16, 2019 1:28 PM
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] Why Do All The Services Go Down When Just One
> Fails?
> 
> On Sat, Feb 16, 2019 at 09:03:43PM +0000, Eric Robinson wrote:
> > Here are the relevant corosync logs.
> >
> > It appears that the stop action for resource p_mysql_002 failed, and
> > that caused a cascading series of service changes. However, I don't
> > understand why, since no other resources are dependent on p_mysql_002.
> 
> The stop failed because of a timeout (15s), so you can try to update that
> value:
> 


I just noticed that. I also noticed that the LSB init script has a hard-coded stop timeout of 30 seconds. So if the init script waits longer than the cluster's 15s stop timeout for the resource, the stop operation would be reported as failed. However, I don't want cluster failover to be triggered by the failure of one of the MySQL resources. I only want failover to occur if the filesystem or DRBD resources fail, or if the cluster messaging layer detects a complete node failure. Is there a way to tell Pacemaker not to trigger cluster failover if any of the p_mysql resources fail?
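
To make the timeout mismatch concrete, here is a minimal sketch in Python. It is a toy model, not Pacemaker or init-script code: the function name and the 20-second shutdown time are assumptions for illustration; only the 30s script limit and the 15s cluster timeout come from this thread.

```python
def stop_result(script_max_wait_s: float, actual_stop_s: float, op_timeout_s: float) -> str:
    """Toy model of a stop operation (hypothetical helper, not a Pacemaker API).

    script_max_wait_s: hard-coded limit inside the init script (30s here)
    actual_stop_s:     how long the daemon really needs to shut down (assumed)
    op_timeout_s:      timeout the cluster applies to the stop operation (15s here)
    """
    # The init script returns when the daemon has stopped or when its own
    # hard-coded limit expires, whichever comes first.
    script_returns_after = min(actual_stop_s, script_max_wait_s)

    # The cluster gives up first if the script is still running at its timeout.
    if script_returns_after > op_timeout_s:
        return "Timed Out"
    # The script returned in time: success only if the daemon really stopped.
    if actual_stop_s <= script_max_wait_s:
        return "ok"
    return "error"


# A shutdown that needs 20s (assumed value) fits the script's 30s limit
# but not the cluster's 15s timeout.
print(stop_result(script_max_wait_s=30, actual_stop_s=20, op_timeout_s=15))  # Timed Out
print(stop_result(script_max_wait_s=30, actual_stop_s=20, op_timeout_s=60))  # ok
```

In this model the stop only succeeds reliably when the cluster's stop timeout is longer than the script's own limit, which is why raising the 15s value looks like the first thing to try.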


>   Result of stop operation for p_mysql_002 on 001db01a: Timed Out |
> call=1094 key=p_mysql_002_stop_0 timeout=15000ms
> 
> After the stop failed it should have fenced that node, but you don't have
> fencing configured so it tries to move mysql_002 and all the other resources
> related to it (vip, fs, drbd) to the other node.
> Since other mysql resources depend on the same (vip, fs, drbd) they need to
> be stopped first.
> 
> --
> Valentin
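
The cascade described above can be sketched as a toy dependency model in Python. This is only an illustration of the reasoning, not how Pacemaker's scheduler is implemented. The names p_mysql_001 and p_mysql_003 are hypothetical stand-ins for the other MySQL resources, and the exact dependency edges (mysql on vip and fs, fs on drbd) are an assumption.

```python
# Assumed dependency map: each MySQL instance needs the shared vip and fs,
# and fs in turn needs drbd. Only p_mysql_002, vip, fs and drbd are named
# in the thread; the other instance names are hypothetical.
DEPENDS_ON = {
    "p_mysql_001": ["vip", "fs"],
    "p_mysql_002": ["vip", "fs"],
    "p_mysql_003": ["vip", "fs"],
    "fs": ["drbd"],
    "drbd": [],
}


def base_stack(resource, depends_on):
    """Return everything `resource` needs, directly or indirectly."""
    seen = set()
    todo = list(depends_on.get(resource, []))
    while todo:
        dep = todo.pop()
        if dep not in seen:
            seen.add(dep)
            todo.extend(depends_on.get(dep, []))
    return seen


def stopped_by_move(resource, depends_on):
    """Return the other resources that must stop if `resource` relocates
    together with the stack it depends on."""
    moving = base_stack(resource, depends_on)
    affected = set()
    for other in depends_on:
        if other == resource or other in moving:
            continue
        # Any resource sharing part of the moving stack has to stop first.
        if base_stack(other, depends_on) & moving:
            affected.add(other)
    return affected


print(sorted(stopped_by_move("p_mysql_002", DEPENDS_ON)))
# ['p_mysql_001', 'p_mysql_003']
```

Under these assumptions, relocating p_mysql_002 drags vip, fs and drbd along with it, so every other instance sitting on that stack has to stop first, even though none of them depends on p_mysql_002 itself.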

