[ClusterLabs] Why Do All The Services Go Down When Just One Fails?

Sat Feb 16 21:42:05 UTC 2019

On Sat, Feb 16, 2019 at 09:33:42PM +0000, Eric Robinson wrote:
> I just noticed that. I also noticed that the lsb init script has a
> hard-coded stop timeout of 30 seconds. So if the init script waits
> longer than the cluster resource timeout of 15s, that would cause the

Yes, the stop timeout in Pacemaker should be higher than the init
script's 30 seconds (45s, for example).

> resource to fail. However, I don't want cluster failover to be
> triggered by the failure of one of the MySQL resources. I only want
> cluster failover to occur if the filesystem or drbd resources fail, or
> if the cluster messaging layer detects a complete node failure. Is
> there a way to tell PaceMaker not to trigger cluster failover if any
> of the p_mysql resources fail?  

You can try the on-fail option of the resource operations, but I'm not
sure how reliably this whole setup will work without some form of
fencing/STONITH.

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_resource_operations.html
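
As a rough sketch of both suggestions in crm shell syntax: the resource
name p_mysql_001, the lsb:mysql_001 agent and the monitor interval are
placeholders I made up, and the on-fail value is only one of the options
to experiment with, not something I have tested on this setup.

```
# Hypothetical resource and agent names; adjust to the real configuration.
# The stop timeout is raised above the init script's 30s stop wait.
primitive p_mysql_001 lsb:mysql_001 \
    op stop interval=0 timeout=45s \
    op monitor interval=15s timeout=15s on-fail=ignore
```

With on-fail=ignore a failed monitor is not acted on at all, so a dead
instance would stay down until someone intervenes. The other documented
values (block, stop, restart, fence, standby) are described on the page
linked above.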

-- 
Valentin