[ClusterLabs] Antw: Failover due to intermittent network partition.

Fri Aug 11 07:49:55 UTC 2017

>>> neeraj ch <nwarriorch at gmail.com> wrote on 10.08.2017 at 00:04 in message
<CAKvLDn6jUVtWUQjVw-hPm7JCDU+MwUPpgbOSfFxRgsGdWBpkww at mail.gmail.com>:
> Hello,
> 
> I have a three node pacemaker cluster running using the Master Slave
> primitive. It's an asymmetric cluster with one node running as master and
> the other one as slave, the third one serving as a quorum node.
> 
> Our DC has frequent network issues. One thing, in particular, is the master
> partition that happens occasionally. The partition usually occurs for 5-6
> seconds but still, the resource on the node is stopped and a new master
> election happens.
> 
> Is there a way to delay the election by, say, 10-15 seconds before considering
> quorum loss?

Hi!

Good question, but unfortunately I don't know the answer.
When I started with pacemaker, I thought there would be such a setting (ignore any network problems shorter than x seconds at the cluster level; RAs may do something different), but I didn't find one.
Here in our environment, reacting to network problems shorter than about 30 seconds makes no sense at all, because any resource operation will definitely exceed that time anyway...

Regards,
Ulrich

> 
> For reference, I am using pacemaker 1.14 with corosync.
> 
> 
> Thank you