[ClusterLabs] Help with PostgreSQL Automatic Failover demotion

Fri Feb 18 18:59:31 EST 2022

Also, there is a way to tell the cluster to clean up failures automatically: the "failure-timeout" resource option.
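
For illustration only, here is a rough sketch of how this could look in the CIB, placed inside the resource definition. The ids and the 10min value are made up; migration-threshold=5 is the value mentioned in the quoted message. With something like this, a recorded failure should be ignored once 10 minutes pass without a new one.

```xml
<!-- Hypothetical fragment: ids and the failure-timeout value are illustrative -->
<meta_attributes id="ha-db-meta_attributes">
  <!-- Move the resource away from the node once failcount reaches 5 -->
  <nvpair id="ha-db-meta-migration-threshold"
          name="migration-threshold" value="5"/>
  <!-- Expire old failures automatically instead of requiring a manual cleanup -->
  <nvpair id="ha-db-meta-failure-timeout"
          name="failure-timeout" value="10min"/>
</meta_attributes>
```
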
Best regards,
Strahil Nikolov

On Sat, Feb 19, 2022 at 1:52, Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote:

Hello,

On Fri, 18 Feb 2022 21:44:58 +0000
"Larry G. Mills" <lgmills at fnal.gov> wrote:

> ... This happened again recently, and the running primary DB was demoted and
> then re-promoted to be the running primary. What I'm having trouble
> understanding is why the running Master/primary DB was demoted.  After the
> monitor operation timed out, the failcount for the ha-db resource was still
> less than the configured "migration-threshold", which is set to 5.

Because "migration-threshold" is the limit before the resource is moved away
from the node.

As long as your failcount is less than "migration-threshold" and the failure
is not fatal, the cluster will keep the resource on the same node and try to
"recover" it by running a full restart: demote -> stop -> start -> promote.

Since 2.0, the recovery action can instead be a plain demote -> promote, without
the stop and start. See the "on-fail" operation property (value "demote") and
the details about it below the table:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#operation-properties
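
As a rough, untested sketch of what this could look like in the CIB: the op id, interval and timeout below are made up and would need to match your actual monitor operation for the promoted role. As far as I recall from the documentation, "demote" is only accepted for promote actions and for recurring monitors of the promoted role, it needs Pacemaker 2.0.5 or later, and versions before 2.1 spell the role "Master" rather than "Promoted".

```xml
<!-- Hypothetical fragment: id, interval and timeout are illustrative -->
<operations>
  <!-- On a failed monitor of the promoted instance, demote it
       rather than running the full demote/stop/start/promote recovery -->
  <op id="ha-db-monitor-promoted" name="monitor"
      interval="15s" timeout="10s"
      role="Promoted" on-fail="demote"/>
</operations>
```

Whether the instance is promoted again on the same node afterwards depends on the promotion scores set by the resource agent.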

Regards,