[ClusterLabs] Postgres Cluster PAF problems

Wed Jun 30 08:09:09 EDT 2021

Hi,

On Wed, 30 Jun 2021 13:44:28 +0200
damiano giuliani <damianogiuliani87 at gmail.com> wrote:

> It looks like some applications lost their connection to the master, losing
> some updates/inserts.
> 
> I found the cause in the logs: the psqld-monitor timed out after 10000ms and
> the master resource was demoted, the instance stopped and then promoted to
> master again, causing a few seconds of service disruption (no master during
> the described process).

This is the normal behaviour after a timeout.

I'm surprised you lost some inserts/updates though. Maybe this is related to
your PostgreSQL setup (synchronous_commit? fsync?)
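
As a rough way to review those two settings, here is a minimal Python sketch.
The function name and the input dict are hypothetical; you would fill the dict
by hand from the output of "SHOW fsync;", "SHOW synchronous_commit;" and
"SHOW synchronous_standby_names;". It only flags values that weaken durability
and does not prove they caused the loss you saw.

```python
def durability_warnings(settings):
    """Return warnings for settings that weaken durability.

    `settings` is a hypothetical dict of PostgreSQL parameter values,
    e.g. copied by hand from the output of SHOW commands.
    """
    warnings = []

    # With fsync off, a crash can leave the data files corrupted.
    if settings.get("fsync", "on") == "off":
        warnings.append("fsync is off")

    # With synchronous_commit off, the most recent commits can be
    # lost if the server crashes before the WAL is flushed.
    sync_commit = settings.get("synchronous_commit", "on")
    if sync_commit == "off":
        warnings.append("synchronous_commit is off")

    # Commits only wait for a standby when synchronous replication is
    # configured and synchronous_commit is neither "off" nor "local".
    standbys = settings.get("synchronous_standby_names", "").strip()
    if not standbys or sync_commit in ("off", "local"):
        warnings.append("commits do not wait for any standby")

    return warnings


if __name__ == "__main__":
    example = {
        "fsync": "on",
        "synchronous_commit": "on",
        "synchronous_standby_names": "",
    }
    for line in durability_warnings(example):
        print(line)
```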

> I noticed a recurring message:
> Update score of "ltaoperdbsXX" from 990 to 1000 because of a change in the
> replication lag
> Does this point to some kind of network lag?

This is not related. Scores are set based on each standby's lag: one standby
received some data faster than another one, nothing more. But I admit this is
quite chatty in your logs...
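
To illustrate the ranking idea, here is a minimal Python sketch. It is not the
actual PAF code: the function name, the lag units and the 10-point step are
assumptions, the step being inferred only from the 990 and 1000 values in the
message you quoted.

```python
def assign_scores(lags, top=1000, step=10):
    """Rank standbys by replication lag and give each one a score.

    `lags` maps a standby name to its lag (any unit, lower is better).
    The least lagging standby gets `top`, the next one `top - step`, etc.
    Both values are assumptions for illustration only.
    """
    # Sort by lag, then by name so that ties are resolved predictably.
    ranked = sorted(lags, key=lambda name: (lags[name], name))
    return {name: top - i * step for i, name in enumerate(ranked)}


if __name__ == "__main__":
    # Hypothetical node names and lag values.
    before = assign_scores({"srv2": 0, "srv3": 120})
    after = assign_scores({"srv2": 300, "srv3": 120})
    # srv3 moves from the second rank to the first one simply because
    # it received data faster than srv2 during the second check.
    print(before["srv3"], "->", after["srv3"])
```

A score moving between the two values only means the standbys swapped ranks
between two monitor checks.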

> I attached the log in case it is useful to dig further.
> If someone can point me in the right direction, it would be really
> appreciated.

Unfortunately, there's nothing in your log that could explain the timeout.

Regards,