[ClusterLabs] unexpected fenced node and promotion of the new master PAF - postgres

Klaus Wenninger kwenning at redhat.com
Tue Jul 13 08:29:38 EDT 2021


On Tue, Jul 13, 2021 at 1:43 PM damiano giuliani <
damianogiuliani87 at gmail.com> wrote:

> Hi guys,
> I'm back with some PAF postgres cluster problems.
> Tonight the cluster fenced the master node and promoted the PAF resource to
> a new node.
> Everything went fine, except that I really don't know why it happened.
> This morning I noticed the old master had been fenced by sbd and a new
> master had been promoted; this happened tonight at 00.40.XX.
> Filtering the logs, I can't find any reason why the old master was
> fenced and the promotion of the new master started (which seems to have
> gone perfectly). At this point I'm a bit lost, because none of us is able
> to find the real reason.
> The cluster worked flawlessly for days with no issues, until now.
> It is crucial for me to understand why this switch occurred.
>
> I attached the current status, configuration, and logs.
> In the old master node's log I can't find any reason.
> On the new master the only things logged are the fencing and the promotion.
>
>
> PS:
> Could this be the reason for the fencing?
>
> grep  -e sbd /var/log/messages
> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child: Servant
> pcmk is outdated (age: 4)
> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]:  notice: inquisitor_child: Servant
> pcmk is healthy (age: 0)
>
That was yesterday afternoon and not 0:40 today in the morning.
With the watchdog-timeout set to 5s this may have been tight though.
Maybe check your other nodes for similar warnings - or check the compressed
warnings.
Maybe you can as well check the journal of sbd after start to see if it
managed to run rt-scheduled.
Is this a bare-metal setup or is it running on some hypervisor?
Unfortunately I'm not familiar enough with postgres to tell if there is
anything interesting about the last messages shown before the suspected
watchdog reboot.
Was there some administrative stuff done by ltauser before the reboot? If
yes what?
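
A minimal sketch of the log check on the other nodes, assuming the logs
live in /var/log/messages and its rotated copies, some of which may be
gzip-compressed. The helper name sbd_warnings is hypothetical, and the
pattern is only derived from the two sbd lines quoted above:

```shell
# Hypothetical helper: print sbd warning lines from plain or gzip-compressed
# log files. Assumes the syslog format of the quoted lines,
# e.g. "sbd[6107]: warning: ...".
sbd_warnings() {
    for f in "$@"; do
        # Skip files that do not exist or are not readable.
        [ -r "$f" ] || continue
        # gzip -cdf decompresses gzip files and passes plain files through
        # unchanged, so rotated and current logs are handled alike.
        gzip -cdf "$f" | grep -E 'sbd\[[0-9]+\]: +warning' || :
    done
    return 0
}
```

It could be run on each node as `sbd_warnings /var/log/messages*`. On
systemd hosts, `journalctl -u sbd -b` should show what sbd logged after
start (assuming the unit is named sbd), and `systemd-detect-virt` reports
whether the node is running under a hypervisor.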

Regards,
Klaus


>
> Any thoughts and help are really appreciated.
>
> Damiano