[ClusterLabs] unexpected fenced node and promotion of the new master PAF - postgres
kwenning at redhat.com
Tue Jul 13 08:29:38 EDT 2021
On Tue, Jul 13, 2021 at 1:43 PM damiano giuliani <
damianogiuliani87 at gmail.com> wrote:
> Hi guys,
> I'm back with some PAF postgres cluster problems.
> Tonight the cluster fenced the master node and promoted the PAF resource
> on a new node.
> Everything went fine, except that I really don't know why it happened.
> This morning I noticed the old master had been fenced by sbd and a new
> master had been promoted; this happened tonight at 00.40.XX.
> Filtering the logs, I can't find any reason why the old master was
> fenced and the promotion of the new master started (which seems to have
> gone perfectly). At this point I'm a bit lost, because none of us is
> able to find the real reason.
> The cluster worked flawlessly for days with no issues, until now.
> It is crucial for me to understand why this switch occurred.
> I attached the current status, configuration and logs.
> In the old master node's log I can't find any reason;
> on the new master the only thing is the fencing and the promotion.
> Could this be the reason for the fencing?
> grep -e sbd /var/log/messages
> Jul 12 14:58:59 ltaoperdbs02 sbd: warning: inquisitor_child: Servant
> pcmk is outdated (age: 4)
> Jul 12 14:58:59 ltaoperdbs02 sbd: notice: inquisitor_child: Servant
> pcmk is healthy (age: 0)
That was yesterday afternoon, not 0:40 this morning.
With the watchdog-timeout set to 5s this may have been tight though.
Maybe check your other nodes for similar warnings - or check the compressed
You can also check the sbd journal after startup to see whether it
managed to run with real-time scheduling.
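
The check for similar warnings on the other nodes can be sketched roughly as below. This is only an illustration, not an sbd tool: the function name `tight_servant_warnings`, the `margin` parameter and the regular expression are made up for this example, and it assumes that the logged "age" is in seconds and can be compared directly with the 5s watchdog-timeout mentioned above. The sample lines are the two quoted above, joined back onto single lines.

```python
import re

# Matches sbd servant-state lines of the form quoted above, e.g.
#   Jul 12 14:58:59 ltaoperdbs02 sbd: warning: inquisitor_child: Servant pcmk is outdated (age: 4)
# An optional "[pid]" after "sbd" is tolerated (assumption about syslog format).
SBD_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"sbd(?:\[\d+\])?:\s+(?P<level>\w+):\s+inquisitor_child:\s+"
    r"Servant\s+(?P<servant>\S+)\s+is\s+(?P<state>outdated|healthy)\s+"
    r"\(age:\s*(?P<age>\d+)\)"
)


def tight_servant_warnings(lines, watchdog_timeout=5, margin=1):
    """Return (stamp, host, servant, age) for every 'outdated' servant line
    whose age is within `margin` seconds of the watchdog timeout.

    Both thresholds are assumptions chosen for illustration.
    """
    hits = []
    for line in lines:
        m = SBD_LINE.match(line.strip())
        if m and m["state"] == "outdated":
            age = int(m["age"])
            if age >= watchdog_timeout - margin:
                hits.append((m["stamp"], m["host"], m["servant"], age))
    return hits


sample = [
    "Jul 12 14:58:59 ltaoperdbs02 sbd: warning: inquisitor_child: "
    "Servant pcmk is outdated (age: 4)",
    "Jul 12 14:58:59 ltaoperdbs02 sbd: notice: inquisitor_child: "
    "Servant pcmk is healthy (age: 0)",
]

for hit in tight_servant_warnings(sample):
    print(hit)
```

To run it against a real log, pass the lines of the file instead of `sample`, for instance `tight_servant_warnings(open("/var/log/messages"))`.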
Is this a bare-metal setup or running on some hypervisor?
Unfortunately I don't know postgres well enough to tell whether there is
anything interesting in the last messages shown before the suspected
watchdog reboot.
Was there some administrative stuff done by ltauser before the reboot? If
> Any thoughts and help are really appreciated.