[ClusterLabs] unexpected fenced node and promotion of the new master PAF - postgres

damiano giuliani damianogiuliani87 at gmail.com
Wed Jul 14 06:49:40 EDT 2021


Hi guys, thanks for helping,

Troubleshooting unexpected failures can be quite hard, especially when they
are not easily tracked in the Pacemaker / system logs.
All servers are bare metal; I requested the BMC logs hoping they contain some
information.
You said the sbd timing is too tight: can you explain that and suggest a valid
configuration?
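
For reference, this is roughly what I understand a less tight watchdog-only
(diskless) sbd setup to look like; it is only a sketch from my own reading,
and the example values are exactly what I would like you to confirm:

   # /etc/sysconfig/sbd (same on every node)
   SBD_WATCHDOG_DEV=/dev/watchdog
   SBD_WATCHDOG_TIMEOUT=15      # instead of the current 5s
   SBD_PACEMAKER=yes
   SBD_DELAY_START=no

   # cluster-wide property, commonly about twice SBD_WATCHDOG_TIMEOUT
   # as far as I have read
   pcs property set stonith-watchdog-timeout=30s

Is that the direction you would suggest, or are other settings involved?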

PS: yesterday I resynced the old master (as a slave) and rejoined it into the
cluster.
I found the following errors in /var/log/messages about sbd:

 grep -r sbd messages
Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child: Servant
pcmk is outdated (age: 4)
Jul 12 14:58:59 ltaoperdbs02 sbd[6107]:  notice: inquisitor_child: Servant
pcmk is healthy (age: 0)
Jul 13 20:42:14 ltaoperdbs02 sbd[185352]:  notice: main: Doing flush +
writing 'b' to sysrq on timeout
Jul 13 20:42:14 ltaoperdbs02 sbd[185362]:      pcmk:   notice:
servant_pcmk: Monitoring Pacemaker health
Jul 13 20:42:14 ltaoperdbs02 sbd[185363]:   cluster:   notice:
servant_cluster: Monitoring unknown cluster health
Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:  notice: inquisitor_child:
Servant cluster is healthy (age: 0)
Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:  notice: watchdog_init: Using
watchdog device '/dev/watchdog'
Jul 13 20:42:19 ltaoperdbs02 sbd[185357]:  notice: inquisitor_child:
Servant pcmk is healthy (age: 0)
Jul 13 20:53:57 ltaoperdbs02 sbd[188919]:    info: main: Verbose mode
enabled.
Jul 13 20:53:57 ltaoperdbs02 sbd[188919]:    info: main: Watchdog enabled.
Jul 13 20:54:28 ltaoperdbs02 sbd[189176]:  notice: main: Doing flush +
writing 'b' to sysrq on timeout
Jul 13 20:54:28 ltaoperdbs02 sbd[189178]:      pcmk:   notice:
servant_pcmk: Monitoring Pacemaker health
Jul 13 20:54:28 ltaoperdbs02 sbd[189177]:  notice: inquisitor_child:
Servant pcmk is healthy (age: 0)
Jul 13 20:54:28 ltaoperdbs02 sbd[189177]:   error: watchdog_init_fd: Cannot
open watchdog device '/dev/watchdog': Device or resource busy (16)
Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: warning: cleanup_servant_by_pid:
Servant for pcmk (pid: 189178) has terminated
Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: warning: cleanup_servant_by_pid:
Servant for cluster (pid: 189179) has terminated
Jul 13 20:55:30 ltaoperdbs02 sbd[189484]:  notice: main: Doing flush +
writing 'b' to sysrq on timeout
Jul 13 20:55:30 ltaoperdbs02 sbd[189484]:   error: watchdog_init_fd: Cannot
open watchdog device '/dev/watchdog0': Device or resource busy (16)
Jul 13 20:55:30 ltaoperdbs02 sbd[189484]:   error: watchdog_init_fd: Cannot
open watchdog device '/dev/watchdog': Device or resource busy (16)
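
As far as I understand, those "Device or resource busy" errors could simply
mean that another sbd instance was started while the daemon from 20:42 still
held the watchdog (that is only my guess). Is this the right way to check
which process holds the device (plain lsof/fuser, nothing sbd-specific)?

   lsof /dev/watchdog /dev/watchdog0
   # or
   fuser -v /dev/watchdog

   # there should be exactly one inquisitor per node:
   ps -ef | grep '[s]bd: inquisitor'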

If I check systemctl status sbd:

systemctl status sbd.service
● sbd.service - Shared-storage based fencing daemon
   Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor
preset: disabled)
   Active: active (running) since Tue 2021-07-13 20:42:15 UTC; 13h ago
     Docs: man:sbd(8)
  Process: 185352 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid
watch (code=exited, status=0/SUCCESS)
 Main PID: 185357 (sbd)
   CGroup: /system.slice/sbd.service
           ├─185357 sbd: inquisitor
           ├─185362 sbd: watcher: Pacemaker
           └─185363 sbd: watcher: Cluster

Jul 13 20:42:14 ltaoperdbs02 systemd[1]: Starting Shared-storage based
fencing daemon...
Jul 13 20:42:14 ltaoperdbs02 sbd[185352]:   notice: main: Doing flush +
writing 'b' to sysrq on timeout
Jul 13 20:42:14 ltaoperdbs02 sbd[185362]:       pcmk:   notice:
servant_pcmk: Monitoring Pacemaker health
Jul 13 20:42:14 ltaoperdbs02 sbd[185363]:    cluster:   notice:
servant_cluster: Monitoring unknown cluster health
Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:   notice: inquisitor_child:
Servant cluster is healthy (age: 0)
Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:   notice: watchdog_init: Using
watchdog device '/dev/watchdog'
Jul 13 20:42:15 ltaoperdbs02 systemd[1]: Started Shared-storage based
fencing daemon.
Jul 13 20:42:19 ltaoperdbs02 sbd[185357]:   notice: inquisitor_child:
Servant pcmk is healthy (age: 0)

This is happening on all 3 nodes, any thoughts?
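
Regarding your question about whether corosync and sbd manage to switch to
rt-scheduling: is checking it like this enough (just generic ps/chrt
commands, I am not sure this is what you meant)?

   # CLS should be RR or FF and RTPRIO non-zero for realtime tasks
   ps -eo pid,cls,rtprio,comm | grep -E 'corosync|sbd'

   # or per process:
   chrt -p $(pidof corosync)
   chrt -p $(cat /var/run/sbd.pid)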

Thanks for helping, have a good day.

Damiano


On Wed, 14 Jul 2021 at 10:08, Klaus Wenninger <
kwenning at redhat.com> wrote:

>
>
> On Wed, Jul 14, 2021 at 6:40 AM Andrei Borzenkov <arvidjaar at gmail.com>
> wrote:
>
>> On 13.07.2021 23:09, damiano giuliani wrote:
>> > Hi Klaus, thanks for helping, I'm quite lost because I can't find out the
>> > causes.
>> > I attached the corosync logs of all three nodes hoping you guys can spot
>> > something I can't see and give me a hint. I really appreciate the effort.
>> > The old master's log seems to be cut off at 00:38, so nothing interesting there.
>> > The new master and the third slave logged what happened, but I can't
>> > figure out why the old master was lost.
>> >
>>
>> The reason it was lost is most likely outside of pacemaker. You need to
>> check other logs on the node that was lost, maybe the BMC if this is bare
>> metal or the hypervisor if it is a virtualized system.
>>
>> All that these logs say is that ltaoperdbs02 was lost from the point of
>> view of the two other nodes. It happened at the same time (around Jul 13
>> 00:40), which suggests ltaoperdbs02 did indeed have some problem. Whether it
>> was a software crash, hardware failure or network outage cannot be
>> determined from these logs.
>>
> What speaks against a pure network outage is that we don't see
> the corosync membership messages on the node that died.
> Of course it is possible that the log wasn't flushed out before the reboot,
> but usually I'd expect that there would be enough time.
> If something kept corosync or sbd from being scheduled, that would
> explain why we don't see messages from these instances.
> And that was why I was asking you to check whether corosync and
> sbd are able to switch to rt-scheduling in this setup.
> But of course that is all speculation, and from what we know it could
> be anything from an administrative hard shutdown via
> some BMC to whatever.
>
>>
>> > Something interesting could be the stonith logs of the new master and the
>> > third slave:
>> >
>> > NEW MASTER:
>> > grep stonith-ng /var/log/messages
>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Node ltaoperdbs02
>> > state is now lost
>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Purged 1 peer
>> > with id=1 and/or uname=ltaoperdbs02 from the membership cache
>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Client
>> > crmd.228700.154a9e50 wants to fence (reboot) 'ltaoperdbs02' with device
>> > '(any)'
>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Requesting peer
>> > fencing (reboot) targeting ltaoperdbs02
>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Couldn't find
>> > anyone to fence (reboot) ltaoperdbs02 with any device
>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Waiting 10s for
>> > ltaoperdbs02 to self-fence (reboot) for client crmd.228700.f5d882d5
>> > Jul 13 00:40:47 ltaoperdbs03 stonith-ng[228696]:  notice: Self-fencing
>> > (reboot) by ltaoperdbs02 for
>> > crmd.228700.f5d882d5-a804-4e20-bad4-7f16393d7748 assumed complete
>> > Jul 13 00:40:47 ltaoperdbs03 stonith-ng[228696]:  notice: Operation
>> > 'reboot' targeting ltaoperdbs02 on ltaoperdbs03 for
>> > crmd.228700 at ltaoperdbs03.f5d882d5: OK
>> >
>> > THIRD SLAVE:
>> > grep stonith-ng /var/log/messages
>> > Jul 13 00:40:37 ltaoperdbs04 stonith-ng[77928]:  notice: Node ltaoperdbs02
>> > state is now lost
>> > Jul 13 00:40:37 ltaoperdbs04 stonith-ng[77928]:  notice: Purged 1 peer with
>> > id=1 and/or uname=ltaoperdbs02 from the membership cache
>> > Jul 13 00:40:47 ltaoperdbs04 stonith-ng[77928]:  notice: Operation 'reboot'
>> > targeting ltaoperdbs02 on ltaoperdbs03 for crmd.228700 at ltaoperdbs03.f5d882d5:
>> > OK
>> >
>> > I really appreciate the help and your thoughts about it.
>> >
>> > PS: the stonith-watchdog-timeout should be set to 10s (pcs property set
>> > stonith-watchdog-timeout=10s). Do you suggest a different setting?
>> >
>> > On Tue, 13 Jul 2021 at 14:29, Klaus Wenninger <
>> > kwenning at redhat.com> wrote:
>> >
>> >>
>> >>
>> >> On Tue, Jul 13, 2021 at 1:43 PM damiano giuliani <
>> >> damianogiuliani87 at gmail.com> wrote:
>> >>
>> >>> Hi guys,
>> >>> I'm back with some PAF Postgres cluster problems.
>> >>> Tonight the cluster fenced the master node and promoted the PAF resource
>> >>> on a new node.
>> >>> Everything went fine, but I really don't know why it happened.
>> >>> This morning I noticed the old master had been fenced by sbd and a new
>> >>> master was promoted; this happened tonight at 00.40.XX.
>> >>> Filtering the logs, I can't find any reason why the old master was
>> >>> fenced or what started the promotion of the new master (which seems to
>> >>> have gone perfectly). I'm a bit lost, because none of us is able to
>> >>> find the real reason.
>> >>> The cluster worked flawlessly for days with no issues, until now.
>> >>> It is crucial for me to understand why this switch occurred.
>> >>>
>> >>> I attached the current status, configuration and logs.
>> >>> In the old master node's log I can't find any reason;
>> >>> on the new master the only thing is the fencing and the promotion.
>> >>>
>> >>>
>> >>> PS:
>> >>> could this be the reason for the fencing?
>> >>>
>> >>> grep  -e sbd /var/log/messages
>> >>> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child:
>> >>> Servant pcmk is outdated (age: 4)
>> >>> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]:  notice: inquisitor_child:
>> >>> Servant pcmk is healthy (age: 0)
>> >>>
>> >> That was yesterday afternoon, not 00:40 this morning.
>> >> With the watchdog-timeout set to 5s this may have been tight, though.
>> >> Maybe check your other nodes for similar warnings - or check the
>> >> compressed warnings.
>> >> Maybe you can also check the journal of sbd after start to see whether it
>> >> managed to run rt-scheduled.
>> >> Is this a bare-metal setup or is it running on some hypervisor?
>> >> Unfortunately I'm not familiar enough with Postgres to tell if there is
>> >> anything interesting about the last messages shown before the suspected
>> >> watchdog reboot.
>> >> Was there some administrative work done by ltauser before the reboot? If
>> >> yes, what?
>> >>
>> >> Regards,
>> >> Klaus
>> >>
>> >>
>> >>>
>> >>> Any thoughts and help are really appreciated.
>> >>>
>> >>> Damiano
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>

