<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 14, 2021 at 12:50 PM damiano giuliani <<a href="mailto:damianogiuliani87@gmail.com">damianogiuliani87@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi guys, thanks for helping,<div><br></div><div>it can be quite hard to troubleshoot unexpected failures, especially if they are not easily tracked in the pacemaker / system logs.</div><div>all servers are bare metal; i requested the BMC logs hoping they contain some information.</div><div>you guys said the sbd is too tight, can you explain that to me and suggest a valid configuration?</div></div></blockquote><div><br></div><div>There is no one-size-fits-all configuration. If you are experiencing issues where sbd isn't able to trigger</div><div>the hardware watchdog in time, you can consider setting the watchdog-timeout value to a higher</div><div>number and consequently stonith-watchdog-timeout to about double that time.</div><div>But you should try to understand why your watchdog triggers and make sure there aren't things systematically</div><div>going wrong - like e.g. sbd or corosync not being able to switch to rt-scheduling. 
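<br><br>As a rough sketch of the "about double" rule of thumb above: the 10-second value is only an assumed example, not a recommendation, and SBD_WATCHDOG_TIMEOUT is the name I'd expect for the setting in /etc/sysconfig/sbd, so please check the file your distribution ships. The snippet only prints the two values that belong together and doesn't change anything on the cluster.<br>

```shell
#!/bin/sh
# Assumed example value in seconds; pick what suits your hardware and load.
watchdog_timeout=10

# Rule of thumb from above: stonith-watchdog-timeout is about double that.
stonith_watchdog_timeout=$((watchdog_timeout * 2))

# Only print the matching settings; nothing is applied here.
echo "SBD_WATCHDOG_TIMEOUT=${watchdog_timeout}"
echo "pcs property set stonith-watchdog-timeout=${stonith_watchdog_timeout}s"
```

<br>The second printed line has the same form as the pcs command quoted further down in this thread, just with the larger value.<br>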
</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><br></div><div>ps: yesterday i resynced the old master (as a slave) and rejoined it to the cluster.</div><div>i found the following errors about sbd in /var/log/messages</div><div><br></div><div> grep -r sbd messages<br>Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child: Servant pcmk is outdated (age: 4)<br>Jul 12 14:58:59 ltaoperdbs02 sbd[6107]:  notice: inquisitor_child: Servant pcmk is healthy (age: 0)<br>Jul 13 20:42:14 ltaoperdbs02 sbd[185352]:  notice: main: Doing flush + writing 'b' to sysrq on timeout<br>Jul 13 20:42:14 ltaoperdbs02 sbd[185362]:      pcmk:   notice: servant_pcmk: Monitoring Pacemaker health<br>Jul 13 20:42:14 ltaoperdbs02 sbd[185363]:   cluster:   notice: servant_cluster: Monitoring unknown cluster health<br>Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:  notice: inquisitor_child: Servant cluster is healthy (age: 0)<br>Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:  notice: watchdog_init: Using watchdog device '/dev/watchdog'<br>Jul 13 20:42:19 ltaoperdbs02 sbd[185357]:  notice: inquisitor_child: Servant pcmk is healthy (age: 0)<br>Jul 13 20:53:57 ltaoperdbs02 sbd[188919]:    info: main: Verbose mode enabled.<br>Jul 13 20:53:57 ltaoperdbs02 sbd[188919]:    info: main: Watchdog enabled.<br>Jul 13 20:54:28 ltaoperdbs02 sbd[189176]:  notice: main: Doing flush + writing 'b' to sysrq on timeout<br>Jul 13 20:54:28 ltaoperdbs02 sbd[189178]:      pcmk:   notice: servant_pcmk: Monitoring Pacemaker health<br>Jul 13 20:54:28 ltaoperdbs02 sbd[189177]:  notice: inquisitor_child: Servant pcmk is healthy (age: 0)<br>Jul 13 20:54:28 ltaoperdbs02 sbd[189177]:   error: watchdog_init_fd: Cannot open watchdog device '/dev/watchdog': Device or resource busy (16)<br>Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: warning: cleanup_servant_by_pid: Servant for pcmk (pid: 189178) has terminated<br>Jul 13 20:54:28 
ltaoperdbs02 sbd[189177]: warning: cleanup_servant_by_pid: Servant for cluster (pid: 189179) has terminated<br>Jul 13 20:55:30 ltaoperdbs02 sbd[189484]:  notice: main: Doing flush + writing 'b' to sysrq on timeout<br>Jul 13 20:55:30 ltaoperdbs02 sbd[189484]:   error: watchdog_init_fd: Cannot open watchdog device '/dev/watchdog0': Device or resource busy (16)<br>Jul 13 20:55:30 ltaoperdbs02 sbd[189484]:   error: watchdog_init_fd: Cannot open watchdog device '/dev/watchdog': Device or resource busy (16)<br></div><div><br></div></div></blockquote><div>Something strange is going on that keeps sbd from opening the watchdog device.</div><div>Check that nobody else is sitting on the watchdog device - like systemd, watchdogd, or corosync</div><div>(iirc with a compile-time configuration), ... Tools like 'lsof' may be helpful for that if you catch the system in that state.</div><div>I'm guessing it doesn't always happen, because that should actually prevent a successful startup of</div><div>sbd and thus systemd shouldn't bring up pacemaker.</div><div>On the other hand, competing for /dev/watchdog shouldn't introduce unexpected watchdog-reboots,</div><div>as sbd will either fail to open the device and thus not come up, or open the device and keep it open</div><div>for the time being so that nobody else is able to open it.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>if i check the systemctl status sbd:</div><div><br></div><div>systemctl status sbd.service<br>● sbd.service - Shared-storage based fencing daemon<br>   Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor preset: disabled)<br>   Active: active (running) since Tue 2021-07-13 20:42:15 UTC; 13h ago<br>     Docs: man:sbd(8)<br>  Process: 185352 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid watch (code=exited, status=0/SUCCESS)<br> Main PID: 185357 (sbd)<br>   
CGroup: /system.slice/sbd.service<br>           ├─185357 sbd: inquisitor<br>           ├─185362 sbd: watcher: Pacemaker<br>           └─185363 sbd: watcher: Cluster<br><br>Jul 13 20:42:14 ltaoperdbs02 systemd[1]: Starting Shared-storage based fencing daemon...<br>Jul 13 20:42:14 ltaoperdbs02 sbd[185352]:   notice: main: Doing flush + writing 'b' to sysrq on timeout<br>Jul 13 20:42:14 ltaoperdbs02 sbd[185362]:       pcmk:   notice: servant_pcmk: Monitoring Pacemaker health<br>Jul 13 20:42:14 ltaoperdbs02 sbd[185363]:    cluster:   notice: servant_cluster: Monitoring unknown cluster health<br>Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:   notice: inquisitor_child: Servant cluster is healthy (age: 0)<br>Jul 13 20:42:15 ltaoperdbs02 sbd[185357]:   notice: watchdog_init: Using watchdog device '/dev/watchdog'<br>Jul 13 20:42:15 ltaoperdbs02 systemd[1]: Started Shared-storage based fencing daemon.<br>Jul 13 20:42:19 ltaoperdbs02 sbd[185357]:   notice: inquisitor_child: Servant pcmk is healthy (age: 0)<br></div><div><br></div></div></blockquote><div>So at least for sbd there don't seem to be systematic issues switching to rt-scheduling,</div><div>as otherwise we would see it complaining in the logs above.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>this is happening on all 3 nodes, any thoughts?</div><div><br></div><div>Thanks for helping, have a good day</div><div><br></div><div>Damiano</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 14, 2021 at 10:08 AM Klaus Wenninger <<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 14, 
2021 at 6:40 AM Andrei Borzenkov <<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 13.07.2021 23:09, damiano giuliani wrote:<br>

> Hi Klaus, thanks for helping, i'm quite lost because i can't find out the<br>

> cause.<br>

> i attached the corosync logs of all three nodes hoping you guys can find<br>

> and point me to something i can't see. i really appreciate the effort.<br>

> the old master log seems to be cut off at 00:38, so nothing interesting there.<br>

> the new master and the third slave logged what happened, but i can't<br>

> figure out why the old master was lost.<br>

> <br>

<br>

The reason it was lost is most likely outside of pacemaker. You need to<br>

check other logs on the node that was lost, maybe the BMC if this is bare<br>

metal or the hypervisor if it is a virtualized system.<br>

<br>

All that these logs say is that ltaoperdbs02 was lost from the point of<br>

view of two other nodes. It happened at the same time (around Jul 13<br>

00:40) which suggests ltaoperdbs02 had some problem indeed. Whether it<br>

was a software crash, hardware failure or network outage cannot be<br>

determined from these logs.<br>

<br></blockquote><div>What speaks against a pure network-outage is that we don't see</div><div>the corosync membership messages on the node that died.</div><div>Of course it is possible that the log wasn't flushed out before reboot</div><div>but usually I'd expect that there would be enough time.</div><div>If something kept corosync or sbd from being scheduled that would</div><div>explain why we don't see messages from these instances.</div><div>And that was why I was asking to check if in the setup corosync and</div><div>sbd are able to switch to rt-scheduling.</div><div>But of course that is all speculation and from what we know it can</div><div>be nearly anything from an administrative hard shutdown via</div><div>some BMC to whatever. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> something interesting could be the stonith logs of the new master and the<br>

> third slave:<br>

> <br>

> NEW MASTER:<br>

> grep stonith-ng /var/log/messages<br>

> Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Node ltaoperdbs02<br>

> state is now lost<br>

> Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Purged 1 peer<br>

> with id=1 and/or uname=ltaoperdbs02 from the membership cache<br>

> Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Client<br>

> crmd.228700.154a9e50 wants to fence (reboot) 'ltaoperdbs02' with device<br>

> '(any)'<br>

> Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Requesting peer<br>

> fencing (reboot) targeting ltaoperdbs02<br>

> Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Couldn't find<br>

> anyone to fence (reboot) ltaoperdbs02 with any device<br>

> Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]:  notice: Waiting 10s for<br>

> ltaoperdbs02 to self-fence (reboot) for client crmd.228700.f5d882d5<br>

> Jul 13 00:40:47 ltaoperdbs03 stonith-ng[228696]:  notice: Self-fencing<br>

> (reboot) by ltaoperdbs02 for<br>

> crmd.228700.f5d882d5-a804-4e20-bad4-7f16393d7748 assumed complete<br>

> Jul 13 00:40:47 ltaoperdbs03 stonith-ng[228696]:  notice: Operation<br>

> 'reboot' targeting ltaoperdbs02 on ltaoperdbs03 for<br>

> crmd.228700@ltaoperdbs03.f5d882d5: OK<br>

> <br>

> THIRD SLAVE:<br>

> grep stonith-ng /var/log/messages<br>

> Jul 13 00:40:37 ltaoperdbs04 stonith-ng[77928]:  notice: Node ltaoperdbs02<br>

> state is now lost<br>

> Jul 13 00:40:37 ltaoperdbs04 stonith-ng[77928]:  notice: Purged 1 peer with<br>

> id=1 and/or uname=ltaoperdbs02 from the membership cache<br>

> Jul 13 00:40:47 ltaoperdbs04 stonith-ng[77928]:  notice: Operation 'reboot'<br>

> targeting ltaoperdbs02 on ltaoperdbs03 for crmd.228700@ltaoperdbs03.f5d882d5:<br>

> OK<br>

> <br>

> i really appreciate the help and would like to know what you think about it.<br>

> <br>

> PS the stonith timeout should be set to 10s (pcs property set<br>

> stonith-watchdog-timeout=10s). do you suggest a different setting?<br>

> <br>

> On Tue, Jul 13, 2021 at 2:29 PM Klaus Wenninger <<br>

> <a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> wrote:<br>

> <br>

>><br>

>><br>

>> On Tue, Jul 13, 2021 at 1:43 PM damiano giuliani <<br>

>> <a href="mailto:damianogiuliani87@gmail.com" target="_blank">damianogiuliani87@gmail.com</a>> wrote:<br>

>><br>

>>> Hi guys,<br>

>>> i'm back with some PAF postgres cluster problems.<br>

>>> tonight the cluster fenced the master node and promoted the PAF resource<br>

>>> on a new node.<br>

>>> everything went fine, except that i really don't know why it happened.<br>

>>> so this morning i noticed the old master was fenced by sbd and a new<br>

>>> master was promoted; this happened tonight at 00.40.XX.<br>

>>> filtering the logs i can't find any reason why the old master was<br>

>>> fenced and the promotion of the new master started (which seems to have gone<br>

>>> perfectly). at this point i'm a bit lost because none of us is able to<br>

>>> find the real reason.<br>

>>> the cluster worked flawlessly for days with no issues, till now.<br>

>>> it is crucial for me to understand why this switch occurred.<br>

>>><br>

>>> i attached the current status, configuration and logs.<br>

>>> in the old master node log i can't find any reason;<br>

>>> on the new master the only thing is the fencing and the promotion.<br>

>>><br>

>>><br>

>>> PS:<br>

>>> could this be the reason for the fencing?<br>

>>><br>

>>> grep  -e sbd /var/log/messages<br>

>>> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child:<br>

>>> Servant pcmk is outdated (age: 4)<br>

>>> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]:  notice: inquisitor_child:<br>

>>> Servant pcmk is healthy (age: 0)<br>

>>><br>

>> That was yesterday afternoon and not 0:40 today in the morning.<br>

>> With the watchdog-timeout set to 5s this may have been tight though.<br>

>> Maybe check your other nodes for similar warnings - or check the<br>

>> compressed warnings.<br>

>> Maybe you can as well check the journal of sbd after start to see if it<br>

>> managed to run rt-scheduled.<br>

>> Is this a bare-metal-setup or running on some hypervisor?<br>

>> Unfortunately I'm not familiar enough with postgres to tell if there is anything<br>

>> interesting about the last<br>

>> messages shown before the suspected watchdog-reboot.<br>

>> Was there some administrative stuff done by ltauser before the reboot? If<br>

>> yes what?<br>

>><br>

>> Regards,<br>

>> Klaus<br>

>><br>

>><br>

>>><br>

>>> Any thoughts and help are really appreciated.<br>

>>><br>

>>> Damiano<br>

>>> _______________________________________________<br>

>>> Manage your subscription:<br>

>>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>

>>><br>

>>> ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a><br>

>>><br>


>><br>

> <br>

> <br>


> <br>

<br>


<br>

</blockquote></div></div>


</blockquote></div>


</blockquote></div></div>