<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 14, 2021 at 3:28 PM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">>>> damiano giuliani <<a href="mailto:damianogiuliani87@gmail.com" target="_blank">damianogiuliani87@gmail.com</a>> wrote on 14.07.2021 at 12:49<br>
in message<br>
<CAG=<a href="mailto:zYNOjRmKC5az8nz2r82CRabJ3Z%2BGEnuW_8dE3UJFu1hD1hA@mail.gmail.com" target="_blank">zYNOjRmKC5az8nz2r82CRabJ3Z+GEnuW_8dE3UJFu1hD1hA@mail.gmail.com</a>>:<br>
> Hi guys, thanks for helping,<br>
> <br>
> it could be quite hard to troubleshoot unexpected failures, especially if<br>
> they are not easily tracked in the pacemaker / system logs.<br>
> all servers are bare metal; I requested the BMC logs hoping they contain<br>
> some information.<br>
> you guys said the sbd timing is too tight - can you explain and suggest a<br>
> valid configuration?<br>
<br>
You must answer these questions for yourself:<br>
* What is the maximum read/write delay for your sbd device that still means<br>
the storage is working? Before assuming something like 1s also think of<br>
firmware updates, bad disk sectors, etc.<br></blockquote><div>stonith-watchdog-timeout is set and there is no 'Servant starting for device' log - so no poison-pill fencing, I guess </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
* Then configure the sbd parameters accordingly<br>
* Finally configure the stonith timeout to be not less than the time sbd needs<br>
in worst case to down the machine. If the cluster starts recovering while the<br>
other node is not down already, you may have data corruption or other<br>
failures.<br></blockquote><div>yep - 2 * watchdog-timeout should be a good pick in this case </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
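The sizing rule above can be sketched as shell arithmetic. This is a minimal sketch, not a verified configuration for this cluster: the 10s value is an example (raised from the tight 5s default mentioned later in the thread), and the parameter names come from sbd(8) / /etc/sysconfig/sbd and pcs.

```shell
# Minimal sketch of the timeout arithmetic, assuming watchdog-only sbd.
# SBD_WATCHDOG_TIMEOUT lives in /etc/sysconfig/sbd; 10 is an example value,
# chosen with firmware updates / bad sectors / scheduling stalls in mind.
SBD_WATCHDOG_TIMEOUT=10
# The cluster must wait at least as long as self-fencing can take in the
# worst case; 2 * watchdog-timeout, as suggested above, leaves headroom.
STONITH_WATCHDOG_TIMEOUT=$((2 * SBD_WATCHDOG_TIMEOUT))
echo "pcs property set stonith-watchdog-timeout=${STONITH_WATCHDOG_TIMEOUT}s"
```

If the cluster starts recovery before this interval has elapsed, the old node may still be alive, which is exactly the data-corruption scenario described above.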
<br>
> <br>
> PS: yesterday I resynced the old master (to slave) and rejoined it into<br>
> the cluster.<br>
> I found the following errors in /var/log/messages about sbd:<br>
> <br>
> grep -r sbd messages<br>
> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child: Servant<br>
> pcmk is outdated (age: 4)<br>
> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: notice: inquisitor_child: Servant<br>
> pcmk is healthy (age: 0)<br>
> Jul 13 20:42:14 ltaoperdbs02 sbd[185352]: notice: main: Doing flush +<br>
> writing 'b' to sysrq on timeout<br>
> Jul 13 20:42:14 ltaoperdbs02 sbd[185362]: pcmk: notice:<br>
> servant_pcmk: Monitoring Pacemaker health<br>
> Jul 13 20:42:14 ltaoperdbs02 sbd[185363]: cluster: notice:<br>
> servant_cluster: Monitoring unknown cluster health<br>
> Jul 13 20:42:15 ltaoperdbs02 sbd[185357]: notice: inquisitor_child:<br>
> Servant cluster is healthy (age: 0)<br>
> Jul 13 20:42:15 ltaoperdbs02 sbd[185357]: notice: watchdog_init: Using<br>
> watchdog device '/dev/watchdog'<br>
> Jul 13 20:42:19 ltaoperdbs02 sbd[185357]: notice: inquisitor_child:<br>
> Servant pcmk is healthy (age: 0)<br>
> Jul 13 20:53:57 ltaoperdbs02 sbd[188919]: info: main: Verbose mode<br>
> enabled.<br>
> Jul 13 20:53:57 ltaoperdbs02 sbd[188919]: info: main: Watchdog enabled.<br>
> Jul 13 20:54:28 ltaoperdbs02 sbd[189176]: notice: main: Doing flush +<br>
> writing 'b' to sysrq on timeout<br>
> Jul 13 20:54:28 ltaoperdbs02 sbd[189178]: pcmk: notice:<br>
> servant_pcmk: Monitoring Pacemaker health<br>
> Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: notice: inquisitor_child:<br>
> Servant pcmk is healthy (age: 0)<br>
> Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: error: watchdog_init_fd: Cannot<br>
> open watchdog device '/dev/watchdog': Device or resource busy (16)<br>
<br>
Maybe also debug the watchdog device.<br>
<br>
<br>
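One way to follow that suggestion, sketched below. The kernel allows only a single opener of a watchdog device, so the 'Device or resource busy (16)' errors above usually mean a previous sbd inquisitor, or another watchdog consumer (e.g. systemd with RuntimeWatchdogSec set), still holds it. wdctl is from util-linux and fuser from psmisc; adjust device names to match your system.

```shell
# Sketch: find out why sbd got 'Device or resource busy (16)' on /dev/watchdog.
# A watchdog device accepts only one opener at a time.
wdctl /dev/watchdog0                    # driver identity, timeout, and flags
fuser -v /dev/watchdog /dev/watchdog0   # which PID currently holds the device
ps -o pid,ppid,cmd -C sbd               # any stale inquisitor from an earlier start?
```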
> Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: warning: cleanup_servant_by_pid:<br>
> Servant for pcmk (pid: 189178) has terminated<br>
> Jul 13 20:54:28 ltaoperdbs02 sbd[189177]: warning: cleanup_servant_by_pid:<br>
> Servant for cluster (pid: 189179) has terminated<br>
> Jul 13 20:55:30 ltaoperdbs02 sbd[189484]: notice: main: Doing flush +<br>
> writing 'b' to sysrq on timeout<br>
> Jul 13 20:55:30 ltaoperdbs02 sbd[189484]: error: watchdog_init_fd: Cannot<br>
> open watchdog device '/dev/watchdog0': Device or resource busy (16)<br>
> Jul 13 20:55:30 ltaoperdbs02 sbd[189484]: error: watchdog_init_fd: Cannot<br>
> open watchdog device '/dev/watchdog': Device or resource busy (16)<br>
> <br>
> if I check the systemctl status of sbd:<br>
> <br>
> systemctl status sbd.service<br>
> ● sbd.service - Shared-storage based fencing daemon<br>
> Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor<br>
> preset: disabled)<br>
> Active: active (running) since Tue 2021-07-13 20:42:15 UTC; 13h ago<br>
> Docs: man:sbd(8)<br>
> Process: 185352 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid<br>
> watch (code=exited, status=0/SUCCESS)<br>
> Main PID: 185357 (sbd)<br>
> CGroup: /system.slice/sbd.service<br>
> ├─185357 sbd: inquisitor<br>
> ├─185362 sbd: watcher: Pacemaker<br>
> └─185363 sbd: watcher: Cluster<br>
> <br>
> Jul 13 20:42:14 ltaoperdbs02 systemd[1]: Starting Shared-storage based<br>
> fencing daemon...<br>
> Jul 13 20:42:14 ltaoperdbs02 sbd[185352]: notice: main: Doing flush +<br>
> writing 'b' to sysrq on timeout<br>
> Jul 13 20:42:14 ltaoperdbs02 sbd[185362]: pcmk: notice:<br>
> servant_pcmk: Monitoring Pacemaker health<br>
> Jul 13 20:42:14 ltaoperdbs02 sbd[185363]: cluster: notice:<br>
> servant_cluster: Monitoring unknown cluster health<br>
> Jul 13 20:42:15 ltaoperdbs02 sbd[185357]: notice: inquisitor_child:<br>
> Servant cluster is healthy (age: 0)<br>
> Jul 13 20:42:15 ltaoperdbs02 sbd[185357]: notice: watchdog_init: Using<br>
> watchdog device '/dev/watchdog'<br>
> Jul 13 20:42:15 ltaoperdbs02 systemd[1]: Started Shared-storage based<br>
> fencing daemon.<br>
> Jul 13 20:42:19 ltaoperdbs02 sbd[185357]: notice: inquisitor_child:<br>
> Servant pcmk is healthy (age: 0)<br>
> <br>
> this is happening on all 3 nodes, any thoughts?<br>
<br>
Bad watchdog? <br>
<br>
> <br>
> Thanks for helping, have a good day<br>
> <br>
> Damiano<br>
> <br>
> <br>
> On Wed, Jul 14, 2021 at 10:08 AM Klaus Wenninger <<br>
> <a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> wrote:<br>
> <br>
>><br>
>><br>
>> On Wed, Jul 14, 2021 at 6:40 AM Andrei Borzenkov <<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>><br>
>> wrote:<br>
>><br>
>>> On 13.07.2021 23:09, damiano giuliani wrote:<br>
>>> > Hi Klaus, thanks for helping. I'm quite lost because I can't find the<br>
>>> > cause.<br>
>>> > I attached the corosync logs of all three nodes, hoping you guys can<br>
>>> > find and point out something I can't see. I really appreciate the effort.<br>
>>> > The old master's log seems cut off at 00:38, so nothing interesting there.<br>
>>> > The new master and the third slave logged what happened, but I can't<br>
>>> > figure out why the old master was lost.<br>
>>> ><br>
>>><br>
>>> The reason it was lost is most likely outside of pacemaker. You need to<br>
>>> check other logs on the node that was lost - maybe the BMC if this is bare<br>
>>> metal, or the hypervisor if it is a virtualized system.<br>
>>><br>
>>> All that these logs say is that ltaoperdbs02 was lost from the point of<br>
>>> view of two other nodes. It happened at the same time (around Jul 13<br>
>>> 00:40) which suggests ltaoperdbs02 had some problem indeed. Whether it<br>
>>> was software crash, hardware failure or network outage cannot be<br>
>>> determined from these logs.<br>
>>><br>
>>> What speaks against a pure network-outage is that we don't see<br>
>> the corosync membership messages on the node that died.<br>
>> Of course it is possible that the log wasn't flushed out before reboot<br>
>> but usually I'd expect that there would be enough time.<br>
>> If something kept corosync or sbd from being scheduled that would<br>
>> explain why we don't see messages from these instances.<br>
>> And that was why I was asking to check if in the setup corosync and<br>
>> sbd are able to switch to rt-scheduling.<br>
>> But of course that is all speculation, and from what we know it could<br>
>> be anything from an administrative hard shutdown via some BMC to<br>
>> whatever else.<br>
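The rt-scheduling check Klaus asks about can be done roughly as below. This is a sketch: chrt is from util-linux and pgrep from procps, and the expectation of a realtime policy (SCHED_RR or SCHED_FIFO rather than SCHED_OTHER) is an assumption about a default corosync/sbd setup.

```shell
# Sketch: verify corosync and sbd actually obtained realtime scheduling.
# A policy of SCHED_OTHER here would support the theory that they were
# not scheduled in time to report membership loss before the reboot.
for pid in $(pgrep -x 'corosync|sbd'); do
    chrt -p "$pid"    # prints the current scheduling policy and rt priority
done
```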
>><br>
>>><br>
>>> > something interesting could be the stonith logs of the new master and<br>
>>> > the third slave:<br>
>>> ><br>
>>> > NEW MASTER:<br>
>>> > grep stonith-ng /var/log/messages<br>
>>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]: notice: Node<br>
>>> ltaoperdbs02<br>
>>> > state is now lost<br>
>>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]: notice: Purged 1 peer<br>
>>> > with id=1 and/or uname=ltaoperdbs02 from the membership cache<br>
>>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]: notice: Client<br>
>>> > crmd.228700.154a9e50 wants to fence (reboot) 'ltaoperdbs02' with device<br>
>>> > '(any)'<br>
>>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]: notice: Requesting<br>
>>> peer<br>
>>> > fencing (reboot) targeting ltaoperdbs02<br>
>>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]: notice: Couldn't find<br>
>>> > anyone to fence (reboot) ltaoperdbs02 with any device<br>
>>> > Jul 13 00:40:37 ltaoperdbs03 stonith-ng[228696]: notice: Waiting 10s<br>
>>> for<br>
>>> > ltaoperdbs02 to self-fence (reboot) for client crmd.228700.f5d882d5<br>
>>> > Jul 13 00:40:47 ltaoperdbs03 stonith-ng[228696]: notice: Self-fencing<br>
>>> > (reboot) by ltaoperdbs02 for<br>
>>> > crmd.228700.f5d882d5-a804-4e20-bad4-7f16393d7748 assumed complete<br>
>>> > Jul 13 00:40:47 ltaoperdbs03 stonith-ng[228696]: notice: Operation<br>
>>> > 'reboot' targeting ltaoperdbs02 on ltaoperdbs03 for<br>
>>> > crmd.228700@ltaoperdbs03.f5d882d5: OK<br>
>>> ><br>
>>> > THIRD SLAVE:<br>
>>> > grep stonith-ng /var/log/messages<br>
>>> > Jul 13 00:40:37 ltaoperdbs04 stonith-ng[77928]: notice: Node<br>
>>> ltaoperdbs02<br>
>>> > state is now lost<br>
>>> > Jul 13 00:40:37 ltaoperdbs04 stonith-ng[77928]: notice: Purged 1 peer<br>
>>> with<br>
>>> > id=1 and/or uname=ltaoperdbs02 from the membership cache<br>
>>> > Jul 13 00:40:47 ltaoperdbs04 stonith-ng[77928]: notice: Operation<br>
>>> 'reboot'<br>
>>> > targeting ltaoperdbs02 on ltaoperdbs03 for<br>
>>> crmd.228700@ltaoperdbs03.f5d882d5:<br>
>>> > OK<br>
>>> ><br>
>>> > i really appreciate the help and what you think about it.<br>
>>> ><br>
>>> > PS: the stonith-watchdog-timeout is set to 10s (pcs property set<br>
>>> > stonith-watchdog-timeout=10s); do you suggest a different setting?<br>
>>> ><br>
>>> > Il giorno mar 13 lug 2021 alle ore 14:29 Klaus Wenninger <<br>
>>> > <a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> ha scritto:<br>
>>> ><br>
>>> >><br>
>>> >><br>
>>> >> On Tue, Jul 13, 2021 at 1:43 PM damiano giuliani <<br>
>>> >> <a href="mailto:damianogiuliani87@gmail.com" target="_blank">damianogiuliani87@gmail.com</a>> wrote:<br>
>>> >><br>
>>> >>> Hi guys,<br>
>>> >>> I'm back with some PAF Postgres cluster problems.<br>
>>> >>> Tonight the cluster fenced the master node and promoted the PAF<br>
>>> >>> resource to a new node.<br>
>>> >>> Everything went fine, except that I really don't know why it happened.<br>
>>> >>> This morning I noticed the old master had been fenced by sbd and a new<br>
>>> >>> master promoted; this happened tonight at 00:40.XX.<br>
>>> >>> Filtering the logs, I can't find any reason for the old master being<br>
>>> >>> fenced, nor for the start of the promotion of the new master (which<br>
>>> >>> seems to have gone perfectly). I'm a bit lost, because none of us is<br>
>>> >>> able to get at the real reason.<br>
>>> >>> The cluster worked flawlessly for days with no issues, till now.<br>
>>> >>> It is crucial for me to understand why this switch occurred.<br>
>>> >>><br>
>>> >>> I attached the current status, configuration, and logs.<br>
>>> >>> In the old master node's log I can't find any reason;<br>
>>> >>> on the new master the only things are the fencing and the promotion.<br>
>>> >>><br>
>>> >>><br>
>>> >>> PS:<br>
>>> >>> could be this the reason of fencing?<br>
>>> >>><br>
>>> >>> grep -e sbd /var/log/messages<br>
>>> >>> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: warning: inquisitor_child:<br>
>>> >>> Servant pcmk is outdated (age: 4)<br>
>>> >>> Jul 12 14:58:59 ltaoperdbs02 sbd[6107]: notice: inquisitor_child:<br>
>>> >>> Servant pcmk is healthy (age: 0)<br>
>>> >>><br>
>>> >> That was yesterday afternoon, not 00:40 this morning.<br>
>>> >> With the watchdog-timeout set to 5s this may have been tight though.<br>
>>> >> Maybe check your other nodes for similar warnings - or check the<br>
>>> >> compressed warnings.<br>
>>> >> Maybe you can also check the journal of sbd after start to see if it<br>
>>> >> managed to switch to rt-scheduling.<br>
>>> >> Is this a bare-metal-setup or running on some hypervisor?<br>
>>> >> Unfortunately I'm not familiar enough with Postgres to tell if there<br>
>>> >> is anything interesting about the last messages shown before the<br>
>>> >> suspected watchdog-reboot.<br>
>>> >> Was there some administrative work done by ltauser before the reboot?<br>
>>> >> If yes, what?<br>
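Scanning the other nodes for the servant-latency warnings Klaus mentions could look like this. A sketch to run on each node: the journalctl flags are standard systemd, and the grep pattern is modeled on the 'Servant pcmk is outdated (age: 4)' message quoted earlier.

```shell
# Sketch: hunt for sbd servant-latency warnings across the journal.
journalctl -u sbd.service --since "2021-07-11" | grep -E 'outdated \(age: [0-9]+\)'
```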
>>> >><br>
>>> >> Regards,<br>
>>> >> Klaus<br>
>>> >><br>
>>> >><br>
>>> >>><br>
>>> >>> Any though and help is really appreciate.<br>
>>> >>><br>
>>> >>> Damiano<br>
>>> >>> _______________________________________________<br>
>>> >>> Manage your subscription:<br>
>>> >>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a> <br>
>>> >>><br>
>>> >>> ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a> <br>
>>> >>><br>
>>> >><br>
>>> ><br>
>>> ><br>
>>> ><br>
>>><br>
>>><br>
>><br>
<br>
<br>
<br>
</blockquote></div></div>