<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 16, 2022 at 4:26 PM Klaus Wenninger <<a href="mailto:kwenning@redhat.com">kwenning@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi!<br>

<br>

When changing some FC cables I noticed that sbd complained 2 seconds after the connection went down (even though the device is multi-pathed, with the other paths still up).<br>

I don't know of any sbd parameter set so low that sbd would panic after 2 seconds. Which parameter (if any) is responsible for that?<br>

<br>

In fact multipath takes up to 5 seconds to adjust paths.<br>

<br>

Here are some sample events (sbd-1.5.0+20210720.f4ca41f-3.6.1.x86_64 from SLES15 SP3):<br>

Feb 14 13:01:36 h18 kernel: qla2xxx [0000:41:00.0]-500b:3: LOOP DOWN detected (2 7 0 0).<br>

Feb 14 13:01:38 h18 sbd[6621]: /dev/disk/by-id/dm-name-SBD_1-3P2:    error: servant_md: slot read failed in servant.<br>

Feb 14 13:01:38 h18 sbd[6619]: /dev/disk/by-id/dm-name-SBD_1-3P1:    error: servant_md: mbox read failed in servant.<br>

Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Servant /dev/disk/by-id/dm-name-SBD_1-3P1 is outdated (age: 11)<br>

Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Servant /dev/disk/by-id/dm-name-SBD_1-3P2 is outdated (age: 11)<br>

Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Majority of devices lost - surviving on pacemaker<br>

Feb 14 13:01:42 h18 kernel: sd 3:0:3:2: rejecting I/O to offline device<br>

Feb 14 13:01:42 h18 kernel: blk_update_request: I/O error, dev sdbt, sector 2048 op 0x0:(READ) flags 0x4200 phys_seg 1 prio class 1<br>

Feb 14 13:01:42 h18 kernel: device-mapper: multipath: 254:17: Failing path 68:112.<br>

Feb 14 13:01:42 h18 kernel: sd 3:0:1:2: rejecting I/O to offline device<br></blockquote></div></div></blockquote><div>Sorry, I forgot to address the following.</div><div><br></div><div>I guess your sbd package predates</div><div><a href="https://github.com/ClusterLabs/sbd/commit/9e6cbbad9e259de374cbf41b713419c342528db1">https://github.com/ClusterLabs/sbd/commit/9e6cbbad9e259de374cbf41b713419c342528db1</a></div><div>and thus doesn't properly destroy the io-context when using the aio API.</div><div>This flaw has been there more or less forever. I found it because of a kernel issue that made</div><div>all block I/O done the way sbd does it (aio + O_SYNC + O_DIRECT) time out.</div><div>I never managed to track down the underlying kernel issue while playing with kprobes,</div><div>but it was gone with the next kernel update.</div><div>Without survival on pacemaker the node would have suicided after msgwait-timeout (probably 10s in your case).</div><div>It would be interesting to see what happens if you raise msgwait-timeout to a value that allows</div><div>another read attempt.</div><div>Does your setup actually recover? It may not, if it is missing the fix referenced above.</div><div><br></div><div>Regards,</div><div>Klaus </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Most puzzling is the fact that sbd reports a problem 4 seconds before the kernel reports an I/O error. I guess sbd "times out" the pending read.<br></blockquote><div>Yep, that is timeout_io, which defaults to 3s.</div><div>You can set it with the -I daemon start parameter.</div><div>Together with the rest of the default timeout scheme the 3s make sense.</div><div>I'm not sure, but if you increase it significantly you might have to adapt other timeouts as well.</div><div>There are a number of checks on the relationships between timeouts, but they might not be exhaustive.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

The thing is: Both SBD disks are on different storage systems, each being connected by two separate FC fabrics, but still when disconnecting one cable from the host sbd panics.<br>

My guess is that if "surviving on pacemaker" had not happened, the node would have been fenced; is that right?<br>

<br>

The other thing I wonder is the "outdated age":<br>

How can the age be 11 (seconds) when the disk was disconnected 4 seconds ago?<br>

It seems here the age is "current time - time_of_last read" instead of "current_time - time_when read_attempt_started".<br></blockquote><div>Exactly! And that is the correct way to do it, as we need to record the time passed since the last successful read.</div><div>There is no value in starting the clock when we start the read attempt, as these attempts are not synchronized throughout</div><div>the cluster. </div><div><br></div><div>Regards,</div><div>Klaus</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Regards,<br>

Ulrich<br>


</blockquote></div></div>

</blockquote></div></div>
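<div>To illustrate the two definitions of "age" discussed above, here is a minimal sketch. It is not sbd code: the function name <code>servant_age</code> is hypothetical, the last successful read is simply back-calculated from the logged "age: 11", and taking the LOOP DOWN time as the start of the failed read attempt is an assumption made only for the arithmetic.</div>

```python
from datetime import datetime, timedelta


def servant_age(now, reference):
    """Whole seconds elapsed between a reference time and now."""
    return int((now - reference).total_seconds())


# Time of the "is outdated (age: 11)" warning, taken from the log.
now = datetime(2022, 2, 14, 13, 1, 40)

# Assumption: back-calculated from the logged age of 11 seconds.
last_successful_read = now - timedelta(seconds=11)

# Assumption: the failing read attempt started when LOOP DOWN was logged.
read_attempt_started = datetime(2022, 2, 14, 13, 1, 36)

# Definition described in the reply: time since the last successful read.
age_since_last_success = servant_age(now, last_successful_read)

# Alternative from the question: time since the read attempt started.
age_since_attempt = servant_age(now, read_attempt_started)

print(age_since_last_success, age_since_attempt)
```

<div>Under these assumptions the first definition gives the logged 11 seconds, while the second would give only 4. The first one does not depend on when each node happened to start its read attempt.</div>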