<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Feb 17, 2022 at 10:14 AM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">>>> Klaus Wenninger <<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> wrote on 16.02.2022 at 16:59 in<br>

message<br>

<CALrDAo0Lm9ue9p2L_Q=<a href="mailto:8oJ9_9r0ejEvXrUm8DUKL5q8D9yQy2w@mail.gmail.com" target="_blank">8oJ9_9r0ejEvXrUm8DUKL5q8D9yQy2w@mail.gmail.com</a>>:<br>

> On Wed, Feb 16, 2022 at 4:26 PM Klaus Wenninger <<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> wrote:<br>

> <br>

>><br>

>><br>

>> On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl <<br>

>> <a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br>

>><br>

>>> Hi!<br>

>>><br>

>>> When changing some FC cables I noticed that sbd complained 2 seconds<br>

>>> after the connection went down (even though the device is multi-pathed<br>

>>> with other paths being still up).<br>

>>> I don't know any sbd parameter being set so low that after 2 seconds sbd<br>

>>> would panic. Which parameter (if any) is responsible for that?<br>

>>><br>

>>> In fact multipath takes up to 5 seconds to adjust paths.<br>

>>><br>

>>> Here are some sample events (sbd-1.5.0+20210720.f4ca41f-3.6.1.x86_64 from<br>

>>> SLES15 SP3):<br>

>>> Feb 14 13:01:36 h18 kernel: qla2xxx [0000:41:00.0]-500b:3: LOOP DOWN<br>

>>> detected (2 7 0 0).<br>

>>> Feb 14 13:01:38 h18 sbd[6621]: /dev/disk/by-id/dm-name-SBD_1-3P2:<br>

>>> error: servant_md: slot read failed in servant.<br>

>>> Feb 14 13:01:38 h18 sbd[6619]: /dev/disk/by-id/dm-name-SBD_1-3P1:<br>

>>> error: servant_md: mbox read failed in servant.<br>

>>> Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Servant<br>

>>> /dev/disk/by-id/dm-name-SBD_1-3P1 is outdated (age: 11)<br>

>>> Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Servant<br>

>>> /dev/disk/by-id/dm-name-SBD_1-3P2 is outdated (age: 11)<br>

>>> Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Majority of<br>

>>> devices lost - surviving on pacemaker<br>

>>> Feb 14 13:01:42 h18 kernel: sd 3:0:3:2: rejecting I/O to offline device<br>

>>> Feb 14 13:01:42 h18 kernel: blk_update_request: I/O error, dev sdbt,<br>

>>> sector 2048 op 0x0:(READ) flags 0x4200 phys_seg 1 prio class 1<br>

>>> Feb 14 13:01:42 h18 kernel: device-mapper: multipath: 254:17: Failing<br>

>>> path 68:112.<br>

>>> Feb 14 13:01:42 h18 kernel: sd 3:0:1:2: rejecting I/O to offline device<br>

>>><br>

>> Sorry, I forgot to address the following.<br>

> <br>

> I guess your sbd package predates<br>

> <a href="https://github.com/ClusterLabs/sbd/commit/9e6cbbad9e259de374cbf41b713419c342528db1" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/sbd/commit/9e6cbbad9e259de374cbf41b713419c342528db1</a><br>

> and thus doesn't properly destroy the io-context using the aio-api.<br>

> This flaw has been in there more or less forever. I actually found it due to a<br>

> kernel issue that made all block I/O done the way sbd does it<br>

> (aio + O_SYNC + O_DIRECT) time out. I never managed to track it down to the<br>

> real kernel issue, even playing with kprobes, but it was gone with the next<br>

> kernel update.<br>

> Without survival on pacemaker it would have suicided after<br>

> msgwait-timeout (probably 10s in your case).<br>

> It would be interesting to see what happens if you raise msgwait-timeout to a<br>

> value that would allow another read attempt.<br>

> Does your setup actually recover? It is possible that it doesn't, as it is<br>

> missing the fix referenced above.<br>

<br>

For completeness: Yes, sbd did recover:<br>

Feb 14 13:01:42 h18 sbd[6615]:  warning: cleanup_servant_by_pid: Servant for /dev/disk/by-id/dm-name-SBD_1-3P1 (pid: 6619) has terminated<br>

Feb 14 13:01:42 h18 sbd[6615]:  warning: cleanup_servant_by_pid: Servant for /dev/disk/by-id/dm-name-SBD_1-3P2 (pid: 6621) has terminated<br>

Feb 14 13:01:42 h18 sbd[31668]: /dev/disk/by-id/dm-name-SBD_1-3P1:   notice: servant_md: Monitoring slot 4 on disk /dev/disk/by-id/dm-name-SBD_1-3P1<br>

Feb 14 13:01:42 h18 sbd[31669]: /dev/disk/by-id/dm-name-SBD_1-3P2:   notice: servant_md: Monitoring slot 4 on disk /dev/disk/by-id/dm-name-SBD_1-3P2<br>

Feb 14 13:01:49 h18 sbd[6615]:   notice: inquisitor_child: Servant /dev/disk/by-id/dm-name-SBD_1-3P1 is healthy (age: 0)<br>

Feb 14 13:01:49 h18 sbd[6615]:   notice: inquisitor_child: Servant /dev/disk/by-id/dm-name-SBD_1-3P2 is healthy (age: 0)<br></blockquote><div><br></div><div>Good to see that!</div><div>Did you try several times?</div><div>As far as I remember, when I tested with the kernel mentioned before, the behavior</div><div>changed after a couple of timeouts and sbd wasn't able to create the read request</div><div>anymore (without the fix mentioned). I assume some kind of resource depletion</div><div>because previously hanging attempts were not destroyed properly.</div><div>But that behavior might depend heavily on the kernel version, and as your attempts</div><div>do terminate with a failure in the kernel some time later, that might resolve the issue</div><div>as well (in my case they would likely hang forever from the kernel's point of view).</div><div>As you seem to have rebuilt sbd anyway for your timer checks, it would be</div><div>interesting to see whether the delayed timeout messages from the kernel</div><div>disappear with the fix.</div><div><br></div><div>Regards,</div><div>Klaus</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Feb 14 13:02:10 h18 kernel: qla2xxx [0000:41:00.0]-500a:3: LOOP UP detected (8 Gbps).<br>

Feb 14 13:02:15 h18 multipathd[5180]: SBD_1-3P1: remaining active paths: 3<br>

Feb 14 13:02:15 h18 multipathd[5180]: SBD_1-3P1: sdbl - tur checker reports path is up<br>

Feb 14 13:02:15 h18 multipathd[5180]: SBD_1-3P1: remaining active paths: 4<br>

Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: sdbu - tur checker reports path is up<br>

Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: remaining active paths: 3<br>

Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: sdbo - tur checker reports path is up<br>

Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: remaining active paths: 4<br>

<br>

> <br>

> Regards,<br>

> Klaus<br>

> <br>

>><br>

>>> Most puzzling is the fact that sbd reports a problem 4 seconds before the<br>

>>> kernel reports an I/O error. I guess sbd "times out" the pending read.<br>

>>><br>

>> Yep - that is timeout_io, which defaults to 3s.<br>

>> You can set it with the -I daemon start parameter.<br>

>> Together with the rest of the default timeout scheme the 3s do make sense.<br>

>> I'm not sure, but if you increase it significantly you might have to adapt<br>

>> other timeouts.<br>

>> There are a certain number of checks regarding the relationship between<br>

>> timeouts, but they might not be exhaustive.<br>
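<br>
The gaps discussed here can be read straight off the quoted log timestamps. The following is a minimal Python sketch of that arithmetic only. The event names and the helper seconds_between are hypothetical labels for this illustration and not part of sbd. Syslog timestamps have one-second resolution, so a 2-second gap in the log is presumably still compatible with a 3s timeout_io if a read was already in flight when the loop went down.<br>

```python
from datetime import datetime

# Timestamps copied from the quoted log excerpt (host h18, Feb 14).
# The dictionary keys are illustrative labels, not sbd identifiers.
events = {
    "loop_down": "13:01:36",        # qla2xxx: LOOP DOWN detected
    "sbd_read_failed": "13:01:38",  # servant_md: slot/mbox read failed
    "kernel_io_error": "13:01:42",  # blk_update_request: I/O error
}

def seconds_between(start, end):
    """Whole seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# How long after LOOP DOWN sbd gave up on its pending read.
sbd_delay = seconds_between(events["loop_down"], events["sbd_read_failed"])

# How much later the kernel itself reported the I/O error.
kernel_lag = seconds_between(events["sbd_read_failed"], events["kernel_io_error"])

print(sbd_delay, kernel_lag)  # prints: 2 4
```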

>><br>

>>><br>

>>> The thing is: Both SBD disks are on different storage systems, each being<br>

>>> connected by two separate FC fabrics, but still when disconnecting one<br>

>>> cable from the host sbd panics.<br>

>>> My guess is if "surviving on pacemaker" would not have happened, the node<br>

>>> would be fenced; is that right?<br>

>>><br>

>>> The other thing I wonder is the "outdated age":<br>

>>> How can the age be 11 (seconds) when the disk was disconnected 4 seconds<br>

>>> ago?<br>

>>> It seems here the age is "current time - time_of_last read" instead of<br>

>>> "current_time - time_when read_attempt_started".<br>

>>><br>

>> Exactly! And that is the correct way to do it as we need to record the<br>

>> time passed since last successful read.<br>

>> There is no value in starting the clock when we start the read attempt as<br>

>> these attempts are not synced throughout<br>

>> the cluster.<br>
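<br>
As a worked example of the age definition above, here is a small Python sketch using the timestamps from the quoted log. It assumes the log timestamp of the "outdated" warning is close to the moment the age was computed. The function age_seconds is a hypothetical helper for illustration, not sbd code.<br>

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"

def age_seconds(now, last_successful_read):
    """Age as described in the thread: time since the last successful read."""
    return int((now - last_successful_read).total_seconds())

# Values taken from the quoted log excerpt.
loop_down = datetime.strptime("13:01:36", FMT)       # LOOP DOWN detected
warning_logged = datetime.strptime("13:01:40", FMT)  # "is outdated (age: 11)"
reported_age = 11

# Working backwards: when must the last successful read have happened?
last_good_read = warning_logged - timedelta(seconds=reported_age)

print(last_good_read.strftime(FMT))                 # 13:01:29
print(age_seconds(warning_logged, last_good_read))  # 11
print(age_seconds(warning_logged, loop_down))       # 4
```

Measured from the last successful read the age is 11 seconds, while only 4 seconds had passed since the link went down, which matches the two numbers compared in the question.<br>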

>><br>

>> Regards,<br>

>> Klaus<br>

>><br>

>>><br>

>>> Regards,<br>

>>> Ulrich<br>

>>><br>

>>><br>

>>><br>

>>><br>

>>> _______________________________________________<br>

>>> Manage your subscription:<br>

>>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a> <br>

>>><br>

>>> ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a> <br>

>>><br>

>>><br>

<br>

<br>

<br>


<br>

</blockquote></div></div>