[ClusterLabs] Antw: [EXT] Re: Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Feb 17 04:13:49 EST 2022


>>> Klaus Wenninger <kwenning at redhat.com> wrote on 16.02.2022 at 16:59 in
message
<CALrDAo0Lm9ue9p2L_Q=8oJ9_9r0ejEvXrUm8DUKL5q8D9yQy2w at mail.gmail.com>:
> On Wed, Feb 16, 2022 at 4:26 PM Klaus Wenninger <kwenning at redhat.com> wrote:
> 
>>
>>
>> On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl <
>> Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>
>>> Hi!
>>>
>>> When changing some FC cables I noticed that sbd complained 2 seconds
>>> after the connection went down (even though the device is multi-pathed,
>>> with other paths still up).
>>> I don't know of any sbd parameter set so low that sbd would panic after
>>> 2 seconds. Which parameter (if any) is responsible for that?
>>>
>>> In fact multipath takes up to 5 seconds to adjust paths.
>>>
>>> Here are some sample events (sbd-1.5.0+20210720.f4ca41f-3.6.1.x86_64 from
>>> SLES15 SP3):
>>> Feb 14 13:01:36 h18 kernel: qla2xxx [0000:41:00.0]-500b:3: LOOP DOWN
>>> detected (2 7 0 0).
>>> Feb 14 13:01:38 h18 sbd[6621]: /dev/disk/by-id/dm-name-SBD_1-3P2:
>>> error: servant_md: slot read failed in servant.
>>> Feb 14 13:01:38 h18 sbd[6619]: /dev/disk/by-id/dm-name-SBD_1-3P1:
>>> error: servant_md: mbox read failed in servant.
>>> Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Servant
>>> /dev/disk/by-id/dm-name-SBD_1-3P1 is outdated (age: 11)
>>> Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Servant
>>> /dev/disk/by-id/dm-name-SBD_1-3P2 is outdated (age: 11)
>>> Feb 14 13:01:40 h18 sbd[6615]:  warning: inquisitor_child: Majority of
>>> devices lost - surviving on pacemaker
>>> Feb 14 13:01:42 h18 kernel: sd 3:0:3:2: rejecting I/O to offline device
>>> Feb 14 13:01:42 h18 kernel: blk_update_request: I/O error, dev sdbt,
>>> sector 2048 op 0x0:(READ) flags 0x4200 phys_seg 1 prio class 1
>>> Feb 14 13:01:42 h18 kernel: device-mapper: multipath: 254:17: Failing
>>> path 68:112.
>>> Feb 14 13:01:42 h18 kernel: sd 3:0:1:2: rejecting I/O to offline device
>>>
>> Sorry, I forgot to address the following.
> 
> Guess your sbd package predates
> https://github.com/ClusterLabs/sbd/commit/9e6cbbad9e259de374cbf41b713419c342528db1
> and thus doesn't properly destroy the io-context using the aio-api.
> This flaw has been in pretty much since forever; I actually found it because
> of a kernel issue that made all block I/O done the way sbd does it
> (aio + O_SYNC + O_DIRECT) time out. (I never successfully tracked it down to
> the real kernel issue while playing with kprobes, but it was gone with the
> next kernel update.)
> Without surviving on pacemaker it would have suicided after
> msgwait-timeout (probably 10s in your case).
> It would be interesting to see what happens if you raise msgwait-timeout to
> a value that would allow another read attempt.
> Does your setup actually recover? It could be that it doesn't, missing
> the fix referenced above.
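
For reference, a rough sketch of how msgwait could be checked and changed
(device path taken from the logs above; the values are purely illustrative,
and re-running "create" re-initializes the on-disk SBD header, so it should
only be done with the cluster stopped on all nodes, and repeated for the
second device):

  # show the current on-disk timeouts, including "Timeout (msgwait)"
  sbd -d /dev/disk/by-id/dm-name-SBD_1-3P1 dump

  # re-create the header with watchdog timeout 10s (-1) and msgwait 20s (-4);
  # msgwait is commonly set to about twice the watchdog timeout
  sbd -d /dev/disk/by-id/dm-name-SBD_1-3P1 -1 10 -4 20 create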

For completeness: Yes, sbd did recover:
Feb 14 13:01:42 h18 sbd[6615]:  warning: cleanup_servant_by_pid: Servant for /dev/disk/by-id/dm-name-SBD_1-3P1 (pid: 6619) has terminated
Feb 14 13:01:42 h18 sbd[6615]:  warning: cleanup_servant_by_pid: Servant for /dev/disk/by-id/dm-name-SBD_1-3P2 (pid: 6621) has terminated
Feb 14 13:01:42 h18 sbd[31668]: /dev/disk/by-id/dm-name-SBD_1-3P1:   notice: servant_md: Monitoring slot 4 on disk /dev/disk/by-id/dm-name-SBD_1-3P1
Feb 14 13:01:42 h18 sbd[31669]: /dev/disk/by-id/dm-name-SBD_1-3P2:   notice: servant_md: Monitoring slot 4 on disk /dev/disk/by-id/dm-name-SBD_1-3P2
Feb 14 13:01:49 h18 sbd[6615]:   notice: inquisitor_child: Servant /dev/disk/by-id/dm-name-SBD_1-3P1 is healthy (age: 0)
Feb 14 13:01:49 h18 sbd[6615]:   notice: inquisitor_child: Servant /dev/disk/by-id/dm-name-SBD_1-3P2 is healthy (age: 0)

Feb 14 13:02:10 h18 kernel: qla2xxx [0000:41:00.0]-500a:3: LOOP UP detected (8 Gbps).
Feb 14 13:02:15 h18 multipathd[5180]: SBD_1-3P1: remaining active paths: 3
Feb 14 13:02:15 h18 multipathd[5180]: SBD_1-3P1: sdbl - tur checker reports path is up
Feb 14 13:02:15 h18 multipathd[5180]: SBD_1-3P1: remaining active paths: 4
Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: sdbu - tur checker reports path is up
Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: remaining active paths: 3
Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: sdbo - tur checker reports path is up
Feb 14 13:02:16 h18 multipathd[5180]: SBD_1-3P2: remaining active paths: 4

> 
> Regards,
> Klaus
> 
>>
>>> Most puzzling is the fact that sbd reports a problem 4 seconds before the
>>> kernel reports an I/O error. I guess sbd "times out" the pending read.
>>>
>> Yep - that is timeout_io, defaulting to 3s.
>> You can set it with the -I daemon start parameter.
>> Together with the rest of the default timeout scheme the 3s do make sense.
>> Not sure, but if you increase that significantly you might have to adapt
>> other timeouts.
>> There are a number of checks regarding the relationship of the timeouts,
>> but they might not be exhaustive.
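
For reference, a minimal sketch of how the io timeout could be raised on this
SLES setup, assuming the usual /etc/sysconfig/sbd mechanism is used to pass
daemon start options (the value below is illustrative, not a recommendation):

  # /etc/sysconfig/sbd (excerpt)
  SBD_DEVICE="/dev/disk/by-id/dm-name-SBD_1-3P1;/dev/disk/by-id/dm-name-SBD_1-3P2"
  # raise the async read timeout (timeout_io) from the default 3s to 6s
  SBD_OPTS="-I 6"

The change takes effect the next time the sbd daemon is restarted together
with the cluster stack on the node.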
>>
>>>
>>> The thing is: both SBD disks are on different storage systems, each
>>> connected via two separate FC fabrics, but still sbd panics when one
>>> cable is disconnected from the host.
>>> My guess is that if "surviving on pacemaker" had not happened, the node
>>> would have been fenced; is that right?
>>>
>>> The other thing I wonder about is the "outdated" age:
>>> How can the age be 11 (seconds) when the disk was disconnected 4 seconds
>>> ago?
>>> It seems the age here is "current_time - time_of_last_read" instead of
>>> "current_time - time_when_read_attempt_started".
>>>
>> Exactly! And that is the correct way to do it, as we need to record the
>> time passed since the last successful read.
>> There is no value in starting the clock when we start the read attempt, as
>> these attempts are not synchronized throughout the cluster.
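
Reading the logs above with that definition: the inquisitor reported
"outdated (age: 11)" at 13:01:40, so the last successful read it had recorded
was around 13:01:40 - 11s = 13:01:29, i.e. the age is measured from the last
completed read, not from the LOOP DOWN at 13:01:36 and not from the start of
the read attempt that eventually failed.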
>>
>> Regards,
>> Klaus
>>
>>>
>>> Regards,
>>> Ulrich
>>>
>>>
>>>
>>>




