[ClusterLabs] SBD restarted the node while pacemaker in maintenance mode
Klaus Wenninger
kwenning at redhat.com
Wed Jan 8 04:30:28 EST 2020
On 1/8/20 9:28 AM, Jerry Kross wrote:
> Thanks Klaus. Yes, I was able to reproduce the latency messages by
> inducing a network delay in the SBD VM and the node did not reboot.
> We also had a production issue where the primary node of a 2 node
> cluster was fenced when the primary node lost connectivity to 2 out of
> the 3 SBD disks. The error message is "Warning: inquisitor_child
> requested a reset"
Did the 2 cluster nodes lose connectivity to each other as well,
simultaneously?
> The SBD configuration is integrated with the pacemaker cluster. The
> reboot would have happened
Just to make sure we are talking about the same thing: by
pacemaker integration I mean the '-P' option (it is the default;
giving it a 2nd time turns it off) - check for the presence of the
'sbd: watcher: Pacemaker' & 'sbd: watcher: Cluster' sub-daemons.
In your case corosync.conf of course also needs
quorum { ... two_node: 1 ... } to tell sbd it should rather
count nodes instead of relying on quorum.
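For reference, such a setup would typically look roughly like
this - the device paths are just placeholders and the sysconfig
path may differ per distribution:

    # /etc/sysconfig/sbd
    SBD_DEVICE="/dev/disk/by-id/disk1;/dev/disk/by-id/disk2;/dev/disk/by-id/disk3"
    SBD_PACEMAKER=yes

    # /etc/corosync/corosync.conf
    quorum {
        provider: corosync_votequorum
        two_node: 1
    }

    # the sub-daemons show up as process titles, e.g.:
    ps -ef | grep "sbd: watcher"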
> because of 2 events: 1) access was lost to 3 SBD disks , 2) Pacemaker
> regarded this node as
1) shouldn't trigger a reboot by itself as long as the nodes see each
other while 2) would of course trigger self-fencing.
> unhealthy (although this is not clear from the logs). But the
> triggering point was the loss of connectivity, and I am not sure
> if pacemaker regarded this node as unhealthy because the node
> lost connectivity to the 2 SBD disks.
Losing 2 out of 3 disks should impose the same behavior as
losing 1 disk in a single-disk setup.
Which reminds me to add test-case(s) to CI that verify the
disk-quorum behavior ;-)
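If you want to verify from the node whether each device is still
reachable and readable, something like this should do (device
paths below are just placeholders for your three iSCSI targets):

    sbd -d /dev/disk/by-id/disk1 list
    sbd -d /dev/disk/by-id/disk2 list
    sbd -d /dev/disk/by-id/disk3 list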
> In such a scenario, would having 1 SBD device be sufficient?
As already said, with pacemaker-integration - in principle, yes.
Unless you have e.g. a setup with 3 disks at 3 sites and
2 nodes at 2 of these sites, where you still want to provide
service while entirely losing one of the node-sites.
To further make sure we are on the same page, some more
info about the distribution, version/origin of sbd & pacemaker,
and the sbd & corosync config might be helpful.
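Something along these lines would give a good picture (adjust the
package-manager command and paths for your distribution):

    rpm -q sbd pacemaker corosync
    cat /etc/sysconfig/sbd
    cat /etc/corosync/corosync.conf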
Klaus
>
> Regards,
> JK
>
> On Tue, Jan 7, 2020 at 6:20 PM Klaus Wenninger
> <kwenning at redhat.com> wrote:
>
> On 1/6/20 8:40 AM, Jerry Kross wrote:
>> Hi Klaus,
>> Wishing you a great 2020!
> Same to you!
>> We're using 3 SBD disks with pacemaker integration. It just
>> happened once, and I am able to reproduce the latency error messages
>> in the system log by inducing a network delay in the VM that
>> hosts the SBD disks. These are the only messages that were logged
>> before the VM restarted.
> You mean you can reproduce the latency messages but they don't
> trigger a reboot - right?
>> From the SBD documentation, https://www.mankier.com/8/sbd, it
>> says that having 1 SBD disk does not introduce a single point of
>> failure. I also tested this configuration by offlining a disk and
>> pacemaker worked just fine. From your experience, is it safe to
>> run the cluster with one SBD disk? This is a 2 node Hana database
>> cluster, where one node is primary. The data is replicated using
>> the native database tools. So there's no shared DB storage, and a
>> split-brain scenario is less likely to occur, because the
>> secondary database does not accept any writes.
> When set up properly, so that a node reboots if it loses
> its pacemaker-partner and the disk at the same time, a 2-node
> cluster with SBD and a single disk should be safe to operate.
> As you already pointed out, the disk isn't a SPOF, as a node will
> still provide service as long as it sees the partner.
> Stating the obvious: Using just a single disk with pacemaker
> integration doesn't raise the risk of split-brain but rather
> raises the risk of an unneeded node-reboot. So if your setup
> is likely to e.g. lose the connection between the
> partner-nodes and that to the disk simultaneously, it may
> be interesting to have something like 3 disks at 3 sites, or
> to step away from the 2-node-config in corosync in favor of
> real quorum using qdevice.
> I'm not very familiar with Hana-specific issues though.
>
> Klaus
>> Regards,
>> JK
>>
>>
>> On Thu, Jan 2, 2020 at 6:35 PM Klaus Wenninger
>> <kwenning at redhat.com> wrote:
>>
>> On 12/26/19 9:27 AM, Roger Zhou wrote:
>> > On 12/24/19 11:48 AM, Jerry Kross wrote:
>> >> Hi,
>> >> The pacemaker cluster manages a 2 node database cluster
>> >> configured to use 3 iscsi disk targets in its stonith
>> >> configuration. The pacemaker cluster was put in maintenance
>> >> mode but we see SBD writing to the system logs. And just after
>> >> these logs, the production node was restarted.
>> >> Log:
>> >> sbd[5955]: warning: inquisitor_child: Latency: No liveness
>> >> for 37 s exceeds threshold of 36 s (healthy servants: 1)
>> >> I see these messages logged and then the node was restarted.
>> >> I suspect it was the softdog module that restarted the node,
>> >> but I don't see it in the logs.
>> Just to understand your config ...
>> You are using 3 block-devices with quorum amongst each other
>> without pacemaker-integration - right?
>> Might be that the disk-watchers are hanging on some io so that
>> we don't see any logs from them.
>> Did that happen just once or can you reproduce the issue?
>> If you are not using pacemaker-integration so far, that might be
>> a way to increase reliability. (If it sees the other node, sbd
>> would be content without getting a response from the disks.)
>> Of course it depends on your distribution and sbd-version
>> whether that would be supported with a 2-node-cluster (or at
>> all). sbd e.g. would have to have at least
>> https://github.com/ClusterLabs/sbd/commit/4bd0a66da3ac9c9afaeb8a2468cdd3ed51ad3377
>>
>> Klaus
>> > sbd is too critical to share the io path with others.
>> >
>> > Very likely, the workload is too heavy, the iscsi connections
>> > are broken and sbd loses access to the disks, then sbd uses
>> > sysrq 'b' to reboot the node brutally and immediately.
>> >
>> > Regarding the watchdog-reboot, it kicks in when sbd is not
>> > able to tickle it in time, e.g. sbd starves for cpu, or has
>> > crashed. It is crucial too, but not likely the case here.
>> >
>> > Merry X'mas and Happy New Year!
>> > Roger