<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 1/8/20 9:28 AM, Jerry Kross wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">Thanks Klaus. Yes, I was able to reproduce the
latency messages by inducing a network delay in the SBD VM and
the node did not reboot.
<div>We also had a production issue where the primary node of a
2 node cluster was fenced when the primary node lost
connectivity to 2 out of the 3 SBD disks. The error message is
"Warning: inquisitor_child requested a reset"</div>
</div>
</blockquote>
Did the 2 cluster nodes lose connectivity to each other as well,<br>
simultaneously?<br>
<blockquote type="cite"
cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">
<div dir="ltr">
<div>The SBD configuration is integrated with the pacemaker
cluster. The reboot would have happened </div>
</div>
</blockquote>
Just to make sure we are talking about the same thing: by<br>
pacemaker integration I mean the '-P' option (it is the default;<br>
giving it a 2nd time turns it off - check for the presence of the<br>
'sbd: watcher: Pacemaker' & 'sbd: watcher: Cluster' sub-daemons)<br>
together with corosync.conf: quorum { ... two_node: 1 ... }, which in<br>
your 2-node case tells sbd to count nodes instead of relying<br>
on quorum. <br>
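Roughly like this, just as an illustration (the sysconfig path and<br>
exact option names may differ per distribution):<br>
<pre>
# sub-daemons that should show up when pacemaker integration is enabled
ps -ef | grep -E 'sbd: watcher: (Pacemaker|Cluster)'

# /etc/sysconfig/sbd - pacemaker integration is the default,
# SBD_PACEMAKER=yes just makes it explicit
SBD_PACEMAKER=yes

# corosync.conf
quorum {
    provider: corosync_votequorum
    two_node: 1
}
</pre>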
<blockquote type="cite"
cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">
<div dir="ltr">
<div>because of 2 events: 1) access was lost to 3 SBD disks, 2)
Pacemaker regarded this node as </div>
</div>
</blockquote>
1) shouldn't trigger a reboot by itself as long as the nodes see each<br>
other, while 2) would of course trigger self-fencing.<br>
<blockquote type="cite"
cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">
<div dir="ltr">
<div>unhealthy (although this is not clear from the logs). But
the triggering point was the loss of connectivity, and I am not
sure whether pacemaker regarded this node as unhealthy because the
node lost connectivity to the 2 SBD disks.</div>
</div>
</blockquote>
Losing 2 out of 3 disks should impose the same behavior as<br>
losing 1 disk in a single-disk setup.<br>
<br>
Which reminds me to add test-case(s) to CI that verify the<br>
disk-quorum behavior ;-)<br>
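If it happens again, something like this can show which disks are<br>
still reachable and what is in the slots (device paths are just<br>
placeholders for your 3 iscsi targets):<br>
<pre>
# metadata header of one disk
sbd -d /dev/disk/by-id/sbd-disk1 dump

# slot assignments and pending messages across all three disks
sbd -d /dev/disk/by-id/sbd-disk1 -d /dev/disk/by-id/sbd-disk2 \
    -d /dev/disk/by-id/sbd-disk3 list
</pre>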
<blockquote type="cite"
cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">
<div dir="ltr">
<div>In such a scenario, would having 1 SBD device be
sufficient?</div>
</div>
</blockquote>
As already said, with pacemaker-integration - in principle, yes.<br>
Unless you have e.g. a setup with 3 disks at 3 sites and<br>
2 nodes at 2 of these sites, where you still want to provide<br>
service while entirely losing one of the node-sites.<br>
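Purely as an illustration of that multi-disk variant (device paths<br>
are placeholders, one iscsi LUN per site):<br>
<pre>
# /etc/sysconfig/sbd - multiple devices are separated by ';'
SBD_DEVICE="/dev/disk/by-id/site-a-sbd;/dev/disk/by-id/site-b-sbd;/dev/disk/by-id/site-c-sbd"
</pre>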
<br>
To further make sure we are on the same page, some more<br>
info about your distribution, the version/origin of sbd & pacemaker,<br>
and your sbd & corosync config might be helpful.<br>
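E.g. something along these lines (assuming an rpm-based<br>
distribution; adjust the paths and package manager to yours):<br>
<pre>
cat /etc/os-release
rpm -q sbd pacemaker corosync
cat /etc/sysconfig/sbd            # or /etc/default/sbd
cat /etc/corosync/corosync.conf
</pre>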
<br>
<br>
Klaus<br>
<blockquote type="cite"
cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Regards,</div>
<div>JK</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Jan 7, 2020 at 6:20 PM
Klaus Wenninger <<a href="mailto:kwenning@redhat.com"
moz-do-not-send="true">kwenning@redhat.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<div>On 1/6/20 8:40 AM, Jerry Kross wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Klaus,
<div>Wishing you a great 2020!</div>
</div>
</blockquote>
Same to you!<br>
<blockquote type="cite">
<div dir="ltr">
<div>We're using 3 SBD disks with pacemaker integration.
It just happened once and I am able to reproduce the
latency error messages in the system log by inducing a
network delay in the VM that hosts the SBD disks.
These are the only messages that were logged before
the VM restarted.</div>
</div>
</blockquote>
You mean you can reproduce the latency messages but they
don't<br>
trigger a reboot - right?<br>
<blockquote type="cite">
<div dir="ltr">
<div>From the SBD documentation, <a
href="https://www.mankier.com/8/sbd" target="_blank"
moz-do-not-send="true">https://www.mankier.com/8/sbd</a>,
it says that having 1 SBD disk does not introduce a
single point of failure. I also tested this
configuration by offlining a disk and pacemaker worked
just fine. From your experience, is it safe to run the
cluster with one SBD disk? This is a 2 node Hana
database cluster, where one node is primary. The data is
replicated using the native database tools. So there's
no shared DB storage, and a split-brain scenario is
less likely to occur, because the secondary database
does not accept any writes.</div>
</div>
</blockquote>
When set up properly, so that a node reboots if it loses<br>
its pacemaker-partner and the disk at the same time, a 2-node<br>
cluster with SBD and a single disk should be safe to operate.<br>
As you already pointed out, the disk isn't a SPOF, as a node will<br>
still provide service as long as it sees the partner.<br>
Stating the obvious: using just a single disk with pacemaker<br>
integration doesn't raise the risk of split-brain but rather<br>
raises the risk of an unneeded node-reboot. So if your setup<br>
is likely to e.g. lose the connection between the<br>
partner-nodes and that to the disk simultaneously, it may<br>
be interesting to have something like 3 disks at 3 sites or<br>
to step away from the 2-node-config in corosync in favor of real<br>
quorum using qdevice.<br>
I'm not very familiar with Hana-specific issues though.<br>
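Just a rough sketch of the qdevice variant (hostname and algorithm<br>
are only examples; corosync-qnetd would run on a third host, and<br>
two_node would then no longer be set):<br>
<pre>
# corosync.conf on both cluster nodes
quorum {
    provider: corosync_votequorum
    device {
        votes: 1
        model: net
        net {
            host: qnetd-host.example.com
            algorithm: ffsplit
        }
    }
}
</pre>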
<br>
Klaus<br>
<blockquote type="cite">
<div dir="ltr">
<div>Regards,</div>
<div>JK</div>
<div><br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Jan 2, 2020 at
6:35 PM Klaus Wenninger <<a
href="mailto:kwenning@redhat.com" target="_blank"
moz-do-not-send="true">kwenning@redhat.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">On 12/26/19 9:27
AM, Roger Zhou wrote:<br>
> On 12/24/19 11:48 AM, Jerry Kross wrote:<br>
>> Hi,<br>
>> The pacemaker cluster manages a 2 node
database cluster configured to use 3 <br>
>> iscsi disk targets in its stonith
configuration. The pacemaker cluster was put <br>
>> in maintenance mode but we see SBD writing to
the system logs. And just after <br>
>> these logs, the production node was
restarted.<br>
>> Log:<br>
>> sbd[5955]: warning: inquisitor_child:
Latency: No liveness for 37 s exceeds <br>
>> threshold of 36 s (healthy servants: 1)<br>
>> I see these messages logged and then the node
was restarted. I suspect it <br>
>> was the softdog module that restarted the
node but I don't see it in the logs. <br>
Just to understand your config ...<br>
You are using 3 block-devices with quorum amongst each other without<br>
pacemaker-integration - right?<br>
It might be that the disk-watchers are hanging on some I/O so that<br>
we don't see any logs from them.<br>
Did that happen just once, or can you reproduce the issue?<br>
If you are not using pacemaker-integration so far, that might be a<br>
way to increase reliability. (If it sees the other node, sbd would be<br>
content without getting a response from the disks.) Of course it<br>
depends on your distribution and sbd-version whether that would be<br>
supported with a 2-node-cluster (or at all). sbd e.g. would have to<br>
have at least<br>
<a
href="https://github.com/ClusterLabs/sbd/commit/4bd0a66da3ac9c9afaeb8a2468cdd3ed51ad3377"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://github.com/ClusterLabs/sbd/commit/4bd0a66da3ac9c9afaeb8a2468cdd3ed51ad3377</a><br>
<br>
Klaus <br>
> sbd is too critical to share the io path with
others.<br>
><br>
> Very likely, the workload is too heavy, the iscsi
connections are broken and <br>
> sbd loses access to the disks, then sbd uses
sysrq 'b' to reboot the node <br>
> brutally and immediately.<br>
><br>
> Regarding the watchdog-reboot, it kicks in when
sbd is not able to tickle it <br>
> in time, e.g. when sbd is starved of CPU or has crashed.
It is crucial too, but not <br>
> likely the case here.<br>
><br>
> Merry X'mas and Happy New Year!<br>
> Roger<br>
><br>
<br>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<br>
</body>
</html>