<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 1/8/20 9:28 AM, Jerry Kross wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">Thanks Klaus. Yes, I was able to reproduce the

        latency messages by inducing a network delay in the SBD VM and

        the node did not reboot.

        <div>We also had a production issue where the primary node of a

          2 node cluster was fenced when the primary node lost

          connectivity to 2 out of the 3 SBD disks. The error message is

          "Warning: inquisitor_child requested a reset"</div>

      </div>

    </blockquote>

    Did the 2 cluster nodes also lose connectivity to each other<br>

    at the same time?<br>

    <blockquote type="cite"

cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">

      <div dir="ltr">

        <div>The SBD configuration is integrated with the pacemaker

          cluster. The reboot would have happened </div>

      </div>

    </blockquote>

    Just to make sure we are talking about the same thing: by<br>

    pacemaker integration I mean the '-P' option. It is the default,<br>

    and giving it a second time turns it off. You can check for the<br>

    presence of the 'sbd: watcher: Pacemaker' &amp; 'sbd: watcher: Cluster'<br>

    sub-daemons. In your case corosync.conf of course also needs<br>

    quorum { ... two_node: 1 ... } to tell sbd that it should count<br>

    nodes instead of relying on quorum.<br>
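    <br>
    Those two checks could be scripted roughly as below. This is only a<br>
    minimal Python sketch: it assumes the watcher process titles show up<br>
    verbatim in ps output and that corosync.conf is passed in as text.<br>
    The function names are made up for illustration.<br>
    <pre>
```python
import re

# Process titles named in this thread; assumed to appear verbatim in ps output.
WATCHERS = ("sbd: watcher: Pacemaker", "sbd: watcher: Cluster")


def missing_watchers(ps_output: str) -> list[str]:
    """Return the expected sbd sub-daemons that are absent from ps output."""
    return [name for name in WATCHERS if name not in ps_output]


def two_node_enabled(corosync_conf: str) -> bool:
    """True if the quorum section of the corosync.conf text sets two_node: 1."""
    quorum = re.search(r"quorum\s*\{([^}]*)\}", corosync_conf)
    if quorum is None:
        return False
    setting = re.search(r"^\s*two_node\s*:\s*1\s*$", quorum.group(1), re.MULTILINE)
    return setting is not None
```
    </pre>
    Feed the first function the output of something like "ps -eo args"<br>
    and the second one the contents of your corosync.conf (usually<br>
    /etc/corosync/corosync.conf).<br>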

    <blockquote type="cite"

cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">

      <div dir="ltr">

        <div>because of 2 events: 1) access was lost to 3 SBD disks , 2)

          Pacemaker regarded this node as </div>

      </div>

    </blockquote>

    1) shouldn't trigger a reboot by itself as long as the nodes see

    each<br>

    other while 2) would of course trigger self-fencing.<br>

    <blockquote type="cite"

cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">

      <div dir="ltr">

        <div>unhealthy (although this is not clear from the logs). But

          the triggering point was the loss of connectivity, and I am not

          sure whether pacemaker regarded this node as unhealthy because the

          node lost connectivity to the 2 SBD disks.</div>

      </div>

    </blockquote>

    Losing 2 out of 3 disks should result in the same behavior as<br>

    losing the 1 disk in a single-disk setup.<br>

    <br>

    That reminds me to add test-case(s) to CI that verify the<br>

    disk-quorum behavior ;-)<br>
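    <br>
    As a rough illustration of what such a test-case would assert, here is<br>
    a small Python sketch. It is a simplified model of the behavior<br>
    described in this thread, not sbd's actual code, and all names in it<br>
    are hypothetical.<br>
    <pre>
```python
def has_disk_majority(reachable: int, configured: int) -> bool:
    """A node holds disk quorum when it reaches more than half of its disks."""
    return 2 * reachable > configured


def should_self_fence(reachable: int, configured: int,
                      pacemaker_integration: bool, sees_partner: bool) -> bool:
    """Simplified model (an assumption): disk quorum keeps the node alive;
    without it, only pacemaker integration plus a visible partner does."""
    if has_disk_majority(reachable, configured):
        return False
    return not (pacemaker_integration and sees_partner)


# 1 of 3 disks left behaves like 0 of 1 disks left, with or without the partner.
for partner_visible in (True, False):
    assert (should_self_fence(1, 3, True, partner_visible)
            == should_self_fence(0, 1, True, partner_visible))
```
    </pre>
    In this model, losing disk quorum alone does not fence a node that<br>
    still sees its partner; losing both does.<br>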

    <blockquote type="cite"

cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">

      <div dir="ltr">

        <div>In such a scenario, would having 1 SBD device be

          sufficient?</div>

      </div>

    </blockquote>

    As already said: with pacemaker-integration, in principle yes.<br>

    Unless you have e.g. a setup with 3 disks at 3 sites and<br>

    2 nodes at 2 of these sites, where you still want to provide<br>

    service while entirely losing one of the node-sites.<br>

    <br>

    To make sure we are on the same page, some more info about<br>

    your distribution, the version/origin of sbd &amp; pacemaker, and<br>

    your sbd &amp; corosync config would be helpful.<br>

    <br>

    <br>

    Klaus<br>

    <blockquote type="cite"

cite="mid:CAJ4ao1VG9UpLRxG5atsQQVhC_iUvT16_+Eih_LScw_HT6E4Raw@mail.gmail.com">

      <div dir="ltr">

        <div><br>

        </div>

        <div>Regards,</div>

        <div>JK</div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Tue, Jan 7, 2020 at 6:20 PM

          Klaus Wenninger &lt;<a href="mailto:kwenning@redhat.com"

            moz-do-not-send="true">kwenning@redhat.com</a>&gt; wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div bgcolor="#FFFFFF">

            <div>On 1/6/20 8:40 AM, Jerry Kross wrote:<br>

            </div>

            <blockquote type="cite">

              <div dir="ltr">Hi Klaus,

                <div>Wishing you a great 2020!</div>

              </div>

            </blockquote>

            Same to you!<br>

            <blockquote type="cite">

              <div dir="ltr">

                <div>We're using 3 SBD disks with pacemaker integration.

                  It happened just once, and I am able to reproduce the

                  latency error messages in the system log by inducing a

                  network delay in the VM that hosts the SBD disks.

                  These are the only messages that were logged before

                  the VM restarted.</div>

              </div>

            </blockquote>

            You mean you can reproduce the latency messages but they

            don't<br>

            trigger a reboot - right?<br>

            <blockquote type="cite">

              <div dir="ltr">

                <div>The SBD documentation, <a

                    href="https://www.mankier.com/8/sbd" target="_blank"

                    moz-do-not-send="true">https://www.mankier.com/8/sbd</a>,

                  says that having 1 SBD disk does not introduce a

                  single point of failure. I also tested this

                  configuration by offlining a disk and pacemaker worked

                  just fine. From your experience, is it safe to run the

                  cluster with one SBD disk? This is a 2 node Hana

                  database cluster, where one node is primary. The data is

                  replicated using the native database tools. So

                  there's no shared DB storage, and a split-brain

                  scenario is less likely to occur because the

                  secondary database does not accept any

                  writes.</div>

              </div>

            </blockquote>

            When set up properly, so that a node reboots if it loses<br>

            its pacemaker-partner and the disk at the same time, a 2-node<br>

            cluster with SBD and a single disk should be safe to

            operate.<br>

            As you already pointed out, the disk isn't a SPOF as a node

            will<br>

            still provide service as long as it sees the partner.<br>

            Stating the obvious: using just a single disk with pacemaker<br>

            integration doesn't raise the risk of split-brain but rather<br>

            raises the risk of an unneeded node-reboot. So if your setup<br>

            is likely to e.g. lose the connection between the<br>

            partner-nodes and the one to the disk simultaneously, it may<br>

            be interesting to have something like 3 disks at 3 sites or to<br>

            step away from 2-node-config in corosync in favor of real<br>

            quorum using qdevice.<br>

            I'm not very familiar with Hana-specific issues though.<br>

            <br>

            Klaus<br>

            <blockquote type="cite">

              <div dir="ltr">

                <div>Regards,</div>

                <div>JK</div>

                <div><br>

                </div>

              </div>

              <br>

              <div class="gmail_quote">

                <div dir="ltr" class="gmail_attr">On Thu, Jan 2, 2020 at

                  6:35 PM Klaus Wenninger &lt;<a

                    href="mailto:kwenning@redhat.com" target="_blank"

                    moz-do-not-send="true">kwenning@redhat.com</a>&gt;

                  wrote:<br>

                </div>

                <blockquote class="gmail_quote" style="margin:0px 0px

                  0px 0.8ex;border-left:1px solid

                  rgb(204,204,204);padding-left:1ex">On 12/26/19 9:27

                  AM, Roger Zhou wrote:<br>

                  > On 12/24/19 11:48 AM, Jerry Kross wrote:<br>

                  >> Hi,<br>

                  >> The pacemaker cluster manages a 2 node

                  database cluster configured to use 3 <br>

                  >> iscsi disk targets in its stonith

                  configuration. The pacemaker cluster was put <br>

                  >> in maintenance mode but we see SBD writing to

                  the system logs. And just after <br>

                  >> these logs, the production node was

                  restarted.<br>

                  >> Log:<br>

                  >> sbd[5955]:  warning: inquisitor_child:

                  Latency: No liveness for 37 s exceeds <br>

                  >> threshold of 36 s (healthy servants: 1)<br>

                  >> I see these messages logged and then the node

                  was restarted. I suspect if it <br>

                  >> was the softdog module that restarted the

                  node but I don't see it in the logs. <br>

                  Just to understand your config ...<br>

                  You are using 3 block-devices with quorum amongst each

                  other without<br>

                  pacemaker-integration - right?<br>

                  Might be that the disk-watchers are hanging on some io

                  so that<br>

                  we don't see any logs from them.<br>

                  Did that happen just once or can you reproduce the

                  issue?<br>

                  If you are not using pacemaker-integration so far that

                  might be a<br>

                  way to increase reliability. (If it sees the other

                  node sbd would be content<br>

                  without getting response from the disks.) Of course it

                  depends on your<br>

                  distribution<br>

                  and sbd-version if that would be supported with a

                  2-node-cluster<br>

                  (or at all). sbd e.g. would have to have at least<br>

                  <a

href="https://github.com/ClusterLabs/sbd/commit/4bd0a66da3ac9c9afaeb8a2468cdd3ed51ad3377"

                    rel="noreferrer" target="_blank"

                    moz-do-not-send="true">https://github.com/ClusterLabs/sbd/commit/4bd0a66da3ac9c9afaeb8a2468cdd3ed51ad3377</a><br>

                  <br>

                  Klaus <br>

                  > sbd is too critical to share the io path with

                  others.<br>

                  ><br>

                  > Very likely, the workload is too heavy, the iscsi

                  connections are broken and <br>

                  > sbd loses access to the disks; sbd then uses

                  sysrq 'b' to reboot the node <br>

                  > brutally and immediately.<br>

                  ><br>

                  > Regarding watchdog-reboot, it kicks in when

                  sbd is not able to tickle it <br>

                  > in time, e.g. when sbd starves for cpu or has crashed.

                  It is crucial too, but not <br>

                  > likely the case here.<br>

                  ><br>

                  > Merry X'mas and Happy New Year!<br>

                  > Roger<br>

                  ><br>

                  > _______________________________________________<br>

                  > Manage your subscription:<br>

                  > <a

                    href="https://lists.clusterlabs.org/mailman/listinfo/users"

                    rel="noreferrer" target="_blank"

                    moz-do-not-send="true">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>

                  ><br>

                  > ClusterLabs home: <a

                    href="https://www.clusterlabs.org/" rel="noreferrer"

                    target="_blank" moz-do-not-send="true">https://www.clusterlabs.org/</a><br>

                  <br>

                  </blockquote>

              </div>

            </blockquote>

            <br>

          </div>

        </blockquote>

      </div>

    </blockquote>

    <br>

  </body>

</html>