<div dir="ltr">Thanks. I most assuredly will, but first I have to run some experiments, to get a feeling for it.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019 at 3:56 PM digimer <<a href="mailto:lists@alteeve.ca">lists@alteeve.ca</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF">
    <p>Happy to help you understand, just keep asking questions. :)</p>
    <p>The point can be explained this way:</p>
    <p>* If two nodes can work without coordination, you don't need a
      cluster, just run your services everywhere. If that is not the
      case, then you require coordination. Fencing ensures that a node
      that has entered an unknown state can be forced into a known state
      (off). In this way, no action will be taken by a node unless the
      peer can be informed, or the peer is gone.<br>
    </p>
    <p>How a node is forced into a known state depends on the
      hardware (or infrastructure) you have in your particular
      setup. So perhaps explain what your nodes are built on, and we
      can assist with more specific details.<br>
    </p>
    <p>digimer<br>
    </p>
    <div class="gmail-m_3193709777170650094moz-cite-prefix">On 2019-04-17 5:46 p.m., JCA wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">Thanks. This implies that I officially do not
        understand what it is that fencing can do for me, in my simple
        cluster. Back to the drawing board.</div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019 at 3:33
          PM digimer <<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div bgcolor="#FFFFFF">
            <p>Fencing requires some mechanism, outside the nodes
              themselves, that can terminate the nodes. Typically, IPMI
              (iLO, iRMC, RSA, DRAC, etc) is used for this.
              Alternatively, switched PDUs are common. If you don't have
              these but do have a watchdog timer on your nodes, SBD
              (storage-based death) can work.</p>
            <p>You can use 'fence_&lt;device&gt; &lt;options&gt; -o
              status' at the command line to figure out what will
              work with your hardware. Once you can call 'fence_foo
              ... -o status' and get the status of each node,
              translating that into a pacemaker configuration is pretty
              simple. That's when you enable stonith. <br>
            </p>
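            <p>As a rough sketch of that sequence, assuming
              IPMI-capable nodes and the fence_ipmilan agent: the BMC
              addresses (192.0.2.x), the admin/secret credentials, the
              resource names fence_one and fence_two, and the run/DRY_RUN
              helper are all placeholders made up for illustration; only
              the node names "one" and "two" come from this thread. Some
              BMCs also need the lanplus option, so treat the exact
              options as something to verify against your agent's
              metadata.</p>
<pre>
```shell
#!/bin/sh
# Sketch only: addresses, credentials and resource names are placeholders.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: confirm the fence agent can query a node's power state.
check_status() {
    # $1 = BMC address of the node (placeholder value)
    run fence_ipmilan -a "$1" -l admin -p secret -o status
}

# Step 2: turn the working agent call into a pacemaker stonith resource.
create_stonith() {
    # $1 = node name, $2 = BMC address of that node
    run pcs stonith create "fence_$1" fence_ipmilan \
        pcmk_host_list="$1" ipaddr="$2" login=admin passwd=secret
}

# Step 3: enable stonith only once every node has a working fence device.
enable_stonith() {
    run pcs property set stonith-enabled=true
}

check_status 192.0.2.11
check_status 192.0.2.12
create_stonith one 192.0.2.11
create_stonith two 192.0.2.12
enable_stonith
```
</pre>
            <p>Run as-is, this only prints the commands; setting
              DRY_RUN=0 would execute them, which is worth doing only
              after each status check has been tried by hand.</p>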
            <p>Once stonith is set up and working in pacemaker (ie: you
              can crash a node and the peer reboots it), then you will
              go to DRBD and set 'fencing resource-and-stonith;' (tells
              DRBD to block on communication failure with the peer and
              request a fence), and then set up the fence handler
              ('fence-peer /path/to/crm-fence-peer.sh') and the matching
              unfence handler ('/path/to/crm-unfence-peer.sh') (I am
              going from memory, check the man page to verify syntax). <br>
            </p>
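            <p>For illustration, the DRBD side might look roughly like
              the fragment below. This assumes DRBD 8.4 syntax, a
              resource named r0, and handler scripts installed under
              /usr/lib/drbd; all three are assumptions, and other DRBD
              versions place these options differently, so check the
              drbd.conf man page for your version.</p>
<pre>
```
resource r0 {
  disk {
    # Block I/O on loss of the peer and ask the cluster to fence it.
    fencing resource-and-stonith;
  }
  handlers {
    # Called when the peer is lost; adds a constraint in pacemaker.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Called after resync completes; removes that constraint again.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```
</pre>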
            <p>With all this done, if either pacemaker/corosync or DRBD
              loses contact with the peer, it will block and fence.
              Only after the peer has been confirmed terminated will IO
              resume. This way, split-brains become effectively
              impossible.</p>
            <p>digimer<br>
            </p>
            <div class="gmail-m_3193709777170650094gmail-m_-5179552301465381124moz-cite-prefix">On
              2019-04-17 5:17 p.m., JCA wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div dir="ltr">
                  <div dir="ltr">Here is what I did:
                    <div><br>
                    </div>
                    <div>
                      <div># pcs stonith create disk_fencing fence_scsi
                        pcmk_host_list="one two"
                        pcmk_monitor_action="metadata"
                        pcmk_reboot_action="off"
                        devices="/dev/disk/by-id/ata-VBOX_HARDDISK_VBaaa429e4-514e8ecb"
                        meta provides="unfencing"</div>
                    </div>
                    <div><br>
                    </div>
                    <div>where ata-VBOX-... corresponds to the device
                      where I have the partition that is shared between
                      both nodes in my cluster. The command completes
                      without any errors (that I can see) and after that
                      I have</div>
                    <div><br>
                    </div>
                    <div>
                      <div># pcs status</div>
                      <div>Cluster name: ClusterOne</div>
                      <div>Stack: corosync</div>
                      <div>Current DC: one (version
                        1.1.19-8.el7_6.4-c3c624ea3d) - partition with
                        quorum</div>
                      <div>Last updated: Wed Apr 17 14:35:25 2019</div>
                      <div>Last change: Wed Apr 17 14:11:14 2019 by root
                        via cibadmin on one</div>
                      <div><br>
                      </div>
                      <div>2 nodes configured</div>
                      <div>5 resources configured</div>
                      <div><br>
                      </div>
                      <div>Online: [ one two ]</div>
                      <div><br>
                      </div>
                      <div>Full list of resources:</div>
                      <div><br>
                      </div>
                      <div> MyCluster<span style="white-space:pre-wrap"> </span>(ocf::myapp:myapp-script):<span style="white-space:pre-wrap">      </span>Stopped</div>
                      <div> Master/Slave Set: DrbdDataClone [DrbdData]</div>
                      <div>     Stopped: [ one two ]</div>
                      <div> DrbdFS<span style="white-space:pre-wrap">    </span>(ocf::heartbeat:Filesystem):<span style="white-space:pre-wrap">    </span>Stopped</div>
                      <div> disk_fencing <span style="white-space:pre-wrap">    </span>(stonith:fence_scsi):<span style="white-space:pre-wrap">   </span>Stopped</div>
                      <div><br>
                      </div>
                      <div>Daemon Status:</div>
                      <div>  corosync: active/enabled</div>
                      <div>  pacemaker: active/enabled</div>
                      <div>  pcsd: active/enabled</div>
                    </div>
                    <div><br>
                    </div>
                    <div>Things stay that way indefinitely, until I set
                      stonith-enabled to false - at which point all the
                      resources above get started immediately.</div>
                    <div><br>
                    </div>
                    <div>Obviously, I am missing something big here.
                      But, what is it?</div>
                    <div><br>
                    </div>
                  </div>
                </div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019
                  at 2:59 PM Adam Budziński <<a href="mailto:budzinski.adam@gmail.com" target="_blank">budzinski.adam@gmail.com</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                  <div dir="auto">You did not configure any fencing
                    device.</div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019
                      at 10:51 PM JCA <<a href="mailto:1.41421@gmail.com" target="_blank">1.41421@gmail.com</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                      <div dir="ltr">
                        <div dir="ltr">I am trying to get fencing
                          working, as described in the "Clusters from
                          Scratch" guide, and I am stymied at the get-go :-(
                          <div><br>
                          </div>
                          <div>The document mentions a property named
                            stonith-enabled. When I was trying to get my
                            first cluster going, I noticed that my
                            resources would start only when this
                            property is set to false, by means of </div>
                          <div><br>
                          </div>
                          <div>    # pcs property set
                            stonith-enabled=false<br>
                          </div>
                          <div><br>
                          </div>
                          <div>Otherwise, all the resources remain
                            stopped.</div>
                          <div><br>
                          </div>
                          <div>I created a fencing resource for the
                            partition that I am sharing across the
                            nodes, by means of DRBD. This works fine,
                            but I still have the same problem as above:
                            when stonith-enabled is set to true,
                            all the resources get stopped, and remain in
                            that state.</div>
                          <div><br>
                          </div>
                          <div>I am very confused here. Can anybody
                            point me in the right direction out of this
                            conundrum?</div>
                          <div><br>
                          </div>
                          <div><br>
                          </div>
                          <div><br>
                          </div>
                        </div>
                      </div>
                      _______________________________________________<br>
                      Manage your subscription:<br>
                      <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
                      <br>
                      ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer noreferrer" target="_blank">https://www.clusterlabs.org/</a></blockquote>
                  </div>
                  </blockquote>
              </div>
              <br>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </div>

</blockquote></div>