<div dir="ltr">Thanks. I most assuredly will, but first I have to run some experiments, to get a feeling for it.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019 at 3:56 PM digimer <<a href="mailto:lists@alteeve.ca">lists@alteeve.ca</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Happy to help you understand, just keep asking questions. :)</p>
    <p>The point can be explained this way:</p>
<p>* If two nodes can work without coordination, you don't need a
cluster, just run your services everywhere. If that is not the
case, then you require coordination. Fencing ensures that a node
that has entered an unknown state can be forced into a known state
(off). In this way, no action will be taken by a node unless the
peer can be informed, or the peer is gone.<br>
</p>
    <p>How a node is forced into a known state depends on
      the hardware (or infrastructure) you have in your particular
      setup. So perhaps explain what your nodes are built on, and we
      can assist with more specific details.<br>
</p>
<p>digimer<br>
</p>
<div class="gmail-m_3193709777170650094moz-cite-prefix">On 2019-04-17 5:46 p.m., JCA wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Thanks. This implies that I officially do not
understand what it is that fencing can do for me, in my simple
cluster. Back to the drawing board.</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019 at 3:33
PM digimer <<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Fencing requires some mechanism, outside the nodes
themselves, that can terminate the nodes. Typically, IPMI
(iLO, iRMC, RSA, DRAC, etc) is used for this.
Alternatively, switched PDUs are common. If you don't have
these but do have a watchdog timer on your nodes, SBD
(storage-based death) can work.</p>
            <p>You can use 'fence_&lt;device&gt; &lt;options&gt; -o
              status' at the command line to figure out what will
              work with your hardware. Once you can call 'fence_foo
              ... -o status' and get the status of each node,
              translating that into a pacemaker configuration is pretty
              simple. That's when you enable stonith. <br>
</p>
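            <p>As a rough sketch of that sequence, assuming IPMI-based fencing: the script below only prints the commands unless RUN=1 is set. The BMC address, user name, and password are made-up placeholders, and 'fence_ipmilan' is just one possible agent; substitute whatever matches your hardware.</p>

```shell
#!/usr/bin/env bash
# Sketch: verify a fence agent by hand, then register it with pacemaker.
# ASSUMPTIONS (not from the thread): IPMI fencing via fence_ipmilan; the
# BMC address and credentials below are hypothetical placeholders.
set -u

# Print each command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

NODE="one"              # node name as shown by 'pcs status'
BMC_IP="192.0.2.11"     # hypothetical BMC address for that node
BMC_USER="admin"        # hypothetical
BMC_PASS="secret"       # hypothetical

# Step 1: ask the fence device for the node's power status.
run fence_ipmilan -a "$BMC_IP" -l "$BMC_USER" -p "$BMC_PASS" -o status

# Step 2: once step 1 reports a status, give pacemaker the same details.
run pcs stonith create "fence_${NODE}" fence_ipmilan \
    pcmk_host_list="$NODE" ipaddr="$BMC_IP" login="$BMC_USER" passwd="$BMC_PASS"

# Step 3: only now turn stonith on.
run pcs property set stonith-enabled=true
```

            <p>Repeat steps 1 and 2 for each node, each with its own device address. Parameter names can differ between fence-agents versions, so check 'pcs stonith describe fence_ipmilan' before relying on the ones shown here.</p>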
            <p>Once stonith is set up and working in pacemaker (i.e. you
              can crash a node and the peer reboots it), go to DRBD
              and set 'fencing resource-and-stonith;' (this tells
              DRBD to block on communication failure with the peer and
              request a fence), then set up the 'fence-peer' handler
              ('/path/to/crm-fence-peer.sh') and the matching unfence
              handler ('/path/to/crm-unfence-peer.sh'). I am going from
              memory, so check the man page to verify the syntax. <br>
</p>
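            <p>For illustration, here is roughly what the DRBD side might look like in DRBD 8.4 syntax, printed from a small shell function so it can be redirected into a file. The resource name 'r0' and the /usr/lib/drbd/ paths are assumptions; in DRBD 9 the 'fencing' option lives in the 'net' section and the unfence handler has a different name, so verify against the drbd.conf man page for your version.</p>

```shell
#!/usr/bin/env bash
# Sketch of a DRBD 8.4-style fencing stanza.
# ASSUMPTIONS: resource name "r0" and the script paths are placeholders;
# merge these settings into your existing resource definition.
drbd_fencing_snippet() {
    cat <<'EOF'
resource r0 {
  disk {
    # Block I/O on loss of the peer and ask the cluster to fence it.
    fencing resource-and-stonith;
  }
  handlers {
    # Called when DRBD wants the peer fenced.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Called after resync to lift the constraint again.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
EOF
}

drbd_fencing_snippet
```

            <p>After editing the real resource file on both nodes, 'drbdadm adjust &lt;resource&gt;' applies the change.</p>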
            <p>With all this done, if either pacemaker/corosync or DRBD
              loses contact with the peer, it will block and fence.
              Only after the peer has been confirmed terminated will IO
              resume. This way, split-brains become effectively
              impossible.</p>
<p>digimer<br>
</p>
<div class="gmail-m_3193709777170650094gmail-m_-5179552301465381124moz-cite-prefix">On
2019-04-17 5:17 p.m., JCA wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Here is what I did:
<div><br>
</div>
<div>
<div># pcs stonith create disk_fencing fence_scsi
pcmk_host_list="one two"
pcmk_monitor_action="metadata"
pcmk_reboot_action="off"
devices="/dev/disk/by-id/ata-VBOX_HARDDISK_VBaaa429e4-514e8ecb"
meta provides="unfencing"</div>
</div>
<div><br>
</div>
<div>where ata-VBOX-... corresponds to the device
where I have the partition that is shared between
both nodes in my cluster. The command completes
without any errors (that I can see) and after that
I have</div>
<div><br>
</div>
<div>
<div># pcs status</div>
<div>Cluster name: ClusterOne</div>
<div>Stack: corosync</div>
<div>Current DC: one (version
1.1.19-8.el7_6.4-c3c624ea3d) - partition with
quorum</div>
<div>Last updated: Wed Apr 17 14:35:25 2019</div>
<div>Last change: Wed Apr 17 14:11:14 2019 by root
via cibadmin on one</div>
<div><br>
</div>
<div>2 nodes configured</div>
<div>5 resources configured</div>
<div><br>
</div>
<div>Online: [ one two ]</div>
<div><br>
</div>
<div>Full list of resources:</div>
<div><br>
</div>
<div> MyCluster<span style="white-space:pre-wrap"> </span>(ocf::myapp:myapp-script):<span style="white-space:pre-wrap"> </span>Stopped</div>
<div> Master/Slave Set: DrbdDataClone [DrbdData]</div>
<div> Stopped: [ one two ]</div>
<div> DrbdFS<span style="white-space:pre-wrap"> </span>(ocf::heartbeat:Filesystem):<span style="white-space:pre-wrap"> </span>Stopped</div>
<div> disk_fencing <span style="white-space:pre-wrap"> </span>(stonith:fence_scsi):<span style="white-space:pre-wrap"> </span>Stopped</div>
<div><br>
</div>
<div>Daemon Status:</div>
<div> corosync: active/enabled</div>
<div> pacemaker: active/enabled</div>
<div> pcsd: active/enabled</div>
</div>
<div><br>
</div>
<div>Things stay that way indefinitely, until I set
stonith-enabled to false - at which point all the
resources above get started immediately.</div>
<div><br>
</div>
<div>Obviously, I am missing something big here.
But, what is it?</div>
<div><br>
</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019
at 2:59 PM Adam Budziński <<a href="mailto:budzinski.adam@gmail.com" target="_blank">budzinski.adam@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="auto">You did not configure any fencing
device.</div>
<br>
<div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019 at
                      22:51, JCA <<a href="mailto:1.41421@gmail.com" target="_blank">1.41421@gmail.com</a>>
                      wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
                        <div dir="ltr">I am trying to get fencing
                          working, as described in the "Clusters from
                          Scratch" guide, and I am stymied at the get-go :-(
<div><br>
</div>
<div>The document mentions a property named
stonith-enabled. When I was trying to get my
first cluster going, I noticed that my
resources would start only when this
property is set to false, by means of </div>
<div><br>
</div>
<div> # pcs property set
stonith-enabled=false<br>
</div>
<div><br>
</div>
<div>Otherwise, all the resources remain
stopped.</div>
<div><br>
</div>
<div>I created a fencing resource for the
                            partition that I am sharing across the
nodes, by means of DRBD. This works fine -
but I still have the same problem as above -
i.e. when stonith-enabled is set to true,
all the resources get stopped, and remain in
that state.</div>
<div><br>
</div>
<div>I am very confused here. Can anybody
point me in the right direction out of this
conundrum?</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
</div>
_______________________________________________<br>
Manage your subscription:<br>
<a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
<br>
ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer noreferrer" target="_blank">https://www.clusterlabs.org/</a></blockquote>
</div>
                </blockquote>
</div>
<br>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote></div>