[ClusterLabs] Which node initiates fencing?

Mon Jul 20 01:15:43 CEST 2015

> On 25 Jun 2015, at 9:39 am, Jonathan Vargas <jonathan.vargas at alkaid.cr> wrote:
> 
> Thanks Ken.
> 
> It's weird, because we ran tests and that did not happen.
> 
> There is a node (named Z) without any stonith/sbd resources assigned, but it was the node that sent the fencing request to a crashed node (X).
> 
> But this error appeared in its logs: "No route to host".
> 
> It's obvious for us that if SBD isn't running on Z,

SBD must be running on every node.
It should be started outside of the cluster.

The fencing agent (not needed on RHEL) only serves to a) do a health check of the “device” and b) trigger fencing when necessary.

> and there is no network access to that crashed node (X), then based on your answer, node Y, which really had access to X via SBD, should have initiated the fencing request. But this did not happen.
> 
> In addition, I wonder if I could tell the cluster to avoid sending fencing requests from specific nodes or, conversely, tell the cluster which nodes are authorized to send fencing requests.
> 
> Any idea?
> 
> On Jun 24, 2015 1:56 PM, "Ken Gaillot" <kgaillot at redhat.com> wrote:
> On 06/24/2015 12:20 PM, Jonathan Vargas wrote:
> > Hi there,
> >
> > We have a 3-node cluster for OCFS2.
> >
> > When one of the nodes fails, it should be fenced. I noticed that sometimes one
> > of them is the one that sends the fencing message to the failing node, and
> > sometimes it's another.
> >
> > How does the cluster decide which of the remaining active nodes will be the
> > one to tell the failed node to fence itself?
> >
> > Thanks.
> 
> Fencing resources are assigned to a node like any other resource, even
> though they don't really "run" anywhere. Assuming you've configured a
> recurring monitor operation on the resource, that node will monitor the
> device to ensure it's available.
> 
> Because that node has "verified" (monitored) access to the device, the
> cluster will prefer that node to execute the fencing if possible. So
> it's just whatever node happened to be assigned the fencing resource.
> 
> If for any reason the node with verified access can't do it, the cluster
> will fall back to any other capable node.
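
The preference order Ken describes can be sketched roughly as follows. This is
only an illustration of the rule as stated in his reply, not Pacemaker's actual
code: the Node class, its fields and pick_fencing_node() are hypothetical
names, and the sketch assumes the target node is never asked to fence itself.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # All fields are hypothetical, for illustration only.
    name: str
    online: bool                      # node is an active cluster member
    can_fence: bool                   # node is able to use the fence device
    runs_fence_monitor: bool = False  # fencing resource is assigned here


def pick_fencing_node(nodes, target):
    """Return the name of the node that should fence `target`, or None."""
    # Assumption: only online, capable nodes other than the target qualify.
    candidates = [n for n in nodes
                  if n.online and n.can_fence and n.name != target]
    # Prefer the node with "verified" (monitored) access to the device.
    for n in candidates:
        if n.runs_fence_monitor:
            return n.name
    # Otherwise fall back to any other capable node.
    return candidates[0].name if candidates else None


cluster = [
    Node("X", online=False, can_fence=True),
    Node("Y", online=True, can_fence=True, runs_fence_monitor=True),
    Node("Z", online=True, can_fence=True),
]
print(pick_fencing_node(cluster, "X"))
```

With the three nodes above, Y is chosen because the fencing resource is
assigned to it; if Y were offline, the sketch would fall back to Z.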
> 
> > *Jonathan Vargas Rodríguez*
> > Founder and Solution Engineer
> > Alkaid <https://alkaid.cr/> | Open Source Software
> >
> > * mail *  jonathan.vargas at alkaid.cr
> >  telf   +506 4001 6259 Ext. 01
> >  mobi   +506 4001 6259 Ext. 51
> >
> > <http://linkedin.com/in/jonathanvargas/>
> > <https://plus.google.com/+JonathanVargas/>
> > <https://www.facebook.com/alkaid.cr>       <https://twitter.com/alkaidcr>
> 
> 
> 
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org