[ClusterLabs] Which node initiates fencing?

Wed Jul 1 03:33:48 UTC 2015

Thanks Ken,

We will do our tests.

*Jonathan Vargas Rodríguez*
Founder and Solution Engineer
Alkaid <https://alkaid.cr/> | Open Source Software

* mail *  jonathan.vargas at alkaid.cr
 telf   +506 4001 6259 Ext. 01
 mobi   +506 4001 6259 Ext. 51

<http://linkedin.com/in/jonathanvargas/>
<https://plus.google.com/+JonathanVargas/>
<https://www.facebook.com/alkaid.cr>       <https://twitter.com/alkaidcr>

2015-06-25 8:57 GMT-06:00 Ken Gaillot <kgaillot at redhat.com>:

> On 06/24/2015 06:39 PM, Jonathan Vargas wrote:
> > Thanks Ken.
> >
> > It's weird, because we ran tests and that did not happen.
> >
> > There is a node (named Z) with no stonith/sbd resources assigned at all,
> > but it was the node that sent the fencing request to a crashed node (X).
> >
> > But this error appeared in its logs: "No route to host".
> >
> > It seems obvious to us that if SBD isn't running on Z, and there is no network
> > access to the crashed node (X), then based on your answer, node Y, which
> > really had access to X via SBD, should have initiated the fencing request. But
> > this did not happen.
> >
> > In addition, I wonder if I could tell the cluster to avoid
> > sending fencing requests from specific nodes, or conversely, tell
> > the cluster which nodes are authorized to send fencing requests.
> >
> > Any idea?
>
> Yes, that's exactly what you have to do.
>
> By default, a cluster will be "opt-out" -- any resource can run on any
> node unless you tell it otherwise. (You can change that to "opt-in", but
> for simplicity I'll assume you're using the default.)
>
> The node that "runs" the fencing resource will monitor it, so if only
> certain nodes can monitor the device, you need location constraints. How
> you configure that depends on what tools you are using (pcs, crm or
> low-level), but it's simple: you just say "this resource has this score
> on this node". A score of -INFINITY means "never run this resource on
> this node".
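
To make the scoring idea concrete, here is a minimal sketch in Python. It is a toy model of opt-out placement, not Pacemaker's actual scheduler; the resource name `fence-sbd` and the `allowed_nodes` helper are made up for illustration. With pcs, the equivalent ban would be something along the lines of `pcs constraint location <fence-resource> avoids <node>`, but check the documentation for the tool and version you are using.

```python
# Toy model of opt-out location scores (an illustration, not Pacemaker code).
NEG_INF = float("-inf")  # stands in for a score of -INFINITY


def allowed_nodes(resource, nodes, scores):
    """Return the nodes on which `resource` may be placed.

    `scores` maps (resource, node) -> score. In an opt-out cluster a
    missing entry counts as 0, so the node is allowed by default.
    """
    return [n for n in nodes if scores.get((resource, n), 0) > NEG_INF]


nodes = ["X", "Y", "Z"]
# Hypothetical setup: Z cannot reach the SBD device, so ban the resource there.
scores = {("fence-sbd", "Z"): NEG_INF}
print(allowed_nodes("fence-sbd", nodes, scores))
```

In this model, Z is excluded only because of the explicit -INFINITY entry; without it, all three nodes would remain eligible.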
>
> For fencing resources, the cluster also needs to know which hosts the
> device can fence. By default the cluster will ask the fence agent by
> running its "list" command. If that's not sufficient, you can configure
> a static list of hosts that the device can fence. For details see:
>
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_special_treatment_of_stonith_resources
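
As a rough sketch of the "static list versus ask the agent" choice described above, assuming a simplified model: each device either has a static host list or is queried through a callable that stands in for the agent's "list" command. The device names and the `devices_for_target` helper are hypothetical. In Pacemaker the static list is configured with the `pcmk_host_list` parameter covered in the linked section.

```python
def devices_for_target(target, devices, query_list):
    """Return the fence devices that claim to be able to fence `target`.

    `devices` maps device name -> static host list, or None when the
    cluster should ask the agent instead. `query_list(name)` stands in
    for running the agent's "list" command and returns a list of hosts.
    """
    usable = []
    for name, static_hosts in devices.items():
        # Prefer the static list when one is configured; otherwise query.
        hosts = static_hosts if static_hosts is not None else query_list(name)
        if target in hosts:
            usable.append(name)
    return usable


# Hypothetical devices: fence-a has a static list, fence-b is queried.
devices = {"fence-a": ["X", "Y"], "fence-b": None}
print(devices_for_target("X", devices, lambda name: ["Z"]))
print(devices_for_target("Z", devices, lambda name: ["Z"]))
```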
>
>
> > On Jun 24, 2015 1:56 PM, "Ken Gaillot" <kgaillot at redhat.com> wrote:
> >
> >> On 06/24/2015 12:20 PM, Jonathan Vargas wrote:
> >>> Hi there,
> >>>
> >>> We have a 3-node cluster for OCFS2.
> >>>
> >>> When one of the nodes fails, it should be fenced. I noticed that sometimes one of
> >>> them is the one that sends the fencing message to the failing node, and
> >>> sometimes it's another.
> >>>
> >>> How does the cluster decide which of the remaining active nodes will be the one
> >>> to tell the failed node to fence itself?
> >>>
> >>> Thanks.
> >>
> >> Fencing resources are assigned to a node like any other resource, even
> >> though they don't really "run" anywhere. Assuming you've configured a
> >> recurring monitor operation on the resource, that node will monitor the
> >> device to ensure it's available.
> >>
> >> Because that node has "verified" (monitored) access to the device, the
> >> cluster will prefer that node to execute the fencing if possible. So
> >> it's just whatever node happened to be assigned the fencing resource.
> >>
> >> If for any reason the node with verified access can't do it, the cluster
> >> will fall back to any other capable node.
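
The preference and fallback described above can be sketched as follows. This is only an illustrative model of the rule as stated in this thread, not the real implementation; `pick_fencing_node` and its parameters are hypothetical names, and the order in which fallback nodes are tried is an assumption.

```python
def pick_fencing_node(target, online, capable, monitoring_node):
    """Choose which node executes the fencing of `target`.

    Prefer the node that has been monitoring the fence device (it has
    "verified" access). If that node cannot do it, fall back to any
    other online node that is capable of using the device.
    """
    candidates = [n for n in online if n != target and n in capable]
    if monitoring_node in candidates:
        return monitoring_node
    # Fallback order here is simply list order; that is an assumption.
    return candidates[0] if candidates else None


# X has crashed; Y and Z are still online.
print(pick_fencing_node("X", ["Y", "Z"], {"Y", "Z"}, "Z"))  # monitor Z is usable
print(pick_fencing_node("X", ["Y", "Z"], {"Y"}, "Z"))       # falls back to Y
```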
> >>