[ClusterLabs Developers] How to implement fencing agent with no associated hardware device with Pacemaker?

Philippe M Stedman pmstedma at us.ibm.com
Thu Jul 30 19:17:17 UTC 2020


Thanks Gerry.

Hi Reid,

The shared storage solution we are using has clustering capabilities of its
own built in and can remotely fence off the lost node. All we need to do is
run the command to expel/fence the lost node as part of our own custom
fencing agent on the surviving node.

FYI, the shared storage solution I am referring to here is IBM Spectrum
Scale.
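
For illustration, here is a minimal sketch of what such an agent could look
like. It assumes the usual fence agent convention that Pacemaker's fencer
writes name=value options to the agent's stdin, and that the target node
arrives as "port", "plug", or "nodename". The agent name, the metadata
contents, and the EXPEL_COMMAND path are hypothetical placeholders, not the
real Spectrum Scale command line, and treating "reboot" the same as "off" is
a simplification of this sketch.

```python
#!/usr/bin/env python3
"""Sketch of a storage-based fence agent (hypothetical, not an official agent)."""
import subprocess
import sys

# Hypothetical placeholder: substitute the storage product's real expel command.
EXPEL_COMMAND = ["/usr/local/bin/expel-node"]

# Minimal metadata so the cluster can discover parameters and actions (assumed layout).
METADATA = """<?xml version="1.0" ?>
<resource-agent name="fence_storage_expel" shortdesc="Expel a node from shared storage (sketch)">
  <parameters>
    <parameter name="action"><content type="string" default="off"/></parameter>
    <parameter name="port"><content type="string"/></parameter>
  </parameters>
  <actions>
    <action name="off"/>
    <action name="reboot"/>
    <action name="monitor"/>
    <action name="metadata"/>
  </actions>
</resource-agent>
"""


def parse_options(lines):
    """Parse name=value lines, skipping blanks, comments, and malformed lines."""
    options = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[name.strip()] = value.strip()
    return options


def run_action(options, runner=subprocess.run):
    """Dispatch on the requested action; return 0 on success, 1 on failure."""
    action = options.get("action", "off")
    if action == "metadata":
        print(METADATA)
        return 0
    if action == "monitor":
        # A real agent should verify here that the storage cluster is reachable.
        return 0
    if action in ("off", "reboot"):
        target = options.get("port") or options.get("plug") or options.get("nodename")
        if not target:
            return 1
        # Runs on the surviving node; never logs in to the failed node.
        result = runner(EXPEL_COMMAND + [target])
        return 0 if result.returncode == 0 else 1
    return 1


def main(stdin=sys.stdin):
    """Entry point: read options from stdin and run the requested action."""
    return run_action(parse_options(stdin))
```

To install this as an executable agent, the file would end with
sys.exit(main()). The key property is that the expel command runs entirely
on the surviving node, so fencing succeeds only if the storage cluster
confirms the expel, regardless of whether the lost node is responsive.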

Thanks,

Phil Stedman
Db2 High Availability Development and Support
Email: pmstedma at us.ibm.com



From:	Gerry R Sommerville/Markham/IBM
To:	nwahl at redhat.com
Cc:	developers at clusterlabs.org, Toby Haynes/Toronto/IBM at IBMCA, Alan
            Y Lee/Toronto/IBM at IBMCA, Philippe M Stedman/Silicon
            Valley/IBM at IBMUS
Date:	07/30/2020 11:55 AM
Subject:	Re: [EXTERNAL] Re: [ClusterLabs Developers] How to implement
            fencing agent with no associated hardware device with
            Pacemaker?



Seems like we lost Phil in the first reply.... Adding him back.

Gerry Sommerville
Db2 Development, pureScale Domain
E-mail: gerry at ca.ibm.com


 ----- Original message -----
 From: Reid Wahl <nwahl at redhat.com>
 To: developers at clusterlabs.org
 Cc: Toby Haynes <thaynes at ca.ibm.com>, Gerry R Sommerville
 <gerry at ca.ibm.com>, Alan Y Lee <ykalee at ca.ibm.com>
 Subject: [EXTERNAL] Re: [ClusterLabs Developers] How to implement fencing
 agent with no associated hardware device with Pacemaker?
 Date: Wed, Jul 29, 2020 11:42 PM

 I didn't see the phrase "how to develop" until after I sent the previous
 message. What is the reason for needing to develop a custom fencing agent?
 An already-built one might save you some work.

 Basically, you need some reliable method to cut off an unhealthy node's
 access to shared storage, without depending on that node being responsive.
 So for example, anything that involves logging into the failed node is
 unreliable.

 On Wed, Jul 29, 2020 at 8:40 PM Reid Wahl <nwahl at redhat.com> wrote:
  If you have a hardware watchdog timer, then sbd is a good option.
  With shared storage, you can also implement fence_sbd.

  KVM virtual machines also offer an emulated hardware watchdog. I'm not
  sure whether that would fit your criteria or not -- it depends on whether
  you're only excluding a management interface like an iLO/IMM, or whether
  you're also excluding a watchdog timer.

  If you can't use sbd or conventional power fencing (e.g., fence_ipmilan),
  then you may be able to use fence_scsi or fence_mpath since you have
  shared storage.

  What hardware or virtualization platform are you running on, and is there
  a particular reason you don't want to associate fencing with a hardware
  device?

  On Wed, Jul 29, 2020 at 8:34 PM Philippe M Stedman <pmstedma at us.ibm.com>
  wrote:
    Hi ClusterLabs developers,

    I am looking into how to develop a fencing agent for Pacemaker that is
    not associated with any underlying hardware device. In our case we have
    two servers (we will expand to more in the future) which have access to
    shared storage. When one of the two nodes fails, we expect the
    surviving node to invoke our user-defined fencing agent and run a
    series of commands which will "expel" the lost host from accessing
    shared storage.

    Do you have any advice on how to go about implementing such a solution?
    All the examples I can find online revolve around using some sort of
    underlying hardware device to implement fencing.

    Help is greatly appreciated.

    Thanks,

    Phil Stedman
    Db2 High Availability Development and Support
    Email: pmstedma at us.ibm.com


  --
  Regards,

  Reid Wahl, RHCA
  Software Maintenance Engineer, Red Hat
  CEE - Platform Support Delivery - ClusterHA


 --
 Regards,

 Reid Wahl, RHCA
 Software Maintenance Engineer, Red Hat
 CEE - Platform Support Delivery - ClusterHA

