[ClusterLabs Developers] Question about resource level fencing with Pacemaker

Tue Aug 25 12:35:55 EDT 2020

On Mon, 2020-08-24 at 13:06 -0700, Philippe M Stedman wrote:
> Hi,
> 
> I have a question about resource level fencing. I read about it in
> the following document:
> https://clusterlabs.org/pacemaker/doc/crm_fencing.html

FYI that document is very old and really needs updating, though the
basic info is still relevant. The fencing chapter of "Pacemaker
Explained" is up to date:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#idm47160758614368

crm_fencing.html is out of date in these respects:

* "stonith_admin" is the current equivalent of the long-gone "stonith"
command

* stonith resources should not be cloned (stonith devices are now
accessible from all nodes without being cloned)

* It only mentions heartbeat-style ("external") fence agents; pacemaker
now supports RHCS-style ("fence_*") as well

* The resource-level fencing it describes is still possible but
generally implemented differently

Ideally I'd like to drop crm_fencing.html, merge any useful info from
it into the fencing chapter, and add something about watchdog/disk-based
fencing with sbd, which neither document currently discusses.

> It has the following excerpt about resource level fencing:
> There are two kinds of fencing: resource level and node level.
> Using the resource level fencing the cluster can make sure that a
> node cannot access one or more resources. One typical example is a
> SAN, where a fencing operation changes rules on a SAN switch to deny
> access from a node.
> The resource level fencing may be achieved using normal resources on
> which the resource we want to protect would depend. Such a resource
> would simply refuse to start on this node and therefore resources
> which depend on it will be unrunnable on the same node as well.
> 
> Based on this description, it is still unclear to me how I would go
> about implementing resource level fencing in my own cluster. I tried
> researching the topic for examples, but could not find much on the
> subject.
> 
> 
> Could you please shed some light on how I could implement resource
> level fencing in my own cluster? And how does it differ from
> traditional STONITH device fencing? If you have an example of resource
> level fencing, that would be very helpful.

There are two ways to implement resource-level fencing as conceived by
that document.

The first, the document's example of cutting off SAN access, is no
longer considered resource-level fencing. It is done with normal
node-level fence agents (collectively described as "fabric fencing").
For example, there is fence_scsi to cut off disk access, and
fence_snmp to cut off access to a network switch. These are configured
like any other node-level fence device.
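
To illustrate the idea behind fabric fencing, here is a toy Python
model. It is only a sketch, not how fence_scsi or any real agent is
implemented; the SharedDisk class and the node names are made up for
the example. The point is that the fenced node keeps running but loses
access to the shared resource:

```python
class SharedDisk:
    """Toy stand-in for shared storage that only accepts I/O from registered nodes."""

    def __init__(self, nodes):
        # Every cluster node starts out registered (allowed to do I/O).
        self.registered = set(nodes)
        self.blocks = []

    def write(self, node, data):
        if node not in self.registered:
            raise PermissionError(f"{node} has been fenced off from the disk")
        self.blocks.append((node, data))

    def fence(self, node):
        # Fabric fencing: revoke access instead of powering the node off.
        self.registered.discard(node)

    def unfence(self, node):
        # Access has to be restored explicitly before the node can do I/O again.
        self.registered.add(node)


disk = SharedDisk(["node1", "node2"])
disk.write("node1", "a")
disk.fence("node2")
try:
    disk.write("node2", "b")
    fenced_write_succeeded = True
except PermissionError:
    fenced_write_succeeded = False
```

In this model the fenced node's write is rejected while node1 carries
on unaffected, which is the guarantee the cluster needs before it
recovers resources elsewhere.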

The second approach is to use a special resource agent that effectively
gates access to some essential resource, and colocate other resources
with that one. For example, there is an ocf:heartbeat:sg_persist
resource agent whose master role holds a scsi reservation. What
matters most is that only one node at a time can successfully do this.
The other resources can be colocated with the master role of that
resource to ensure they can't run unless scsi access is available.
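
The gating logic can be sketched with another toy Python model. Again
this is an illustration under assumptions, not Pacemaker's scheduler or
the sg_persist agent itself; ScsiReservation, allowed_nodes and the
node names are hypothetical. Only one node can hold the reservation,
and dependent resources may run only where it is held:

```python
class ScsiReservation:
    """Toy model of an exclusive reservation: at most one holder at a time."""

    def __init__(self):
        self.holder = None

    def acquire(self, node):
        # Succeeds only if the reservation is free or already held by this node.
        if self.holder in (None, node):
            self.holder = node
            return True
        return False


def allowed_nodes(reservation, nodes):
    """Nodes where a resource colocated with the master role may run."""
    return [n for n in nodes if reservation.holder == n]


nodes = ["node1", "node2"]
res = ScsiReservation()
first = res.acquire("node1")    # node1 takes the reservation (master role)
second = res.acquire("node2")   # node2 is refused while node1 holds it
placement = allowed_nodes(res, nodes)
```

If no node holds the reservation, allowed_nodes() returns an empty
list, so in this model the dependent resources cannot run anywhere.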

In a modern cluster, the first approach is preferred. Even if the
second approach is used, some sort of node-level fencing should be
configured as well (otherwise the second approach relies on the cluster
still functioning properly on the very node that is supposed to stop
its resources).

> Thanks,
> 
> Phil Stedman
> Db2 High Availability Development and Support
> Email: pmstedma at us.ibm.com
-- 
Ken Gaillot <kgaillot at redhat.com>