[ClusterLabs] fencing configuration

Tue Jun 7 04:26:47 EDT 2022

Hi, I need some help with the correct fencing configuration in a 5-node cluster.

The specific issue is that there are 3 rooms and, in addition to the single-node failure scenario, a whole room can fail too (for example, in case of a room power failure or a room network failure).

room0: [ node0 ]
roomA: [ node1, node2 ]
roomB: [ node3, node4 ]

- ipmi board is present on each node
- watchdog timer is available
- shared storage is not available

Please advise what would be a proper fencing configuration in this case.

The intention is to configure ipmi fencing (using the "fence_idrac" agent) plus the watchdog timer as a fallback. In other words, I would like to tell pacemaker: "If fencing is required, try to fence via ipmi. If ipmi fencing fails, then after some timeout assume the watchdog has rebooted the node, so it is safe to proceed as if the (self-)fencing had succeeded."

From the documentation it is not clear to me whether this would be:
a) multi-level fencing, where ipmi would be the first level and sbd the second level (where sbd always succeeds)
b) or single-level fencing with a timeout
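
For illustration only, option a) would presumably be expressed as a fencing topology, with the ipmi device registered as level 1 for each node. The sketch below only prints the `pcs stonith level add` commands instead of running them. The node names are taken from the room layout above, the `fence_ipmi_<node>` ids are assumed to match the stonith resources created further down, and `level_cmd` is a hypothetical helper, not part of pcs. What, if anything, could be registered as a watchdog-based level 2 is exactly the open question, so it is left out.

```shell
#!/bin/sh
# Dry run: print the level-1 topology commands, do not execute them.
# level_cmd is a hypothetical helper, not part of pcs.
level_cmd() {
    # $1 = node name; a stonith resource named fence_ipmi_<node> is assumed to exist
    printf 'pcs stonith level add 1 %s fence_ipmi_%s\n' "$1" "$1"
}

for name in node0 node1 node2 node3 node4; do
    level_cmd "$name"
done
```

The printed lines can be reviewed and then run with sudo on one cluster node.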

I have tried to follow option b): create a stonith resource for each node and set stonith-watchdog-timeout, like this:

---
# for each node... [0..4]
export name=...
export ip=...
export password=...
sudo pcs stonith create "fence_ipmi_$name" fence_idrac \
    lanplus=1 ip="$ip" \
    username="admin"  password="$password" \
    pcmk_host_list="$name" op monitor interval=10m timeout=10s

sudo pcs property set stonith-watchdog-timeout=20

# start dummy resource
sudo pcs resource create dummy ocf:heartbeat:Dummy op monitor interval=30s
---

I am not sure whether additional location constraints have to be specified for the stonith resources. For example, I have noticed that pacemaker will start a stonith resource on the same node as its fencing target. Is this OK?

Should there be any location constraints regarding fencing and rooms?
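
To make the question concrete, a room-aware constraint could look roughly like the sketch below, which keeps each fence device off every node in its target's room. It only prints the `pcs constraint location ... avoids ...` commands. The room mapping is copied from the layout above, while `room_of` and `avoid_cmd` are hypothetical helpers. Whether such constraints are needed at all is part of what is being asked.

```shell
#!/bin/sh
# Dry run: print location constraints, do not execute them.
# room_of and avoid_cmd are hypothetical helpers, not part of pcs.
room_of() {
    # Print all nodes that share a room with node $1 (including $1 itself).
    case "$1" in
        node0)       echo "node0" ;;
        node1|node2) echo "node1 node2" ;;
        node3|node4) echo "node3 node4" ;;
    esac
}

avoid_cmd() {
    # Keep the device that fences $1 away from every node in $1's room.
    printf 'pcs constraint location fence_ipmi_%s avoids %s\n' "$1" "$(room_of "$1")"
}

for name in node0 node1 node2 node3 node4; do
    avoid_cmd "$name"
done
```

A simpler variant would avoid only the target node itself rather than its whole room.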

'sbd' is running, properties are as follows:

---
$ sudo pcs property show
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: debian
 dc-version: 2.0.3-4b1f869f0f
 have-watchdog: true
 last-lrm-refresh: 1654583431
 stonith-enabled: true
 stonith-watchdog-timeout: 20
---

Ipmi fencing (when the ipmi connection is alive) works correctly for each node. The watchdog timer also seems to be working correctly. The problem is that the dummy resource is not restarted as expected.

In the test scenario, the dummy resource was running on node1. I simulated a node failure by unplugging both the ipmi AND the host network interfaces from node1. As a result, node1 was rebooted (by the watchdog), but the rest of the pacemaker cluster was unable to fence node1 (this is expected, since node1's ipmi is not accessible). The problem is that the dummy resource remains stopped and node1 remains unclean. I was expecting stonith-watchdog-timeout to kick in, so that the dummy resource would be restarted on some other node in the partition that has quorum.

Obviously there is something wrong with my configuration, since this seems to be a reasonably simple scenario for pacemaker. I would appreciate your help.

regards,
Zoran