[ClusterLabs] Question about two level STONITH/fencing

Anton Gavriliuk Anton.Gavriliuk at hpe.ua
Mon Feb 2 11:29:45 UTC 2026


Hello

There is a two-node (HPE DL345 Gen12 servers), shared-nothing, DRBD-based synchronous-replication (Protocol C) distributed active/standby Pacemaker storage metro-cluster. It is configured with qdevice, heuristics (parallel fping) and fencing - fence_ipmilan plus diskless sbd (hpwdt, /dev/watchdog). All cluster resources are configured to always run together on the same node.

The two storage cluster nodes and the qdevice host run Rocky Linux 10.1
Pacemaker version 3.0.1
Corosync version 3.1.9
DRBD version 9.3.0
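
For reference, sbd runs in diskless (watchdog-only) mode; the relevant bits look roughly like this (timeout values are only indicative, not a recommendation):

# /etc/sysconfig/sbd -- no SBD_DEVICE, watchdog only
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_WATCHDOG_TIMEOUT=10
SBD_STARTMODE=always

# cluster properties; stonith-watchdog-timeout matches the 25s self-fence wait in the log below
pcs property set stonith-enabled=true
pcs property set stonith-watchdog-timeout=25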

So, the question is: what is the most correct way of implementing STONITH/fencing with fence_ipmilan + diskless sbd (hpwdt, /dev/watchdog)?
I'm not sure about a two-level fencing topology, because diskless sbd is not an external fence agent/resource...
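
What I had in mind, if I read the Pacemaker docs correctly that "watchdog" may be listed as a device in a fencing topology since 2.1, is something like this (only a guess at the right syntax):

pcs stonith level add 1 memverge ipmi-fence-memverge
pcs stonith level add 2 memverge watchdog
pcs stonith level add 1 memverge2 ipmi-fence-memverge2
pcs stonith level add 2 memverge2 watchdog

i.e. try IPMI first and fall back to watchdog self-fencing only if IPMI fails. Is that the intended way, or is it better to leave diskless sbd implicit (no topology), as it is now?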

Currently it works without a fencing topology, and both mechanisms run in "parallel". It really does not matter which one wins; I just want to make sure the fenced node is powered off or rebooted.
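
The IPMI fencing devices themselves are plain fence_ipmilan resources, roughly like this (addresses and credentials replaced with placeholders):

pcs stonith create ipmi-fence-memverge fence_ipmilan \
    ip=<ilo-of-memverge> username=<user> password=<pass> lanplus=1 \
    pcmk_host_list=memverge pcmk_delay_base=5s
pcs stonith create ipmi-fence-memverge2 fence_ipmilan \
    ip=<ilo-of-memverge2> username=<user> password=<pass> lanplus=1 \
    pcmk_host_list=memverge2

plus location constraints so that each device prefers to run on the opposite node. The 5s pcmk_delay_base on one device is there to avoid a fence race between the two nodes; that is the "Delaying 'reboot' action ... for 5s" visible in the log.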

Here is a log of how it works now in "parallel":

[root@memverge2 ~]# cat /var/log/messages|grep -i fence
Feb  2 12:46:07 memverge2 pacemaker-fenced[3902]: notice: Node memverge state is now lost
Feb  2 12:46:07 memverge2 pacemaker-fenced[3902]: notice: Removed 1 inactive node with cluster layer ID 27 from the membership cache
Feb  2 12:46:10 memverge2 pacemaker-schedulerd[3905]: warning: Cluster node memverge will be fenced: peer is no longer part of the cluster
Feb  2 12:46:10 memverge2 pacemaker-schedulerd[3905]: warning: ipmi-fence-memverge2_stop_0 on memverge is unrunnable (node is offline)
Feb  2 12:46:10 memverge2 pacemaker-schedulerd[3905]: warning: ipmi-fence-memverge2_stop_0 on memverge is unrunnable (node is offline)
Feb  2 12:46:10 memverge2 pacemaker-schedulerd[3905]: notice: Actions: Fence (reboot) memverge 'peer is no longer part of the cluster'
Feb  2 12:46:10 memverge2 pacemaker-schedulerd[3905]: notice: Actions: Stop       ipmi-fence-memverge2        (                         memverge )  due to node availability
Feb  2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Client pacemaker-controld.3906 wants to fence (reboot) memverge using any device
Feb  2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Requesting peer fencing (reboot) targeting memverge
Feb  2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Requesting that memverge2 perform 'reboot' action targeting memverge
Feb  2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Waiting 25s for memverge to self-fence (reboot) for client pacemaker-controld.3906
Feb  2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Delaying 'reboot' action targeting memverge using ipmi-fence-memverge for 5s
Feb  2 12:46:36 memverge2 pacemaker-fenced[3902]: notice: Self-fencing (reboot) by memverge for pacemaker-controld.3906 assumed complete
Feb  2 12:46:36 memverge2 pacemaker-fenced[3902]: notice: Operation 'reboot' targeting memverge by memverge2 for pacemaker-controld.3906@memverge2: OK (Done)
Feb  2 12:46:36 memverge2 kernel: drbd ha-nfs memverge: helper command: /sbin/drbdadm fence-peer
Feb  2 12:46:36 memverge2 kernel: drbd ha-iscsi memverge: helper command: /sbin/drbdadm fence-peer
Feb  2 12:46:36 memverge2 crm-fence-peer.9.sh[7332]: DRBD_BACKING_DEV_1=/dev/mapper/object_block_nfs_vg-ha_nfs_exports_lv_with_vdo_1x8 DRBD_BACKING_DEV_2=/dev/mapper/object_block_nfs_vg-ha_nfs_internal_lv_without_vdo DRBD_BACKING_DEV_5=/dev/mapper/object_block_nfs_vg-ha_samba_exports_lv_with_vdo_1x8 DRBD_CONF=/etc/drbd.conf DRBD_CSTATE=Connecting DRBD_LL_DISK=/dev/mapper/object_block_nfs_vg-ha_nfs_exports_lv_with_vdo_1x8\ /dev/mapper/object_block_nfs_vg-ha_nfs_internal_lv_without_vdo\ /dev/mapper/object_block_nfs_vg-ha_samba_exports_lv_with_vdo_1x8 DRBD_MINOR=1\ 2\ 5 DRBD_MINOR_1=1 DRBD_MINOR_2=2 DRBD_MINOR_5=5 DRBD_MY_ADDRESS=192.168.0.8 DRBD_MY_AF=ipv4 DRBD_MY_NODE_ID=28 DRBD_NODE_ID_27=memverge DRBD_NODE_ID_28=memverge2 DRBD_PEER_ADDRESS=192.168.0.6 DRBD_PEER_AF=ipv4 DRBD_PEER_NODE_ID=27 DRBD_RESOURCE=ha-nfs DRBD_VOLUME=1\ 2\ 5 UP_TO_DATE_NODES=0x10000000 /usr/lib/drbd/crm-fence-peer.9.sh
Feb  2 12:46:36 memverge2 crm-fence-peer.9.sh[7333]: DRBD_BACKING_DEV_3=/dev/mapper/object_block_nfs_vg-ha_block_exports_lv_with_vdo_1x8 DRBD_BACKING_DEV_4=/dev/mapper/object_block_nfs_vg-ha_block_exports_lv_without_vdo DRBD_CONF=/etc/drbd.conf DRBD_CSTATE=Connecting DRBD_LL_DISK=/dev/mapper/object_block_nfs_vg-ha_block_exports_lv_with_vdo_1x8\ /dev/mapper/object_block_nfs_vg-ha_block_exports_lv_without_vdo DRBD_MINOR=3\ 4 DRBD_MINOR_3=3 DRBD_MINOR_4=4 DRBD_MY_ADDRESS=192.168.0.8 DRBD_MY_AF=ipv4 DRBD_MY_NODE_ID=28 DRBD_NODE_ID_27=memverge DRBD_NODE_ID_28=memverge2 DRBD_PEER_ADDRESS=192.168.0.6 DRBD_PEER_AF=ipv4 DRBD_PEER_NODE_ID=27 DRBD_RESOURCE=ha-iscsi DRBD_VOLUME=3\ 4 UP_TO_DATE_NODES=0x10000000 /usr/lib/drbd/crm-fence-peer.9.sh
Feb  2 12:46:36 memverge2 crm-fence-peer.9.sh[7333]: INFO Concurrency check: Peer is already marked clean/fenced by another resource. Returning success (Exit 4).
Feb  2 12:46:36 memverge2 crm-fence-peer.9.sh[7332]: INFO Concurrency check: Peer is already marked clean/fenced by another resource. Returning success (Exit 4).
Feb  2 12:46:36 memverge2 kernel: drbd ha-iscsi memverge: helper command: /sbin/drbdadm fence-peer exit code 4 (0x400)
Feb  2 12:46:36 memverge2 kernel: drbd ha-iscsi memverge: fence-peer helper returned 4 (peer was fenced)
Feb  2 12:46:36 memverge2 kernel: drbd ha-nfs memverge: helper command: /sbin/drbdadm fence-peer exit code 4 (0x400)
Feb  2 12:46:36 memverge2 kernel: drbd ha-nfs memverge: fence-peer helper returned 4 (peer was fenced)
Feb  2 12:46:37 memverge2 pacemaker-fenced[3902]: notice: Operation 'reboot' [7068] targeting memverge using ipmi-fence-memverge returned 0
Feb  2 12:46:37 memverge2 pacemaker-fenced[3902]: notice: Operation 'reboot' targeting memverge by memverge2 for pacemaker-controld.3906@memverge2: Result arrived too late
[root@memverge2 ~]#

Anton
