<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Aptos;}
@font-face
{font-family:"Segoe UI Emoji";
panose-1:2 11 5 2 4 2 4 2 2 3;}
@font-face
{font-family:"\@MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:12.0pt;
font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:12.0pt;
font-family:"Aptos",sans-serif;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1381127623;
mso-list-type:hybrid;
mso-list-template-ids:460087018 -469041502 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;
mso-fareast-font-family:Aptos;
mso-bidi-font-family:"Times New Roman";}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<ul style="margin-top:0cm" type="disc">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo1">But sorry again, I forgot to mention that the fence resource has to be called 'watchdog'; otherwise Pacemaker won't align it with the internal hidden<br>
device that already exists (if you have stonith-watchdog-timeout != 0).<o:p></o:p></li></ul>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">[root@memverge ~]# pcs stonith create watchdog-fencing watchdog<o:p></o:p></p>
<p class="MsoNormal">Error: Agent 'stonith:watchdog' is not installed or does not provide valid metadata: crm_resource: Metadata query for stonith:watchdog failed: No such device or address, Error performing operation: No such object, use --force to override<o:p></o:p></p>
<p class="MsoNormal">Error: Errors have occurred, therefore pcs is unable to continue<o:p></o:p></p>
<p class="MsoNormal">[root@memverge ~]#<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<ul style="margin-top:0cm" type="disc">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo1">Can you provide your CIB and corosync config so that we don't have to write back and forth that often?<o:p></o:p></li></ul>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I have attached both files.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Anton<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:'Calibri',sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:'Calibri',sans-serif"> Klaus Wenninger &lt;kwenning@redhat.com&gt;
<br>
<b>Sent:</b> Thursday, February 5, 2026 3:42 PM<br>
<b>To:</b> Anton Gavriliuk &lt;Anton.Gavriliuk@hpe.ua&gt;<br>
<b>Cc:</b> Andrei Borzenkov &lt;arvidjaar@gmail.com&gt;; Cluster Labs - All topics related to open-source clustering welcomed &lt;users@clusterlabs.org&gt;<br>
<b>Subject:</b> Re: [ClusterLabs] Question about two level STONITH/fencing<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">On Thu, Feb 5, 2026 at 2:21<span style="font-family:'Arial',sans-serif"> </span>PM Anton Gavriliuk &lt;<a href="mailto:Anton.Gavriliuk@hpe.ua">Anton.Gavriliuk@hpe.ua</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I tried,<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">[root@memverge ~]# pcs stonith create watchdog-fencing fence_watchdog<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">But after that, the running cluster hangs. I can't run "crm_mon -Rr"; it fails with “error: Lost connection to controller”.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Perhaps this is because /dev/watchdog is already managed by Pacemaker?<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">[root@memverge ~]# systemctl status sbd<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">● sbd.service - Shared-storage based fencing daemon<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; preset: disabled)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Drop-In: /etc/systemd/system/sbd.service.d<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> └─override.conf<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Active: active (running) since Tue 2026-02-03 16:09:00 EET; 1 day 22h ago<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Invocation: 11a9ba526ef5403682980d67a886a7b9<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Docs: man:sbd(8)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Main PID: 2473 (sbd)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Tasks: 3 (limit: 3355442)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> Memory: 18.8M (peak: 19.5M)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> CPU: 2min 22.568s<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> CGroup: /system.slice/sbd.service<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">
<span style="font-family:'MS Gothic'">├</span>─2473 "sbd: inquisitor"<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">
<span style="font-family:'MS Gothic'">├</span>─2487 "sbd: watcher: Pacemaker"<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> └─2488 "sbd: watcher: Cluster"<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:09:00 memverge sbd[2473]: notice: inquisitor_child: Servant cluster is healthy (age: 0)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:09:00 memverge sbd[2473]: notice: watchdog_init: Using watchdog device '/dev/watchdog'<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:09:00 memverge systemd[1]: Started sbd.service - Shared-storage based fencing daemon.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:09:04 memverge sbd[2473]: notice: inquisitor_child: Servant pcmk is healthy (age: 0)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:11:27 memverge systemd[1]: /etc/systemd/system/sbd.service.d/override.conf:1: Assignment outside of section. Ignoring.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:11:28 memverge systemd[1]: /etc/systemd/system/sbd.service.d/override.conf:1: Assignment outside of section. Ignoring.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:25:02 memverge sbd[2473]: warning: inquisitor_child: pcmk health check: UNHEALTHY<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:25:02 memverge sbd[2473]: warning: inquisitor_child: Servant pcmk is outdated (age: 1246)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 03 16:25:03 memverge sbd[2473]: notice: inquisitor_child: Servant pcmk is healthy (age: 0)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 05 15:01:05 memverge systemd[1]: /etc/systemd/system/sbd.service.d/override.conf:1: Assignment outside of section. Ignoring.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">[root@memverge ~]#<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Oh, now it opened:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Cluster Summary:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Stack: corosync (Pacemaker is running)<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Current DC: memverge (27) (version 3.0.1-3.el10-b1a23a6) - MIXED-VERSION partition with quorum<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Last updated: Thu Feb 5 15:14:45 2026<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Last change: Thu Feb 5 15:12:09 2026 by root via root on memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * 2 nodes configured<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * 23 resource instances configured<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Node List:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Node memverge (27): online, feature set 3.20.1<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Node memverge2 (28): online, feature set <3.15.1<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Full List of Resources:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Resource Group: g-nfs:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * pb_nfs (ocf:heartbeat:portblock): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ip0_nfs (ocf:heartbeat:IPaddr2): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * fs_nfs_internal_info_HA (ocf:heartbeat:Filesystem): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * fs_nfsshare_exports_HA (ocf:heartbeat:Filesystem): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * nfsserver (ocf:heartbeat:nfsserver): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * expfs_nfsshare_exports_HA (ocf:heartbeat:exportfs): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * samba_service (systemd:smb): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * fs_sambashare_exports_HA (ocf:heartbeat:Filesystem): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * punb_nfs (ocf:heartbeat:portblock): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Resource Group: g-iscsi:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * pb_iscsi (ocf:heartbeat:portblock): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ip0_iscsi (ocf:heartbeat:IPaddr2): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ip1_iscsi (ocf:heartbeat:IPaddr2): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * iscsi_target (ocf:heartbeat:iSCSITarget): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * iscsi_lun_drbd3 (ocf:heartbeat:iSCSILogicalUnit): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * iscsi_lun_drbd4 (ocf:heartbeat:iSCSILogicalUnit): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * punb_iscsi (ocf:heartbeat:portblock): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Clone Set: ha-nfs-clone [ha-nfs] (promotable):<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ha-nfs (ocf:linbit:drbd): Unpromoted memverge2<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ha-nfs (ocf:linbit:drbd): Promoted memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * Clone Set: ha-iscsi-clone [ha-iscsi] (promotable):<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ha-iscsi (ocf:linbit:drbd): Unpromoted memverge2<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ha-iscsi (ocf:linbit:drbd): Promoted memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ipmi-fence-memverge (stonith:fence_ipmilan): Started memverge2<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ipmi-fence-memverge2 (stonith:fence_ipmilan): Started memverge<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * watchdog-fencing (stonith:fence_watchdog): Starting memverge2<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Failed Resource Actions:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> * ipmi-fence-memverge_monitor_30000 on memverge2 'Error occurred' (1): call=93, status='Error', exitreason='Lost connection to fencer' * ipmi-fence-memveF<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">And there are many records like these in /var/log/messages:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Feb 5 15:13:10 memverge pacemaker-controld[755570]: notice: Fencer connection failed (will retry): Transport endpoint is not connected<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">[root@memverge ~]#<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I’m new to Pacemaker/Corosync, so this is quite complicated for me
<span style="font-family:'Segoe UI Emoji',sans-serif">😊</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Or maybe I should add fence_ipmilan as level 1 and not add sbd as level 2, assuming the cluster will automatically detect it because have-watchdog=true and fall back to sbd even without
it being explicitly configured as level 2?<o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Not sure what we're seeing. The 'Fencer connection failed ...' messages would point to pacemaker-fenced having had a segfault or something similar. You might see traces of that elsewhere. It would also explain strange behavior of Pacemaker in general if it is constantly trying to restart pacemaker-fenced.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">But sorry again, I forgot to mention that the fence resource has to be called 'watchdog'; otherwise Pacemaker won't align it with the internal hidden device that already exists (if you have stonith-watchdog-timeout != 0).<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Using a different name is probably untested (I don't remember whether I tested that during development of the feature; it is definitely not a CI test case) and might lead to pacemaker-fenced having an issue. So this should probably be fixed, but if you use the correct naming it should work.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Can you provide your CIB and corosync config so that we don't have to write back and forth that often?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Regards,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Klaus <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Anton<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span style="font-size:11.0pt;font-family:'Calibri',sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:'Calibri',sans-serif"> Klaus Wenninger &lt;<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>&gt;
<br>
<b>Sent:</b> Thursday, February 5, 2026 2:52 PM<br>
<b>To:</b> Anton Gavriliuk &lt;<a href="mailto:Anton.Gavriliuk@hpe.ua" target="_blank">Anton.Gavriliuk@hpe.ua</a>&gt;<br>
<b>Cc:</b> Andrei Borzenkov &lt;<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>&gt;; Cluster Labs - All topics related to open-source clustering welcomed &lt;<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>&gt;<br>
<b>Subject:</b> Re: [ClusterLabs] Question about two level STONITH/fencing</span><o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">On Thu, Feb 5, 2026 at 12:56<span style="font-family:'Arial',sans-serif"> </span>PM Anton Gavriliuk &lt;<a href="mailto:Anton.Gavriliuk@hpe.ua" target="_blank">Anton.Gavriliuk@hpe.ua</a>&gt;
wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><br>
Correct, in addition to the two cluster nodes, there is a dedicated third physical server acting as the qdevice.<br>
<br>
I'm thinking about a two-level fencing topology: level 1 is fence_ipmilan, level 2 is diskless sbd (hpwdt, /dev/watchdog).<br>
<br>
But I can't add sbd as the second fencing level:<br>
<br>
[root@memverge2 ~]# pcs stonith level add 2 memverge watchdog<br>
Error: Stonith resource(s) 'watchdog' do not exist, use --force to override<br>
Error: Errors have occurred, therefore pcs is unable to continue<br>
[root@memverge2 ~]#<br>
<br>
So, back to the original question: what is the most correct way of implementing STONITH/fencing with fence_ipmilan + diskless sbd (hpwdt, /dev/watchdog)?<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Sorry, then, that I overlooked the qdevice (actually I thought I had checked for it, but ...).<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">To add the watchdog to a topology you have to make it visible first: just add it like any other fencing device, with fence_watchdog as the agent.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">There is a fence_watchdog script, but that is just for the metadata. Pacemaker will recognize that and handle the actual fencing internally.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Regards,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Klaus<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:12.0pt"><br>
Anton<br>
<br>
<br>
-----Original Message-----<br>
From: Andrei Borzenkov &lt;<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>&gt;
<br>
Sent: Thursday, February 5, 2026 1:17 PM<br>
To: Cluster Labs - All topics related to open-source clustering welcomed &lt;<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>&gt;<br>
Cc: Anton Gavriliuk &lt;<a href="mailto:Anton.Gavriliuk@hpe.ua" target="_blank">Anton.Gavriliuk@hpe.ua</a>&gt;<br>
Subject: Re: [ClusterLabs] Question about two level STONITH/fencing<br>
<br>
On Thu, Feb 5, 2026 at 2:07<span style="font-family:'Arial',sans-serif"> </span>PM Klaus Wenninger &lt;<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>&gt; wrote:<br>
><br>
><br>
><br>
> On Wed, Feb 4, 2026 at 4:36<span style="font-family:'Arial',sans-serif"> </span>PM Anton Gavriliuk via Users &lt;<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>&gt; wrote:<br>
>><br>
>><br>
>><br>
>> Hello<br>
>><br>
>><br>
>><br>
>> There is a two-node (HPE DL345 Gen12 servers), shared-nothing, distributed active/standby Pacemaker storage metro-cluster based on synchronous DRBD replication (Protocol C). The cluster is configured with qdevice, heuristics
(parallel fping), and fencing: fence_ipmilan and diskless sbd (hpwdt, /dev/watchdog). All cluster resources are configured to always run together on the same node.<br>
>><br>
>><br>
>><br>
>> The two storage cluster nodes and the qdevice are running Rocky Linux 10.1<br>
>><br>
>> Pacemaker version 3.0.1<br>
>><br>
>> Corosync version 3.1.9<br>
>><br>
>> DRBD version 9.3.0<br>
>><br>
>><br>
>><br>
>> So, the question is: what is the most correct way of implementing STONITH/fencing with fence_ipmilan + diskless sbd (hpwdt, /dev/watchdog)?<br>
><br>
><br>
> The correct way of using diskless sbd with a two-node cluster is not <br>
> to use it ;-)<br>
><br>
> Diskless sbd (watchdog fencing) requires 'real' quorum, and the quorum <br>
> provided by corosync in two-node mode would introduce split-brain. <br>
> That is why sbd recognizes two-node operation and replaces the quorum <br>
> from corosync with the information that the peer node is currently in the cluster. This is fine for poison-pill fencing: a single shared disk then doesn't become a single point of failure as long as the peer is there. But
for watchdog fencing that doesn't help, because the peer going away would mean you have to commit suicide.<br>
><br>
> An alternative with a two-node cluster is to step away from the actual two-node design and go with qdevice for 'real' quorum.<br>
<br>
Hmm ... the original description does mention qdevice, although it is not quite clear where it is located (is there a third node?)<br>
<br>
> You'll need some kind of 3rd node but it doesn't have to be a full cluster node.<br>
><o:p></o:p></p>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</body>
</html>