[ClusterLabs] Antw: [EXT] normal reboot with active sbd does not work

Zoran Bošnjak zoran.bosnjak at via.si
Fri Jun 3 09:16:19 EDT 2022


Yes, it's a Dell PowerEdge. Do you know how to disable the front panel indication after a watchdog reset?

"echo V >/dev/watchdog" makes no difference.

----- Original Message -----
From: "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de>
To: "users" <users at clusterlabs.org>
Sent: Friday, June 3, 2022 11:00:18 AM
Subject: [ClusterLabs] Antw: [EXT] normal reboot with active sbd does not work

>>> Zoran Bošnjak <zoran.bosnjak at via.si> wrote on 03.06.2022 at 10:18 in
message <2046503996.272.1654244336372.JavaMail.zimbra at via.si>:
> Hi all,
> I would appreciate some advice about sbd fencing (without shared storage).

Not an answer, just curiosity:
As sbd needs very little space (about 1MB), has anybody ever tried using a
small computer like a Raspberry Pi to provide shared storage for SBD, via iSCSI
for example?
The disk could be a partition of the flash card (it's written quite rarely).

...
> After some long timeout, it looks like the watchdog timer expires and the
> server boots, but the failure indication remains on the front panel of the
> server.


Dell PowerEdge? ;-)

On SLES I have these settings (among others):
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_WATCHDOG_TIMEOUT=30
SBD_TIMEOUT_ACTION=flush,reboot
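
To confirm which values are actually in effect on a node, the settings above can be read back from the sbd configuration file. This is only a minimal sketch: the helper name sbd_setting is made up for illustration, and the default path /etc/sysconfig/sbd is an assumption based on the SLES layout (other distributions may keep the file elsewhere).

```shell
#!/bin/sh
# Print the value of one KEY=value setting from an sbd sysconfig file.
# sbd_setting is a hypothetical helper; the default path is an assumption.
sbd_setting() {
    key=$1
    file=${2:-/etc/sysconfig/sbd}
    # Keep the last uncommented assignment and strip optional double quotes.
    sed -n "s/^[[:space:]]*${key}=//p" "$file" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Show the three settings discussed here, if the file is readable.
if [ -r /etc/sysconfig/sbd ]; then
    for k in SBD_WATCHDOG_DEV SBD_WATCHDOG_TIMEOUT SBD_TIMEOUT_ACTION; do
        printf '%s=%s\n' "$k" "$(sbd_setting "$k")"
    done
fi
```

Passing a second argument, e.g. `sbd_setting SBD_WATCHDOG_TIMEOUT /path/to/copy`, reads from a different file, which is handy for comparing nodes.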

I did:
h16:~ # echo iTCO_wdt > /etc/modules-load.d/watchdog.conf
h16:~ # systemctl restart systemd-modules-load
h16:~ # lsmod | egrep "(wd|dog)"
iTCO_wdt               16384  0
iTCO_vendor_support    16384  1 iTCO_wdt

Later I changed it to:
h16:~ # echo ipmi_watchdog > /etc/modules-load.d/watchdog.conf
h16:~ # systemctl restart systemd-modules-load

After reboot there was a conflict:
Dec 04 12:07:22 h16 kernel: watchdog: wdat_wdt: cannot register miscdev on minor=130 (err=-16).
Dec 04 12:07:22 h16 kernel: watchdog: wdat_wdt: a legacy watchdog module is probably present.
h16:~ # lsmod | grep wd
wdat_wdt               20480  0
h16:~ # modprobe -r wdat_wdt
h16:~ # modprobe ipmi_watchdog
h16:~ # lsmod | grep wat
ipmi_watchdog          32768  1
ipmi_msghandler       114688  4 ipmi_devintf,ipmi_si,ipmi_watchdog,ipmi_ssif

h16:/etc/modprobe.d # cat 99-local.conf
#
# please add local extensions to this file
#
h16:/etc/modprobe.d # echo 'blacklist wdat_wdt' >> 99-local.conf
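
After the next reboot it is worth checking that the blacklist entry took effect and that only the intended watchdog driver is loaded. The following is a rough sketch of such a check; the helper names is_blacklisted and is_loaded are invented for illustration, and the module list in the loop is simply the three drivers mentioned in this session.

```shell
#!/bin/sh
# Return 0 if MODULE has a "blacklist" line in any *.conf file under DIR.
# DIR defaults to /etc/modprobe.d, the directory used in the session above.
is_blacklisted() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$dir"/*.conf
}

# Return 0 if MODULE is currently loaded, based on /proc/modules
# (the same data that lsmod prints).
is_loaded() {
    module=$1
    list=${2:-/proc/modules}
    grep -qs "^${module} " "$list"
}

# Summarize the state of the watchdog drivers mentioned in this thread.
for m in wdat_wdt iTCO_wdt ipmi_watchdog; do
    if is_loaded "$m"; then state="loaded"; else state="not loaded"; fi
    if is_blacklisted "$m"; then bl="blacklisted"; else bl="not blacklisted"; fi
    printf '%-14s %s, %s\n' "$m" "$state" "$bl"
done
```

Note that a blacklist line only stops automatic (alias-based) loading; an explicit modprobe of the module still works.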

Maybe also check whether "echo V >/dev/watchdog" will stop the watchdog
properly. SUSE (and meanwhile upstream, I guess) had to fix it.
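
One thing that can make the magic close ineffective is the driver's "nowayout" setting: when it is set, the watchdog cannot be stopped once it has been started. The sketch below only reads that flag and does not touch /dev/watchdog. It assumes the kernel exposes watchdog attributes under /sys/class/watchdog (not every kernel build does), and the function name watchdog_nowayout is made up for illustration.

```shell
#!/bin/sh
# Report the nowayout flag for each watchdog device found under BASE.
# BASE defaults to /sys/class/watchdog; the nowayout attribute is only
# present when the kernel exposes watchdog sysfs attributes (assumption).
watchdog_nowayout() {
    base=${1:-/sys/class/watchdog}
    for dev in "$base"/watchdog*; do
        # Skip the unexpanded glob when no device exists.
        [ -d "$dev" ] || continue
        name=${dev##*/}
        if [ -r "$dev/nowayout" ]; then
            printf '%s nowayout=%s\n' "$name" "$(cat "$dev/nowayout")"
        else
            printf '%s nowayout=unknown\n' "$name"
        fi
    done
}

watchdog_nowayout
```

A value of 1 would suggest that writing "V" cannot disarm the timer on that device, whatever the sbd configuration says.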

Hope this helps a bit.

Regards,
Ulrich

> If I uninstall the 'sbd' package, the "sudo reboot" works normally again.
> 
> My question is: How do I configure the system, to have the 'sbd' function 
> present, but still be able to reboot the system normally.
> 
> regards,
> Zoran
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 


