[ClusterLabs] [External] : Re: Fence Agent tests

Valentin Vidić vvidic at valentin-vidic.from.hr
Sat Nov 5 15:53:09 EDT 2022


On Sat, Nov 05, 2022 at 06:47:59PM +0000, Robert Hayden wrote:
> That was my impression as well...so I may have something wrong.  My expectation was that SBD daemon
> should be writing to the /dev/watchdog within 20 seconds and the kernel watchdog would self fence.

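I don't see anything unusual in the config, except that pacemaker mode is
also enabled. This means that the cluster keeps providing a signal to sbd
even when the storage device is down, so losing the disk alone does not
necessarily lead to a self-fence. For example, here the Pacemaker and
Cluster watchers run next to the disk watcher: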

883 ?        SL     0:00 sbd: inquisitor
892 ?        SL     0:00  \_ sbd: watcher: /dev/vdb1 - slot: 0 - uuid: 18b958fa-fdae-455a-aa9d-a204a6eed04b
893 ?        SL     0:00  \_ sbd: watcher: Pacemaker
894 ?        SL     0:00  \_ sbd: watcher: Cluster

You can strace the individual sbd processes to see what each one is doing at any point.
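
A minimal sketch of how one watcher could be picked out for strace. The
helper name sbd_pid is made up for illustration; it only filters a ps
listing read from stdin, and the PIDs on a real node will differ:

```shell
# Hypothetical helper: print the PID (first column) of the first line in a
# ps-style listing on stdin whose text contains the given string.
sbd_pid() {
    awk -v want="$1" '!found && index($0, want) { print $1; found = 1 }'
}

# Possible use on a live node, as root (assumes procps ps and strace):
#   pid=$(ps -C sbd -o pid=,args= | sbd_pid "watcher: Pacemaker")
#   strace -tt -p "$pid"
```

Selecting by command name with ps -C keeps the awk process itself out of
the listing, which a plain "ps ax | awk" pipeline would not guarantee.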

An easy way to test whether the watchdog is working is to pause all sbd processes, for example:

# pkill -STOP sbd

For me this causes a node reset after 5 seconds, as defined by SBD_WATCHDOG_TIMEOUT=5.
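
To know how long to wait for the reset, the configured timeout can be
read first. This is only a sketch: the function name is hypothetical, and
the config path is an assumption that depends on the distribution
(commonly /etc/sysconfig/sbd or /etc/default/sbd):

```shell
# Hypothetical helper: print the last uncommented SBD_WATCHDOG_TIMEOUT
# value found in the given config file, with any quotes stripped.
sbd_watchdog_timeout() {
    sed -n 's/^[[:space:]]*SBD_WATCHDOG_TIMEOUT=//p' "$1" | tail -n 1 | tr -d "\"'"
}

# Possible use (path is an assumption, adjust for your distribution):
#   sbd_watchdog_timeout /etc/sysconfig/sbd
```

If nothing is printed, the variable is not set in that file and sbd is
presumably using its built-in default.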

-- 
Valentin

