[ClusterLabs] SBD as watchdog daemon

Олег Самойлов splarv at ya.ru
Thu Apr 11 10:39:52 EDT 2019


Hi all.
I am developing an HA PostgreSQL cluster spanning 2 or 3 datacenters. If a whole datacenter fails (blackout), fencing cannot succeed, and that would prevent failover to the surviving DC, so I have disabled fencing. The cluster relies on quorum instead, and for the two-DC case I added a quorum device in a third DC.

I still need to handle the case where corosync or pacemaker freezes. For that I use a hardware watchdog or softdog, with SBD as the watchdog daemon (without shared devices). With this setup, if I kill corosync or pacemakerd, all is fine: the node is restarted. If I freeze sbd with `killall -s STOP sbd`, that is also fine: the node reboots. But if I freeze corosync or pacemakerd with `killall -s STOP`, or run `ifdown eth0` (corosync is frozen in this case), nothing happens.

My question: is this fixed in the master branch or in 1.4.0? (I use the CentOS rpms: sbd v1.3.1.) If not, where should I look (which file or function) to fix it?
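To make the difference between a killed and a frozen daemon concrete, here is a minimal sketch using a throwaway `sleep` process as a stand-in for corosync or pacemakerd (the variable names are purely illustrative). It only shows that a process stopped with SIGSTOP still exists and passes a plain existence check, whereas a killed one does not. Whether sbd 1.3.1 misses the frozen daemons for this reason is an assumption on my part, not something confirmed here.

```shell
#!/bin/sh
# Illustration only: `sleep` stands in for a cluster daemon.
sleep 60 &
pid=$!

# Freeze the process with the same signal that `killall -s STOP` sends.
kill -s STOP "$pid"
sleep 1   # give the kernel a moment to update the process state

# First character of the ps state column (on Linux, T means stopped).
state=$(ps -o stat= -p "$pid" | tr -d ' ' | cut -c1)

# `kill -0` checks only that the process exists, not that it makes progress.
if kill -0 "$pid" 2>/dev/null; then alive=yes; else alive=no; fi
echo "frozen: state=$state alive=$alive"

# Now actually kill it; the existence check fails once it has been reaped.
kill -s KILL "$pid"
wait "$pid" 2>/dev/null
if kill -0 "$pid" 2>/dev/null; then alive_after_kill=yes; else alive_after_kill=no; fi
echo "killed: alive=$alive_after_kill"
```

A monitor that relies only on the process existing, or on a connection that stays open, would see the frozen case as healthy. Catching it requires an active liveness check, i.e. a request that the daemon has to answer within a timeout.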

