[ClusterLabs] Pacemaker kill does not cause node fault ???

Tue Jan 31 01:28:46 CET 2017

On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
> Hi,
> 
> I am currently testing a 2 node cluster under Ubuntu 16.04. The setup seems to be working ok including the STONITH.
> For test purposes I issued a "pkill -f pace" killing all pacemaker processes on one node.
> 
> Result:
> The node is marked as "pending", all resources stay on it. If I manually kill a resource it is not noticed. On the other node a drbd "promote" command fails (drbd is still running as master on the first node).
> 
> Killing the corosync process works as expected -> STONITH.
> 
> Could someone shed some light on this behavior? 
> 
> Thanks,
> 
> Stefan

I suspect that, when you kill pacemakerd, systemd respawns it quickly
enough that fencing is unnecessary. Try "pkill -f pace; systemctl stop
pacemaker".
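
One way to check the respawn theory, as a rough sketch: the commands
below assume the systemd unit is named "pacemaker", as it is in the
stock packages, and the 10-minute window is only an example value.

```
# Show the restart policy systemd applies to the pacemaker unit
systemctl show pacemaker --property=Restart

# After the kill test, see whether the service is active again
# and what was logged around that time
systemctl status pacemaker
journalctl -u pacemaker --since "10 minutes ago"
```

If the unit turns out to be active again shortly after the pkill, that
would be consistent with systemd having restarted it.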

Did you schedule monitor operations on your resources? If not, pacemaker
will not know if they go down.
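
For illustration, a DRBD resource with monitor operations could look
roughly like this in crm shell syntax. This is a sketch under
assumptions, not your actual configuration: the names p_drbd_r0,
ms_drbd_r0 and the DRBD resource r0 are made up, the intervals are
example values, and it assumes the cluster is managed with crmsh and
uses the ocf:linbit:drbd agent.

```
# Hypothetical names; adjust to the existing configuration.
primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource=r0 \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave
ms ms_drbd_r0 p_drbd_r0 \
    meta master-max=1 master-node-max=1 \
         clone-max=2 clone-node-max=1 notify=true
```

The two monitor operations use different intervals because recurring
operations on the same resource need distinct intervals, one per role.
"crm configure show" lists the resources as currently defined, which
makes it easy to see whether any "op monitor" lines are present.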