<div dir="ltr">Hello All, <div><br></div><div>I noticed something on our Pacemaker test cluster. The cluster is configured to manage an underlying database using a master/slave primitive. </div><div><br></div><div>I killed the pacemaker process on one node, and all the other nodes kept showing that node as online. I then killed the underlying database on the same node, a failure that would have been detected had Pacemaker still been running there. The cluster did not detect that the database on the node had failed, and the failover never occurred. </div><div><br></div><div>I then killed corosync on the same node, at which point the cluster marked the node as stopped and proceeded to elect a new master. </div><div><br></div><div>In a separate test, I killed the pacemaker process on the cluster DC, and the cluster showed no change. I then tried to change the CIB from a different node, and the CIB modify command timed out. After that, no failover occurred even when I turned off corosync on the cluster DC. The cluster did not recover after this mishap. </div><div><br></div><div>Is this expected behavior? Is there a solution for when the OOM killer decides to kill the pacemaker process? </div><div><br></div><div>I run Pacemaker 1.1.14 with Corosync 1.4. I have STONITH disabled and quorum enabled. </div><div><br></div><div>Thank you,</div><div><br></div><div>nwarriorch</div></div>