[ClusterLabs] Pacemaker stopped monitoring the resource

Abhay B abhayyb at gmail.com
Thu Aug 31 02:41:26 EDT 2017


Hi,

I have a 2-node HA cluster configured on CentOS 7 using the pcs command.

Below are the cluster properties:

# pcs property
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: SVSDEHA
 cluster-recheck-interval: 2s
 dc-deadtime: 5
 dc-version: 1.1.15-11.el7_3.5-e174ec8
 have-watchdog: false
 last-lrm-refresh: 1504090367
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-enabled: false

Please find attached the CIB.
Also attached is the corosync.log from around the time the issue described
below happened.

After around 10 hours and multiple failures, Pacemaker stops monitoring the
resource on one of the nodes in the cluster.

So even when the resource on the other node fails, it is never migrated to
the node on which the resource is no longer monitored.

I would like to know what could have triggered this and how to avoid getting
into such a state.
I have been going through the logs but could not find why this happened.

Monitoring stopped after this log entry:
Aug 29 11:01:44 [16500] TPC-D12-10-002.phaedrus.sandvine.com       crmd:     info:
process_lrm_event:   Result of monitor operation for SVSDEHA on
TPC-D12-10-002.phaedrus.sandvine.com: 0 (ok) | call=538
key=SVSDEHA_monitor_2000 confirmed=false cib-update=50013

The log entry below says the resource is leaving the cluster:

Aug 29 11:01:44 [16499] TPC-D12-10-002.phaedrus.sandvine.com    pengine:     info:
LogActions:  Leave   SVSDEHA:0       (Slave
TPC-D12-10-002.phaedrus.sandvine.com)
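
In case it helps anyone scanning the attached log for the last monitor
result per node, here is a minimal Python sketch that pulls the fields out
of a process_lrm_event line like the one quoted above. It assumes the
wrapped entry has been joined back into a single line. The regex and the
helper name parse_lrm_event are my own, written against this one sample
line, and are not an official Pacemaker log grammar.

```python
import re

# Hypothetical pattern, derived only from the single crmd log line quoted
# in this mail; other Pacemaker versions may format the line differently.
LRM_EVENT_RE = re.compile(
    r"process_lrm_event:\s+Result of (?P<op>\w+) operation for (?P<rsc>\S+) "
    r"on (?P<node>\S+): (?P<rc>\d+) \((?P<status>[^)]*)\) \| "
    r"call=(?P<call>\d+) key=(?P<key>\S+)"
)


def parse_lrm_event(line):
    """Return a dict of fields from a process_lrm_event line, or None."""
    match = LRM_EVENT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared and sorted.
    fields["rc"] = int(fields["rc"])
    fields["call"] = int(fields["call"])
    return fields


# The log entry from this mail, joined into one line.
sample = (
    "Aug 29 11:01:44 [16500] TPC-D12-10-002.phaedrus.sandvine.com"
    "       crmd:     info: "
    "process_lrm_event:   Result of monitor operation for SVSDEHA on "
    "TPC-D12-10-002.phaedrus.sandvine.com: 0 (ok) | call=538 "
    "key=SVSDEHA_monitor_2000 confirmed=false cib-update=50013"
)

event = parse_lrm_event(sample)
print(event)
```

Running it over every line of the log and keeping the last match per node
would show when each node last reported a monitor result.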

Let me know if anything more is needed.

Regards,
Abhay

PS: 'pcs resource cleanup' brought the cluster back into a good state.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cib.xml
Type: text/xml
Size: 7659 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20170831/e49deebf/attachment-0002.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosyn_filt.log
Type: application/octet-stream
Size: 327548 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/users/attachments/20170831/e49deebf/attachment-0002.obj>

