Hello,<br>I have a two-node cluster with the following configuration:<br><b style="color:rgb(51,102,255)">*node $id="9e53a111-0dca-496c-9461-a38f3eec4d0e" mcg2 \<br> attributes standby="off"<br>node $id="a90981f8-d993-4411-89f4-aff7156136d2" mcg1 \<br>
attributes standby="off"<br>primitive ClusterIP ocf:mcg:MCG_VIPaddr_RA \<br> params ip="192.168.115.50" cidr_netmask="255.255.255.0" <br>nic="bond1.115:1" \<br> op monitor interval="40" timeout="20" \<br>
meta target-role="Started"<br>primitive EMS ocf:heartbeat:jboss \<br> params jboss_home="/opt/jboss-5.1.0.GA" <br>java_home="/opt/jdk1.6.0_29/" \<br>
op start interval="0" timeout="240" \<br> op stop interval="0" timeout="240" \<br> op monitor interval="30s" timeout="40s"<br>primitive NDB_MGMT ocf:mcg:NDB_MGM_RA \<br>
op monitor interval="120" timeout="120"<br>primitive NDB_VIP ocf:heartbeat:IPaddr2 \<br> params ip="192.168.117.50" cidr_netmask="255.255.255.255" <br>nic="bond0.117:1" \</b><br>
<b style="color:rgb(51,102,255)"> op monitor interval="30" timeout="10"<br>primitive Rmgr ocf:mcg:RM_RA \<br> op monitor interval="60" role="Master" timeout="30" <br>
on-fail="restart" \<br> op monitor interval="40" role="Slave" timeout="40" on-fail="restart"<br>primitive Tmgr ocf:mcg:TM_RA \<br> op monitor interval="60" role="Master" timeout="30" <br>
on-fail="restart" \<br> op monitor interval="40" role="Slave" timeout="40" on-fail="restart"<br>primitive mysql ocf:mcg:MYSQLD_RA \<br> op monitor interval="180" timeout="200"<br>
primitive ndbd ocf:mcg:NDBD_RA \<br> op monitor interval="120" timeout="120"<br>primitive pimd ocf:mcg:PIMD_RA \<br> op monitor interval="60" role="Master" timeout="30" <br>
on-fail="restart" \<br> op monitor interval="40" role="Slave" timeout="40" on-fail="restart"<br>ms ms_Rmgr Rmgr \<br> meta master-max="1" master-max-node="1" clone-max="2" <br>
clone-node-max="1" interleave="true" notify="true"<br>ms ms_Tmgr Tmgr \<br> meta master-max="1" master-max-node="1" clone-max="2" <br>clone-node-max="1" interleave="true" notify="true"<br>
ms ms_pimd pimd \<br> meta master-max="1" master-max-node="1" clone-max="2" <br>clone-node-max="1" interleave="true" notify="true"<br>clone EMS_CLONE EMS \<br>
meta globally-unique="false" clone-max="2" clone-node-max="1" <br>target-role="Started"<br>clone mysqld_clone mysql \<br> meta globally-unique="false" clone-max="2" clone-node-max="1"<br>
clone ndbdclone ndbd \<br> meta globally-unique="false" clone-max="2" clone-node-max="1" <br>target-role="Started"<br>colocation ip_with_Pimd inf: ClusterIP ms_pimd:Master<br>
colocation ip_with_RM inf: ClusterIP ms_Rmgr:Master<br>colocation ip_with_TM inf: ClusterIP ms_Tmgr:Master<br>colocation ndb_vip-with-ndb_mgm inf: NDB_MGMT NDB_VIP<br>order RM-after-mysqld inf: mysqld_clone ms_Rmgr<br>order TM-after-RM inf: ms_Rmgr ms_Tmgr<br>
order ip-after-pimd inf: ms_pimd ClusterIP<br>order mysqld-after-ndbd inf: ndbdclone mysqld_clone<br>order pimd-after-TM inf: ms_Tmgr ms_pimd<br>property $id="cib-bootstrap-options" \<br> dc-version="1.0.11-55a5f5be61c367cbd676c2f0ec4f1c62b38223d7" \<br>
cluster-infrastructure="Heartbeat" \<br> no-quorum-policy="ignore" \<br> stonith-enabled="false"<br>rsc_defaults $id="rsc-options" \<br> migration_threshold="3" \<br>
resource-stickiness="100"*<br><br></b><span style="color:rgb(51,102,255)"><span style="color:rgb(0,0,0)">With both nodes up and running, if the heartbeat service is stopped on either of </span></span><br>the nodes, the following resources are restarted on the other node:<br>
mysqld_clone, ms_Rmgr, ms_Tmgr, ms_pimd, ClusterIP<br><br>From the Heartbeat debug logs, it appears that the policy engine is initiating a restart of these resources, but the reason for it is not clear.<br><br>The following are some excerpts from the logs:<br>
<br>"<b style="color:rgb(51,102,255)">Feb 07 11:06:31 MCG1 pengine: [20534]: info: determine_online_status: Node mcg2 is shutting down<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: determine_online_status: Node mcg1 is online<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_Rmgr<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:0 active on mcg1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:1 active on mcg2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_Tmgr<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:1 active on mcg2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_pimd<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:0 active on mcg1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: native_print: ClusterIP (ocf::mcg:MCG_VIPaddr_RA): Started mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Clone Set: EMS_CLONE</b><br>
<b style="color:rgb(51,102,255)">Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:0 active on mcg1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Started: [ mcg1 mcg2 ]<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: native_print: NDB_VIP (ocf::heartbeat:IPaddr2): Started mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: native_print: NDB_MGMT (ocf::mcg:NDB_MGM_RA): Started mcg1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Clone Set: mysqld_clone<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:0 active on mcg1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:1 active on mcg2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Started: [ mcg1 mcg2 ]<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Clone Set: ndbdclone<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:0 active on mcg1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:0 active on mcg1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:1 active on mcg2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:1 active on mcg2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Started: [ mcg1 mcg2 ]<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Rmgr:1: preferring current location (node=mcg2, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Tmgr:1: preferring current location (node=mcg2, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource pimd:1: preferring current location (node=mcg2, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource EMS:1: preferring current location (node=mcg2, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource mysql:1: preferring current location (node=mcg2, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource ndbd:1: preferring current location (node=mcg2, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Rmgr:0: preferring current location (node=mcg1, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Tmgr:0: preferring current location (node=mcg1, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource pimd:0: preferring current location (node=mcg1, weight=100)</b><b style="color:rgb(51,102,255)"><br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource ClusterIP: preferring current location (node=mcg1, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource EMS:0: preferring current location (node=mcg1, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource NDB_VIP: preferring current location (node=mcg1, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource NDB_MGMT: preferring current location (node=mcg1, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource mysql:0: preferring current location (node=mcg1, weight=100)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource ndbd:0: preferring current location (node=mcg1, weight=100)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to Rmgr:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource Rmgr:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for Rmgr:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource Rmgr:1 cannot run anywhere<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ms_Rmgr instances of a possible 2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Rmgr:0 master score: 10<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: Promoting Rmgr:0 (Master mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Rmgr:1 master score: 0<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: ms_Rmgr: Promoted 1 instances of a possible 1 to master<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to Tmgr:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource Tmgr:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for Tmgr:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource Tmgr:1 cannot run anywhere</b><br><b style="color:rgb(51,102,255)">Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ms_Tmgr instances of a possible 2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Tmgr:0 master score: 10<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: Promoting Tmgr:0 (Master mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Tmgr:1 master score: 0<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: ms_Tmgr: Promoted 1 instances of a possible 1 to master<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to pimd:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource pimd:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for pimd:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource pimd:1 cannot run anywhere</b><br><b style="color:rgb(51,102,255)">Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ms_pimd instances of a possible 2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: pimd:0 master score: 10<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: Promoting pimd:0 (Master mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: pimd:1 master score: 0<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: ms_pimd: Promoted 1 instances of a possible 1 to master<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to ClusterIP<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to EMS:0<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource EMS:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for EMS:1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource EMS:1 cannot run anywhere<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 EMS_CLONE instances of a possible 2<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to NDB_VIP<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to NDB_MGMT<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to mysql:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource mysql:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for mysql:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource mysql:1 cannot run anywhere<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 mysqld_clone instances of a possible 2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to ndbd:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource ndbd:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for ndbd:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource ndbd:1 cannot run anywhere<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ndbdclone instances of a possible 2<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_create_actions: Creating actions for ms_Rmgr<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_create_actions: Creating actions for ms_Tmgr<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_create_actions: Creating actions for ms_pimd<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: info: stage6: Scheduling Node mcg2 for shutdown<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Rmgr:0 with Tmgr:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: find_compatible_child: Can't pair Tmgr:1 with ms_Rmgr<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: No match found for Tmgr:1 (0)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: clone_rsc_order_lh: Inhibiting Tmgr:1 from being active<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for Tmgr:1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Tmgr:0 with Rmgr:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Tmgr:1 with Rmgr:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Tmgr:0 with pimd:0<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: find_compatible_child: Can't pair pimd:1 with ms_Tmgr<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: No match found for pimd:1 (0)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: info: clone_rsc_order_lh: Inhibiting pimd:1 from being active<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for pimd:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing pimd:0 with Tmgr:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing pimd:1 with Tmgr:1<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Rmgr:0 with mysql:0<br>Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Rmgr:1 with mysql:1<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource Rmgr:0 (Master mcg1)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource Rmgr:1 (mcg2)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource Tmgr:0 (Master mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource Tmgr:1 (mcg2)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource pimd:0 (Master mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource pimd:1 (mcg2)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource ClusterIP (Started mcg1)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource EMS:0 (Started mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource EMS:1 (mcg2)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource NDB_VIP (Started mcg1)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource NDB_MGMT (Started mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource mysql:0 (Started mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource mysql:1 (mcg2)<br>
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource ndbd:0 (Started mcg1)<br>Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource ndbd:1 (mcg2)<br>"<br></b>Thanks in advance.<br>
<br>Regards<br>Neha Chatrath <br>
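P.S. To make the planned actions in a pengine log like the excerpt above easier to compare, here is a minimal Python sketch that groups the "LogActions" lines by action (Restart, Stop, Leave). The function name summarize_actions and the regular expression are my own assumptions based only on the line format shown in this mail; they are not part of Heartbeat or Pacemaker, and other versions may log these lines differently.

```python
import re
from collections import defaultdict

# Matches pengine "LogActions" lines in the format seen in the excerpt, e.g.
#   ... pengine: [20534]: notice: LogActions: Restart resource Rmgr:0 (Master mcg1)
# The pattern is an assumption derived from that excerpt only.
LOG_ACTION = re.compile(
    r"LogActions:\s+(?P<action>\w+)\s+resource\s+(?P<rsc>\S+)\s+\((?P<detail>[^)]*)\)"
)


def summarize_actions(log_text):
    """Group (resource, detail) pairs by the action the policy engine planned."""
    planned = defaultdict(list)
    # HTML-formatted mail separates log lines with <br>, plain logs with newlines.
    for line in re.split(r"<br>|\n", log_text):
        match = LOG_ACTION.search(line)
        if match:
            planned[match.group("action")].append(
                (match.group("rsc"), match.group("detail"))
            )
    return dict(planned)


if __name__ == "__main__":
    # A few lines copied from the excerpt above.
    sample = "\n".join([
        "Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource Rmgr:0 (Master mcg1)",
        "Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource Rmgr:1 (mcg2)",
        "Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource EMS:0 (Started mcg1)",
        "Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource mysql:0 (Started mcg1)",
    ])
    for action, resources in summarize_actions(sample).items():
        print(action, [name for name, _ in resources])
```

Running this over the full log lists each resource under the action planned for it, which makes it easier to see which resources are restarted on the surviving node and which are left alone.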