<div dir="ltr"><div>Hi All,</div><div><br></div><div>I have created an active/passive Pacemaker cluster on RHEL 7.</div><div><br></div><div>Here is my environment:</div><div>
<div><div><div>clustera : 192.168.11.1 (passive)<br></div>clusterb : 192.168.11.2 (master)<br></div>clustera-ilo4 : 192.168.11.10<br></div>clusterb-ilo4 : 192.168.11.11 </div><div><br></div><div>Cluster resource status:<br></div><div> cluster_fs started on clusterb</div><div> cluster_vip started on clusterb</div><div> cluster_sid started on clusterb</div><div> cluster_listnr started on clusterb<br></div><div><br></div><div>Both cluster nodes are online.<br></div><div><br></div><div>I found that my corosync.log contains many records like the ones below:</div><div><br></div><div>clustera pengine: info: determine_online_status_fencing: Node clusterb is active<br>clustera pengine: info: determine_online_status: Node clusterb is online<br>clustera pengine: info: determine_online_status_fencing: Node clustera is active<br>clustera pengine: info: determine_online_status: Node clustera is online</div><div><br></div><div><b>clustera pengine: warning: unpack_rsc_op_failure: Processing failed op start for cluster_sid on clustera: unknown error (1)</b></div><div><b>=> Question: Why is pengine always trying to start cluster_sid on the passive node? How do I fix it? 
</b><br></div><div><br></div><div>clustera pengine: info: native_print: ipmi-fence-clustera (stonith:fence_ipmilan): Started clustera<br>clustera pengine: info: native_print: ipmi-fence-clusterb (stonith:fence_ipmilan): Started clustera<br>clustera pengine: info: group_print: Resource Group: cluster<br>clustera pengine: info: native_print: cluster_fs (ocf::heartbeat:Filesystem): Started clusterb<br>clustera pengine: info: native_print: cluster_vip (ocf::heartbeat:IPaddr2): Started clusterb<br>clustera pengine: info: native_print: cluster_sid (ocf::heartbeat:oracle): Started clusterb<br>clustera pengine: info: native_print: cluster_listnr (ocf::heartbeat:oralsnr): Started clusterb<br>clustera pengine: info: get_failcount_full: cluster_sid has failed INFINITY times on clustera</div><div><b><br></b></div><div><b>clustera pengine: warning: common_apply_stickiness: Forcing cluster_sid away from clustera after 1000000 failures (max=1000000)<br></b></div><div><b>=> Question: Did too many failed attempts cause the resource to be forbidden from starting on clustera?</b><br></div><div><br></div><div>A couple of days ago, clusterb was fenced (STONITH) for an unknown reason. Only "cluster_fs" and "cluster_vip" moved to clustera successfully; "cluster_sid" and "cluster_listnr" went to "STOP" status.</div><div>Judging from the messages below, is this related to "op start for cluster_sid on clustera..." 
?</div><div><br></div><div>clustera pengine: warning: unpack_rsc_op_failure: Processing failed op start for cluster_sid on clustera: unknown error (1)<br>clustera pengine: info: native_print: ipmi-fence-clustera (stonith:fence_ipmilan): Started clustera<br>clustera pengine: info: native_print: ipmi-fence-clusterb (stonith:fence_ipmilan): Started clustera<br>clustera pengine: info: group_print: Resource Group: cluster<br>clustera pengine: info: native_print: cluster_fs (ocf::heartbeat:Filesystem): Started clusterb (UNCLEAN)<br>clustera pengine: info: native_print: cluster_vip (ocf::heartbeat:IPaddr2): Started clusterb (UNCLEAN)<br>clustera pengine: info: native_print: cluster_sid (ocf::heartbeat:oracle): Started clusterb (UNCLEAN)<br>clustera pengine: info: native_print: cluster_listnr (ocf::heartbeat:oralsnr): Started clusterb (UNCLEAN)<br>clustera pengine: info: get_failcount_full: cluster_sid has failed INFINITY times on clustera<br>clustera pengine: warning: common_apply_stickiness: Forcing cluster_sid away from clustera after 1000000 failures (max=1000000)<br>clustera pengine: info: rsc_merge_weights: cluster_fs: Rolling back scores from cluster_sid<br>clustera pengine: info: rsc_merge_weights: cluster_vip: Rolling back scores from cluster_sid<br>clustera pengine: info: rsc_merge_weights: cluster_sid: Rolling back scores from cluster_listnr<br>clustera pengine: info: native_color: Resource cluster_sid cannot run anywhere<br>clustera pengine: info: native_color: Resource cluster_listnr cannot run anywhere<br>clustera pengine: warning: custom_action: Action cluster_fs_stop_0 on clusterb is unrunnable (offline)<br>clustera pengine: info: RecurringOp: Start recurring monitor (20s) for cluster_fs on clustera<br>clustera pengine: warning: custom_action: Action cluster_vip_stop_0 on clusterb is unrunnable (offline)<br>clustera pengine: info: RecurringOp: Start recurring monitor (10s) for cluster_vip on clustera<br>clustera pengine: warning: custom_action: Action 
cluster_sid_stop_0 on clusterb is unrunnable (offline)<br>clustera pengine: warning: custom_action: Action cluster_sid_stop_0 on clusterb is unrunnable (offline)<br>clustera pengine: warning: custom_action: Action cluster_listnr_stop_0 on clusterb is unrunnable (offline)<br>clustera pengine: warning: custom_action: Action cluster_listnr_stop_0 on clusterb is unrunnable (offline)<br>clustera pengine: warning: stage6: Scheduling Node clusterb for STONITH<br>clustera pengine: info: native_stop_constraints: cluster_fs_stop_0 is implicit after clusterb is fenced<br>clustera pengine: info: native_stop_constraints: cluster_vip_stop_0 is implicit after clusterb is fenced<br>clustera pengine: info: native_stop_constraints: cluster_sid_stop_0 is implicit after clusterb is fenced<br>clustera pengine: info: native_stop_constraints: cluster_listnr_stop_0 is implicit after clusterb is fenced<br>clustera pengine: info: LogActions: Leave ipmi-fence-db01 (Started clustera)<br>clustera pengine: info: LogActions: Leave ipmi-fence-db02 (Started clustera)<br>clustera pengine: notice: LogActions: Move cluster_fs (Started clusterb -> clustera)<br>clustera pengine: notice: LogActions: Move cluster_vip (Started clusterb -> clustera)<br>clustera pengine: notice: LogActions: Stop cluster_sid (clusterb)<br>clustera pengine: notice: LogActions: Stop cluster_listnr (clusterb)<br>clustera pengine: warning: process_pe_message: Calculated Transition 26821: /var/lib/pacemaker/pengine/pe-warn-7.bz2<br>clustera crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]<br>clustera crmd: info: do_te_invoke: Processing graph 26821 (ref=pe_calc-dc-1526868653-26882) derived from /var/lib/pacemaker/pengine/pe-warn-7.bz2<br>clustera crmd: notice: te_fence_node: Executing reboot fencing operation (23) on clusterb (timeout=60000)<br><br></div><div><br></div><div>Thanks ~~~~<br></div><div><br></div><div>-- <br><div 
class="gmail_signature">Kind regards,<br>Albert Weng</div>
</div></div>