[ClusterLabs] Pengine always trying to start the resource on the standby node.
Albert Weng
weng.albert at gmail.com
Tue Jun 5 21:27:29 EDT 2018
Hi All,
I have created an active/passive Pacemaker cluster on RHEL 7.
Here is my environment:
clustera : 192.168.11.1 (passive)
clusterb : 192.168.11.2 (master)
clustera-ilo4 : 192.168.11.10
clusterb-ilo4 : 192.168.11.11
Cluster resource status:
cluster_fs started on clusterb
cluster_vip started on clusterb
cluster_sid started on clusterb
cluster_listnr started on clusterb
Both cluster nodes are online.
I found that my corosync.log contains many records like the ones below:
clustera pengine: info: determine_online_status_fencing:
Node clusterb is active
clustera pengine: info: determine_online_status: Node
clusterb is online
clustera pengine: info: determine_online_status_fencing:
Node clustera is active
clustera pengine: info: determine_online_status: Node
clustera is online
*clustera pengine: warning: unpack_rsc_op_failure: Processing
failed op start for cluster_sid on clustera: unknown error (1)*
*=> Question: Why does pengine keep trying to start cluster_sid on the
passive node? How can I fix it?*
clustera pengine: info: native_print: ipmi-fence-clustera
(stonith:fence_ipmilan): Started clustera
clustera pengine: info: native_print: ipmi-fence-clusterb
(stonith:fence_ipmilan): Started clustera
clustera pengine: info: group_print: Resource Group: cluster
clustera pengine: info: native_print: cluster_fs
(ocf::heartbeat:Filesystem): Started clusterb
clustera pengine: info: native_print: cluster_vip
(ocf::heartbeat:IPaddr2): Started clusterb
clustera pengine: info: native_print: cluster_sid
(ocf::heartbeat:oracle): Started clusterb
clustera pengine: info: native_print:
cluster_listnr (ocf::heartbeat:oralsnr): Started clusterb
clustera pengine: info: get_failcount_full: cluster_sid has
failed INFINITY times on clustera
*clustera pengine: warning: common_apply_stickiness: Forcing
cluster_sid away from clustera after 1000000 failures (max=1000000)*
*=> Question: Did too many failed attempts cause the resource to be
forbidden from starting on clustera?*
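
For reference, the two log lines above ("failed INFINITY times" and "after 1000000 failures (max=1000000)") read like a simple threshold comparison. The Python sketch below only illustrates that reading; it is not Pacemaker's code. The function name `forced_away` and the parameter name `migration_threshold` are hypothetical names chosen for illustration, and the value 1000000 for INFINITY is taken from the "(max=1000000)" text in the log.

```python
# Illustrative sketch only, not Pacemaker's implementation.
# Assumption: INFINITY is represented as 1000000, as "(max=1000000)" in the log suggests.
INFINITY = 1_000_000


def forced_away(fail_count: int, migration_threshold: int = INFINITY) -> bool:
    """Return True when the recorded fail count has reached the threshold."""
    return fail_count >= migration_threshold


# A fail count already recorded as INFINITY reaches the threshold at once.
print(forced_away(INFINITY))
# A small fail count stays below the default threshold.
print(forced_away(3))
```

Under this reading, the message may not mean that a million start attempts were made; a single failed start recorded as INFINITY would be enough to reach the threshold.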
A couple of days ago, clusterb was fenced (STONITH) for an unknown reason. Only
"cluster_fs" and "cluster_vip" moved to clustera successfully;
"cluster_sid" and "cluster_listnr" went into "STOP" status.
Is this related to the "failed op start for cluster_sid on clustera"
message in the log below?
clustera pengine: warning: unpack_rsc_op_failure: Processing failed op
start for cluster_sid on clustera: unknown error (1)
clustera pengine: info: native_print: ipmi-fence-clustera
(stonith:fence_ipmilan): Started clustera
clustera pengine: info: native_print: ipmi-fence-clusterb
(stonith:fence_ipmilan): Started clustera
clustera pengine: info: group_print: Resource Group: cluster
clustera pengine: info: native_print: cluster_fs
(ocf::heartbeat:Filesystem): Started clusterb (UNCLEAN)
clustera pengine: info: native_print: cluster_vip
(ocf::heartbeat:IPaddr2): Started clusterb (UNCLEAN)
clustera pengine: info: native_print: cluster_sid
(ocf::heartbeat:oracle): Started clusterb (UNCLEAN)
clustera pengine: info: native_print: cluster_listnr
(ocf::heartbeat:oralsnr): Started clusterb (UNCLEAN)
clustera pengine: info: get_failcount_full: cluster_sid has
failed INFINITY times on clustera
clustera pengine: warning: common_apply_stickiness: Forcing
cluster_sid away from clustera after 1000000 failures (max=1000000)
clustera pengine: info: rsc_merge_weights: cluster_fs: Rolling
back scores from cluster_sid
clustera pengine: info: rsc_merge_weights: cluster_vip: Rolling
back scores from cluster_sid
clustera pengine: info: rsc_merge_weights: cluster_sid: Rolling
back scores from cluster_listnr
clustera pengine: info: native_color: Resource cluster_sid cannot
run anywhere
clustera pengine: info: native_color: Resource cluster_listnr
cannot run anywhere
clustera pengine: warning: custom_action: Action cluster_fs_stop_0 on
clusterb is unrunnable (offline)
clustera pengine: info: RecurringOp: Start recurring monitor
(20s) for cluster_fs on clustera
clustera pengine: warning: custom_action: Action cluster_vip_stop_0 on
clusterb is unrunnable (offline)
clustera pengine: info: RecurringOp: Start recurring monitor
(10s) for cluster_vip on clustera
clustera pengine: warning: custom_action: Action cluster_sid_stop_0 on
clusterb is unrunnable (offline)
clustera pengine: warning: custom_action: Action cluster_sid_stop_0 on
clusterb is unrunnable (offline)
clustera pengine: warning: custom_action: Action cluster_listnr_stop_0
on clusterb is unrunnable (offline)
clustera pengine: warning: custom_action: Action cluster_listnr_stop_0
on clusterb is unrunnable (offline)
clustera pengine: warning: stage6: Scheduling Node clusterb for STONITH
clustera pengine: info: native_stop_constraints:
cluster_fs_stop_0 is implicit after clusterb is fenced
clustera pengine: info: native_stop_constraints:
cluster_vip_stop_0 is implicit after clusterb is fenced
clustera pengine: info: native_stop_constraints:
cluster_sid_stop_0 is implicit after clusterb is fenced
clustera pengine: info: native_stop_constraints:
cluster_listnr_stop_0 is implicit after clusterb is fenced
clustera pengine: info: LogActions: Leave ipmi-fence-db01
(Started clustera)
clustera pengine: info: LogActions: Leave ipmi-fence-db02
(Started clustera)
clustera pengine: notice: LogActions: Move cluster_fs
(Started clusterb -> clustera)
clustera pengine: notice: LogActions: Move cluster_vip
(Started clusterb -> clustera)
clustera pengine: notice: LogActions: Stop cluster_sid
(clusterb)
clustera pengine: notice: LogActions: Stop cluster_listnr
(clusterb)
clustera pengine: warning: process_pe_message: Calculated
Transition 26821: /var/lib/pacemaker/pengine/pe-warn-7.bz2
clustera crmd: info: do_state_transition: State transition
S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
clustera crmd: info: do_te_invoke: Processing graph 26821
(ref=pe_calc-dc-1526868653-26882) derived from
/var/lib/pacemaker/pengine/pe-warn-7.bz2
clustera crmd: notice: te_fence_node: Executing reboot fencing
operation (23) on clusterb (timeout=60000)
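
To make the excerpt above easier to scan, here is a minimal Python sketch that pulls out the recorded failed operations and the resources reported as "cannot run anywhere". The helper name `summarize` and the two regular expressions are assumptions written against the message wording shown in this post; other Pacemaker versions may word these messages differently.

```python
import re

# Hypothetical patterns, based only on the log wording quoted in this post.
FAILED_OP = re.compile(
    r"Processing\s+failed\s+op\s+(?P<op>\w+)\s+for\s+(?P<rsc>\S+)\s+on\s+(?P<node>[^\s:]+)"
)
NO_NODE = re.compile(r"Resource\s+(?P<rsc>\S+)\s+cannot\s+run\s+anywhere")


def summarize(log_text: str):
    """Return (failed_ops, blocked_resources) found in the log text."""
    # Collapse whitespace so that mail line-wrapping does not split a message.
    flat = " ".join(log_text.split())
    failed = sorted({(m["rsc"], m["op"], m["node"]) for m in FAILED_OP.finditer(flat)})
    blocked = sorted({m["rsc"] for m in NO_NODE.finditer(flat)})
    return failed, blocked


sample = """
clustera pengine: warning: unpack_rsc_op_failure: Processing failed op
start for cluster_sid on clustera: unknown error (1)
clustera pengine: info: native_color: Resource cluster_sid cannot
run anywhere
clustera pengine: info: native_color: Resource cluster_listnr
cannot run anywhere
"""

failed, blocked = summarize(sample)
print(failed)
print(blocked)
```

On the sample lines, this reports one failed start of cluster_sid on clustera and lists cluster_sid and cluster_listnr as blocked, which matches the two resources that stayed stopped.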
Thanks.
--
Kind regards,
Albert Weng