[ClusterLabs] Antw: Re: Fence agent ends up stopped with no clear reason why
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Thu Aug 2 02:21:39 EDT 2018
Hi!
I think "Processing failed op start for vmware_fence on q-gp2-dbpg57-3:
unknown error (1)" is the reason. You should investigate why it could not be
started.
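A quick way to check is to look at the recorded fail counts and then run the
fence agent by hand on the affected node with the same parameters the resource
uses. A rough sketch (the vCenter address and credentials are placeholders,
and option names may vary slightly between fence-agents versions):

    # fail counts Pacemaker has recorded for the fence resource
    pcs resource failcount show vmware_fence

    # run the agent manually; a working setup should return 0 for "monitor"
    fence_vmware_rest --ip=<vcenter-host> --username=<user> --password=<pass> \
        --ssl --ssl-insecure --action=monitor --verbose
    echo "rc=$?"

The "Unable to connect/login to fencing device" message further down in your
log suggests the agent cannot log in to vCenter, so credentials and
connectivity to the vCenter API would be the first things to check.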
Regards,
Ulrich
>>> Casey Allen Shobe <casey.allen.shobe at icloud.com> wrote on 01.08.2018 at 21:43
in message <1ABDA8CB-59C0-467C-B540-1FF498430D1B at icloud.com>:
> Here is the corosync.log for the first host in the list at the indicated
> time. Not sure what it's doing or why ‑ all cluster nodes were up and
> running the entire time...no fencing events.
>
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: ‑‑‑ 0.700.4 2
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: +++ 0.700.5 (null)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib: @num_updates=5
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib/status/node_state[@id='3']/lrm[@id='3']/lrm_resources/lrm_resource[@id='vmware_fence']/lrm_rsc_op[@id='vmware_fence_last_0']:
> @operation_key=vmware_fence_start_0, @operation=start,
> @transition‑key=42:5084:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060,
> @transition‑magic=4:1;42:5084:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060,
> @call‑id=42, @rc‑code=1, @op‑status=4, @exec‑time=1510
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib/status/node_state[@id='3']/lrm[@id='3']/lrm_resources/lrm_resource[@id='vmware_fence']/lrm_rsc_op[@id='vmware_fence_last_failure_0']:
> @operation_key=vmware_fence_start_0, @operation=start,
> @transition‑key=42:5084:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060,
> @transition‑magic=4:1;42:5084:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060,
> @call‑id=42, @interval=0, @last‑rc‑change=1532987187, @exec‑time=1510,
> @op‑digest=8653f310a5c96a63ab95a
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=q‑gp2‑dbpg57‑3/crmd/32, version=0.700.5)
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: notice:
> abort_transition_graph: Transition aborted by vmware_fence_start_0
> 'modify' on q‑gp2‑dbpg57‑3: Event failed
> (magic=4:1;42:5084:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060, cib=0.700.5,
> source=match_graph_event:381, 0)
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: info:
> abort_transition_graph: Transition aborted by vmware_fence_start_0
> 'modify' on q‑gp2‑dbpg57‑3: Event failed
> (magic=4:1;42:5084:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060, cib=0.700.5,
> source=match_graph_event:381, 0)
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: notice: run_graph:
> Transition 5084 (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=1,
> Source=/var/lib/pacemaker/pengine/pe‑input‑729.bz2): Complete
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: info:
> do_state_transition: State transition S_TRANSITION_ENGINE ‑>
> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info:
> cib_process_request: Forwarding cib_modify operation for section
> status to master (origin=local/attrd/46)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status_fencing: Node q‑gp2‑dbpg57‑1 is active
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status: Node q‑gp2‑dbpg57‑1 is online
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status_fencing: Node q‑gp2‑dbpg57‑3 is active
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status: Node q‑gp2‑dbpg57‑3 is online
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status_fencing: Node q‑gp2‑dbpg57‑2 is active
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status: Node q‑gp2‑dbpg57‑2 is online
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑master‑vip active on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑master‑vip active on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:0 active in master mode on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:0 active in master mode on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:1 active on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:1 active on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> unpack_rsc_op_failure: Processing failed op start for vmware_fence on
> q‑gp2‑dbpg57‑3: unknown error (1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> unpack_rsc_op_failure: Processing failed op start for vmware_fence on
> q‑gp2‑dbpg57‑3: unknown error (1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> unpack_rsc_op_failure: Processing failed op monitor for vmware_fence on
> q‑gp2‑dbpg57‑2: unknown error (1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:2 active on q‑gp2‑dbpg57‑2
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:2 active on q‑gp2‑dbpg57‑2
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: native_print:
> postgresql‑master‑vip (ocf::heartbeat:IPaddr2): Started q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: clone_print:
> Master/Slave Set: postgresql‑ha [postgresql‑10‑main]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: short_print:
> Masters: [ q‑gp2‑dbpg57‑1 ]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: short_print:
> Slaves: [ q‑gp2‑dbpg57‑2 q‑gp2‑dbpg57‑3 ]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: native_print:
> vmware_fence (stonith:fence_vmware_rest): FAILED q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: get_failcount_full:
> vmware_fence has failed 5 times on q‑gp2‑dbpg57‑2
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> common_apply_stickiness: Forcing vmware_fence away from q‑gp2‑dbpg57‑2 after
> 5 failures (max=5)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: get_failcount_full:
> vmware_fence has failed 1 times on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> common_apply_stickiness: vmware_fence can fail 4 more times on q‑gp2‑dbpg57‑3
> before being forced off
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: master_color:
> Promoting postgresql‑10‑main:0 (Master q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: master_color:
> postgresql‑ha: Promoted 1 instances of a possible 1 to master
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: RecurringOp:
> Start recurring monitor (60s) for vmware_fence on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑master‑vip (Started q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑10‑main:0 (Master q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑10‑main:1 (Slave q‑gp2‑dbpg57‑3)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑10‑main:2 (Slave q‑gp2‑dbpg57‑2)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: notice: LogActions: Recover
> vmware_fence (Started q‑gp2‑dbpg57‑3)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: ‑‑‑ 0.700.5 2
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: +++ 0.700.6 (null)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib: @num_updates=6
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib/status/node_state[@id='3']/transient_attributes[@id='3']/instance_attributes[@id='status‑3']/nvpair[@id='status‑3‑fail‑count‑vmware_fence']:
> @value=INFINITY
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: info:
> do_state_transition: State transition S_POLICY_ENGINE ‑>
> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
> origin=handle_response ]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: notice: process_pe_message:
> Calculated Transition 5085: /var/lib/pacemaker/pengine/pe‑input‑730.bz2
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: notice:
> abort_transition_graph: Transition aborted by
> status‑3‑fail‑count‑vmware_fence, fail‑count‑vmware_fence=INFINITY:
> Transient attribute change (modify cib=0.700.6, source=abort_unless_down:329,
> path=/cib/status/node_state[@id='3']/transient_attributes[@id='3']/instance_attributes[@id='status‑3']/nvpair[@id='status‑3‑fail‑count‑vmware_fence'], 0)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=q‑gp2‑dbpg57‑1/attrd/46, version=0.700.6)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info:
> cib_process_request: Forwarding cib_modify operation for section
> status to master (origin=local/attrd/47)
> Jul 30 21:46:30 [3881] q‑gp2‑dbpg57‑1 attrd: info: attrd_cib_callback:
> Update 46 for fail‑count‑vmware_fence: OK (0)
> Jul 30 21:46:30 [3881] q‑gp2‑dbpg57‑1 attrd: info: attrd_cib_callback:
> Update 46 for fail‑count‑vmware_fence[q‑gp2‑dbpg57‑2]=5: OK (0)
> Jul 30 21:46:30 [3881] q‑gp2‑dbpg57‑1 attrd: info: attrd_cib_callback:
> Update 46 for fail‑count‑vmware_fence[q‑gp2‑dbpg57‑3]=INFINITY: OK (0)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: ‑‑‑ 0.700.6 2
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: +++ 0.700.7 (null)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib: @num_updates=7
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib/status/node_state[@id='3']/lrm[@id='3']/lrm_resources/lrm_resource[@id='vmware_fence']/lrm_rsc_op[@id='vmware_fence_last_0']:
> @operation_key=vmware_fence_stop_0, @operation=stop,
> @transition‑key=4:5085:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060,
> @transition‑magic=0:0;4:5085:0:68fc0c5a‑8a09‑4d53‑90d5‑c1a237542060, @call‑id=43,
> @rc‑code=0, @op‑status=0, @last‑run=1532987190, @last‑rc‑change=1532987190,
> @exec‑time=0
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: notice: run_graph:
> Transition 5085 (Complete=2, Pending=0, Fired=0, Skipped=1, Incomplete=2,
> Source=/var/lib/pacemaker/pengine/pe‑input‑730.bz2): Stopped
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: info:
> do_state_transition: State transition S_TRANSITION_ENGINE ‑>
> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=q‑gp2‑dbpg57‑3/crmd/33, version=0.700.7)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: ‑‑‑ 0.700.7 2
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> Diff: +++ 0.700.8 (null)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib: @num_updates=8
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info: cib_perform_op:
> + /cib/status/node_state[@id='3']/transient_attributes[@id='3']/instance_attributes[@id='status‑3']/nvpair[@id='status‑3‑last‑failure‑vmware_fence']:
> @value=1532987190
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: info:
> abort_transition_graph: Transition aborted by
> status‑3‑last‑failure‑vmware_fence, last‑failure‑vmware_fence=1532987190:
> Transient attribute change (modify cib=0.700.8, source=abort_unless_down:329,
> path=/cib/status/node_state[@id='3']/transient_attributes[@id='3']/instance_attributes[@id='status‑3']/nvpair[@id='status‑3‑last‑failure‑vmware_fence'], 1)
> Jul 30 21:46:30 [3878] q‑gp2‑dbpg57‑1 cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=q‑gp2‑dbpg57‑1/attrd/47, version=0.700.8)
> Jul 30 21:46:30 [3881] q‑gp2‑dbpg57‑1 attrd: info: attrd_cib_callback:
> Update 47 for last‑failure‑vmware_fence: OK (0)
> Jul 30 21:46:30 [3881] q‑gp2‑dbpg57‑1 attrd: info: attrd_cib_callback:
> Update 47 for last‑failure‑vmware_fence[q‑gp2‑dbpg57‑2]=1532448714: OK (0)
> Jul 30 21:46:30 [3881] q‑gp2‑dbpg57‑1 attrd: info: attrd_cib_callback:
> Update 47 for last‑failure‑vmware_fence[q‑gp2‑dbpg57‑3]=1532987190: OK (0)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status_fencing: Node q‑gp2‑dbpg57‑1 is active
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status: Node q‑gp2‑dbpg57‑1 is online
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status_fencing: Node q‑gp2‑dbpg57‑3 is active
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status: Node q‑gp2‑dbpg57‑3 is online
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status_fencing: Node q‑gp2‑dbpg57‑2 is active
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_online_status: Node q‑gp2‑dbpg57‑2 is online
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑master‑vip active on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑master‑vip active on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:0 active in master mode on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:0 active in master mode on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:1 active on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:1 active on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> unpack_rsc_op_failure: Processing failed op start for vmware_fence on
> q‑gp2‑dbpg57‑3: unknown error (1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> unpack_rsc_op_failure: Processing failed op monitor for vmware_fence on
> q‑gp2‑dbpg57‑2: unknown error (1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:2 active on q‑gp2‑dbpg57‑2
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info:
> determine_op_status: Operation monitor found resource
> postgresql‑10‑main:2 active on q‑gp2‑dbpg57‑2
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: native_print:
> postgresql‑master‑vip (ocf::heartbeat:IPaddr2): Started q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: clone_print:
> Master/Slave Set: postgresql‑ha [postgresql‑10‑main]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: short_print:
> Masters: [ q‑gp2‑dbpg57‑1 ]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: short_print:
> Slaves: [ q‑gp2‑dbpg57‑2 q‑gp2‑dbpg57‑3 ]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: native_print:
> vmware_fence (stonith:fence_vmware_rest): Stopped
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: get_failcount_full:
> vmware_fence has failed 5 times on q‑gp2‑dbpg57‑2
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> common_apply_stickiness: Forcing vmware_fence away from q‑gp2‑dbpg57‑2 after
> 5 failures (max=5)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: get_failcount_full:
> vmware_fence has failed INFINITY times on q‑gp2‑dbpg57‑3
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: warning:
> common_apply_stickiness: Forcing vmware_fence away from q‑gp2‑dbpg57‑3 after
> 1000000 failures (max=5)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: master_color:
> Promoting postgresql‑10‑main:0 (Master q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: master_color:
> postgresql‑ha: Promoted 1 instances of a possible 1 to master
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: RecurringOp:
> Start recurring monitor (60s) for vmware_fence on q‑gp2‑dbpg57‑1
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑master‑vip (Started q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑10‑main:0 (Master q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑10‑main:1 (Slave q‑gp2‑dbpg57‑3)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: info: LogActions: Leave
> postgresql‑10‑main:2 (Slave q‑gp2‑dbpg57‑2)
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: notice: LogActions: Start
> vmware_fence (q‑gp2‑dbpg57‑1)
> Jul 30 21:46:30 [3883] q‑gp2‑dbpg57‑1 crmd: info:
> do_state_transition: State transition S_POLICY_ENGINE ‑>
> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
> origin=handle_response ]
> Jul 30 21:46:30 [3882] q‑gp2‑dbpg57‑1 pengine: notice: process_pe_message:
> Calculated Transition 5086: /var/lib/pacemaker/pengine/pe‑input‑731.bz2
> Jul 30 21:46:30 [3880] q‑gp2‑dbpg57‑1 lrmd: info: log_execute:
> executing ‑ rsc:vmware_fence action:start call_id:77
> Jul 30 21:46:30 [3879] q‑gp2‑dbpg57‑1 stonith‑ng: warning: log_action:
> fence_vmware_rest[5739] stderr: [ 2018‑07‑30 21:46:30,895 ERROR: Unable to
> connect/login to fencing device ]
> Jul 30 21:46:30 [3879] q‑gp2‑dbpg57‑1 stonith‑ng: warning: log_action:
> fence_vmware_rest[5739] stderr: [ ]
> Jul 30 21:46:30 [3879] q‑gp2‑dbpg57‑1 stonith‑ng: warning: log_action:
> fence_vmware_rest[5739] stderr: [ ]
> Jul 30 21:46:30 [3879] q‑gp2‑dbpg57‑1 stonith‑ng: info:
> internal_stonith_action_execute: Attempt 2 to execute fence_vmware_rest
> (monitor). remaining timeout is 20
>
>
>> On 2018‑08‑01, at 1:39 PM, Casey Allen Shobe <casey.allen.shobe at icloud.com>
>> wrote:
>>
>> Across our clusters, I see the fence agent stop working, with no apparent
>> reason. It looks like what is shown below. I've found that I can do a
>> `pcs resource cleanup vmware_fence` to cause it to start back up again in a
>> few seconds, but why is this happening and how can I prevent it?
>>
>> vmware_fence (stonith:fence_vmware_rest): Stopped
>>
>> Failed Actions:
>> * vmware_fence_start_0 on q‑gp2‑dbpg57‑1 'unknown error' (1): call=77,
>> status=Error, exitreason='none',
>> last‑rc‑change='Mon Jul 30 21:46:30 2018', queued=1ms, exec=1862ms
>> * vmware_fence_start_0 on q‑gp2‑dbpg57‑3 'unknown error' (1): call=42,
>> status=Error, exitreason='none',
>> last‑rc‑change='Mon Jul 30 21:46:27 2018', queued=0ms, exec=1510ms
>> * vmware_fence_monitor_60000 on q‑gp2‑dbpg57‑2 'unknown error' (1): call=84,
>> status=Error, exitreason='none',
>> last‑rc‑change='Tue Jul 24 16:11:42 2018', queued=0ms, exec=12142ms
>>
>> Thank you,
>> ‑‑
>> Casey
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org