May 17 20:38:40 [127218] ao-pg01-p.axadmin.net pengine: info: unpack_node_loop: Node 1 is already processed
May 17 20:38:40 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: ao-cl-p-01-vip01 (ocf::heartbeat:IPaddr2): Started ao-pg01-p.axadmin.net
May 17 20:38:40 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: fence_ao_pg01 (stonith:fence_vmware_soap): Started ao-pg02-p.axadmin.net
May 17 20:38:40 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: fence_ao_pg02 (stonith:fence_vmware_soap): Started ao-pg01-p.axadmin.net
May 17 20:38:40 [127218] ao-pg01-p.axadmin.net pengine: info: pe_get_failcount: fence_ao_pg02 has failed 12 times on ao-pg01-p.axadmin.net
May 17 20:38:40 [127218] ao-pg01-p.axadmin.net pengine: info: check_migration_threshold: fence_ao_pg02 can fail 999988 more times on ao-pg01-p.axadmin.net before being forced off
... ...
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: info: st_child_term: Child 48496 timed out, sending SIGTERM
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: notice: stonith_action_async_done: Child process 48496 performing action 'monitor' timed out with signal 15
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: notice: log_operation: Operation 'monitor' [48496] for device 'fence_ao_pg02' returned: -62 (Timer expired)
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: error: process_lrm_event: Result of monitor operation for fence_ao_pg02 on ao-pg01-p.axadmin.net: Timed Out | call=81 key=fence_ao_pg02_monitor_60000 timeout=20000ms
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/210)
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_perform_op: Diff: --- 0.36.110 2
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_perform_op: Diff: +++ 0.36.111 (null)
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_perform_op: + /cib: @num_updates=111
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='fence_ao_pg02']/lrm_rsc_op[@id='fence_ao_pg02_last_failure_0']: @transition-key=3:41:0:5ee48ba5-e614-43dd-890f-3f930f78ce44, @transition-magic=2:1;3:41:0:5ee48ba5-e614-43dd-890f-3f930f78ce44, @call-id=81, @last-rc-change=1558119133, @exec-time=20042
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=ao-pg01-p.axadmin.net/crmd/210, version=0.36.111)
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='fence_ao_pg02_last_failure_0']
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: info: cib_devices_update: Updating devices to version 0.36.111
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: abort_transition_graph: Transition aborted by operation fence_ao_pg02_monitor_60000 'modify' on ao-pg01-p.axadmin.net: Old event | magic=2:1;3:41:0:5ee48ba5-e614-43dd-890f-3f930f78ce44 cib=0.36.111 source=process_graph_event:499 complete=true
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: update_failcount: Updating failcount for fence_ao_pg02 on ao-pg01-p.axadmin.net after failed monitor: rc=1 (update=value++, time=1558119153)
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: process_graph_event: Detected action (41.3) fence_ao_pg02_monitor_60000.81=unknown error: failed
... ...
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status_fencing: Node ao-pg02-p.axadmin.net is active
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status: Node ao-pg02-p.axadmin.net is online
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status_fencing: Node ao-pg01-p.axadmin.net is active
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status: Node ao-pg01-p.axadmin.net is online
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: warning: unpack_rsc_op_failure: Processing failed monitor of fence_ao_pg02 on ao-pg01-p.axadmin.net: unknown error | rc=1
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: unpack_node_loop: Node 2 is already processed
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: abort_transition_graph: Transition aborted by status-1-fail-count-fence_ao_pg02.monitor_60000 doing modify fail-count-fence_ao_pg02#monitor_60000=13: Transient attribute change | cib=0.36.112 source=abort_unless_down:341 path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-fail-count-fence_ao_pg02.monitor_60000'] complete=true
... ....
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: ao-cl-p-01-vip01 (ocf::heartbeat:IPaddr2): Started ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: fence_ao_pg01 (stonith:fence_vmware_soap): Started ao-pg02-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: fence_ao_pg02 (stonith:fence_vmware_soap): FAILED ao-pg01-p.axadmin.net
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: abort_transition_graph: Transition aborted by status-1-last-failure-fence_ao_pg02.monitor_60000 doing modify last-failure-fence_ao_pg02#monitor_60000=1558119153: Transient attribute change | cib=0.36.113 source=abort_unless_down:341 path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-fence_ao_pg02.monitor_60000'] complete=true
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: pe_get_failcount: fence_ao_pg02 has failed 12 times on ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: check_migration_threshold: fence_ao_pg02 can fail 999988 more times on ao-pg01-p.axadmin.net before being forced off
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: RecurringOp: Start recurring monitor (60s) for fence_ao_pg02 on ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: LogActions: Leave ao-cl-p-01-vip01 (Started ao-pg01-p.axadmin.net)
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: LogActions: Leave fence_ao_pg01 (Started ao-pg02-p.axadmin.net)
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: notice: LogAction: * Recover fence_ao_pg02 ( ao-pg01-p.axadmin.net )
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: notice: process_pe_message: Calculated transition 43, saving inputs in /var/lib/pacemaker/pengine/pe-input-280.bz2
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: handle_response: pe_calc calculation pe_calc-dc-1558119153-115 is obsolete
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: notice: unpack_config: On loss of CCM Quorum: Ignore
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status_fencing: Node ao-pg02-p.axadmin.net is active
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status: Node ao-pg02-p.axadmin.net is online
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status_fencing: Node ao-pg01-p.axadmin.net is active
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: determine_online_status: Node ao-pg01-p.axadmin.net is online
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: warning: unpack_rsc_op_failure: Processing failed monitor of fence_ao_pg02 on ao-pg01-p.axadmin.net: unknown error | rc=1
... ...
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: ao-cl-p-01-vip01 (ocf::heartbeat:IPaddr2): Started ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: fence_ao_pg01 (stonith:fence_vmware_soap): Started ao-pg02-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: common_print: fence_ao_pg02 (stonith:fence_vmware_soap): FAILED ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: pe_get_failcount: fence_ao_pg02 has failed 13 times on ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: check_migration_threshold: fence_ao_pg02 can fail 999987 more times on ao-pg01-p.axadmin.net before being forced off
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: RecurringOp: Start recurring monitor (60s) for fence_ao_pg02 on ao-pg01-p.axadmin.net
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: LogActions: Leave ao-cl-p-01-vip01 (Started ao-pg01-p.axadmin.net)
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: info: LogActions: Leave fence_ao_pg01 (Started ao-pg02-p.axadmin.net)
May 17 20:52:33 [127218] ao-pg01-p.axadmin.net pengine: notice: LogAction: * Recover fence_ao_pg02 ( ao-pg01-p.axadmin.net )
... ...
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: info: process_lrm_event: Result of monitor operation for fence_ao_pg02 on ao-pg01-p.axadmin.net: Cancelled | call=81 key=fence_ao_pg02_monitor_60000 confirmed=true
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=ao-pg01-p.axadmin.net/crmd/214, version=0.36.114)
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='fence_ao_pg02_last_0']
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: info: cib_devices_update: Updating devices to version 0.36.114
May 17 20:52:33 [127219] ao-pg01-p.axadmin.net crmd: notice: process_lrm_event: Result of stop operation for fence_ao_pg02 on ao-pg01-p.axadmin.net: 0 (ok) | call=83 key=fence_ao_pg02_stop_0 confirmed=true cib-update=215
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: notice: unpack_config: On loss of CCM Quorum: Ignore
May 17 20:52:33 [127214] ao-pg01-p.axadmin.net cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/215)
May 17 20:52:33 [127215] ao-pg01-p.axadmin.net stonith-ng: info: cib_device_update: Device fence_ao_pg01 has been disabled on ao-pg01-p.axadmin.net: score=-INFINITY
... ...