Nov 9 14:51:06 ip-10-50-3-251 lrmd[944]: notice: operation_finished: ClusterEIP_54.215.143.166_monitor_5000:29139 [ 2013/11/09_14:51:06 INFO: 54.215.143.166 is here ]
Nov 9 14:51:17 ip-10-50-3-251 lrmd[944]: notice: operation_finished: ClusterEIP_54.215.143.166_monitor_5000:29278 [ 2013/11/09_14:51:17 INFO: 54.215.143.166 is here ]
Nov 9 14:51:33 ip-10-50-3-251 corosync[640]: [TOTEM ] A processor failed, forming new configuration.
Nov 9 14:51:38 ip-10-50-3-251 corosync[640]: [CMAN ] quorum lost, blocking activity
Nov 9 14:51:38 ip-10-50-3-251 corosync[640]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Nov 9 14:51:38 ip-10-50-3-251 corosync[640]: [QUORUM] Members[1]: 2
Nov 9 14:51:38 ip-10-50-3-251 corosync[640]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 9 14:51:38 ip-10-50-3-251 crmd[947]: notice: cman_event_callback: Membership 3004136: quorum lost
Nov 9 14:51:38 ip-10-50-3-251 corosync[640]: [CPG ] chosen downlist: sender r(0) ip(10.50.3.251) ; members(old:2 left:1)
Nov 9 14:51:38 ip-10-50-3-251 corosync[640]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 9 14:51:38 ip-10-50-3-251 kernel: dlm: closing connection to node 1
Nov 9 14:51:38 ip-10-50-3-251 crmd[947]: notice: crm_update_peer_state: cman_event_callback: Node ip-10-50-3-122[1] - state is now lost
Nov 9 14:51:38 ip-10-50-3-251 crmd[947]: warning: check_dead_member: Our DC node (ip-10-50-3-122) left the cluster
Nov 9 14:51:38 ip-10-50-3-251 crmd[947]: notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=check_dead_member ]
Nov 9 14:51:38 ip-10-50-3-251 crmd[947]: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Nov 9 14:51:38 ip-10-50-3-251 attrd[945]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Nov 9 14:51:38 ip-10-50-3-251 attrd[945]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: notice: unpack_config: On loss of CCM Quorum: Ignore
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: pe_fence_node: Node ip-10-50-3-122 will be fenced because the node is no longer part of the cluster
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: determine_online_status: Node ip-10-50-3-122 is unclean
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: crit: get_timet_now: Defaulting to 'now'
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action Varnish:1_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action Varnish:1_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action Varnishlog:1_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action Varnishlog:1_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action Varnishncsa:1_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action Varnishncsa:1_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: custom_action: Action ec2-fencing_stop_0 on ip-10-50-3-122 is unrunnable (offline)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: stage6: Scheduling Node ip-10-50-3-122 for STONITH
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: notice: LogActions: Stop Varnish:1#011(ip-10-50-3-122)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: notice: LogActions: Stop Varnishlog:1#011(ip-10-50-3-122)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: notice: LogActions: Stop Varnishncsa:1#011(ip-10-50-3-122)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: notice: LogActions: Move ec2-fencing#011(Started ip-10-50-3-122 -> ip-10-50-3-251)
Nov 9 14:51:39 ip-10-50-3-251 crmd[947]: notice: te_fence_node: Executing reboot fencing operation (34) on ip-10-50-3-122 (timeout=60000)
Nov 9 14:51:39 ip-10-50-3-251 stonith-ng[943]: notice: handle_request: Client crmd.947.c5e50058 wants to fence (reboot) 'ip-10-50-3-122' with device '(any)'
Nov 9 14:51:39 ip-10-50-3-251 stonith-ng[943]: notice: initiate_remote_stonith_op: Initiating remote operation reboot for ip-10-50-3-122: 73629a19-c784-4951-a48c-5ec37137cc06 (0)
Nov 9 14:51:39 ip-10-50-3-251 pengine[946]: warning: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-warn-5.bz2
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [CMAN ] quorum regained, resuming activity
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [QUORUM] This node is within the primary component and will provide service.
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [QUORUM] Members[2]: 1 2
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [QUORUM] Members[2]: 1 2
Nov 9 14:51:43 ip-10-50-3-251 crmd[947]: notice: cman_event_callback: Membership 3004140: quorum acquired
Nov 9 14:51:43 ip-10-50-3-251 crmd[947]: notice: crm_update_peer_state: cman_event_callback: Node ip-10-50-3-122[1] - state is now member
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [CPG ] chosen downlist: sender r(0) ip(10.50.3.251) ; members(old:1 left:0)
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 9 14:51:43 ip-10-50-3-251 fenced[696]: receive_start 1:11 add node with started_count 5
Nov 9 14:51:43 ip-10-50-3-251 corosync[640]: cman killed by node 1 because we were killed by cman_tool or other application
Nov 9 14:51:43 ip-10-50-3-251 dlm_controld[709]: cluster is down, exiting
Nov 9 14:51:43 ip-10-50-3-251 attrd[945]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
Nov 9 14:51:43 ip-10-50-3-251 dlm_controld[709]: daemon cpg_dispatch error 2
Nov 9 14:51:43 ip-10-50-3-251 attrd[945]: crit: attrd_ais_destroy: Lost connection to Corosync service!
Nov 9 14:51:43 ip-10-50-3-251 attrd[945]: notice: main: Exiting...
Nov 9 14:51:43 ip-10-50-3-251 attrd[945]: notice: main: Disconnecting client 0x161d410, pid=947...
Nov 9 14:51:43 ip-10-50-3-251 gfs_controld[772]: cluster is down, exiting
Nov 9 14:51:43 ip-10-50-3-251 attrd[945]: error: attrd_cib_connection_destroy: Connection to the CIB terminated...
Nov 9 14:51:43 ip-10-50-3-251 gfs_controld[772]: daemon cpg_dispatch error 2
Nov 9 14:51:43 ip-10-50-3-251 fenced[696]: cluster is down, exiting
Nov 9 14:51:43 ip-10-50-3-251 fenced[696]: daemon cpg_dispatch error 2
Nov 9 14:51:43 ip-10-50-3-251 fenced[696]: cpg_dispatch error 2
Nov 9 14:51:43 ip-10-50-3-251 stonith-ng[943]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
Nov 9 14:51:43 ip-10-50-3-251 stonith-ng[943]: error: stonith_peer_ais_destroy: AIS connection terminated
Nov 9 14:51:43 ip-10-50-3-251 lrmd[944]: error: crm_ipc_read: Connection to stonith-ng failed
Nov 9 14:51:43 ip-10-50-3-251 lrmd[944]: error: mainloop_gio_callback: Connection to stonith-ng[0xd17db0] closed (I/O condition=17)
Nov 9 14:51:43 ip-10-50-3-251 lrmd[944]: error: stonith_connection_destroy_cb: LRMD lost STONITH connection
Nov 9 14:51:44 ip-10-50-3-251 lrmd[944]: notice: operation_finished: ClusterEIP_54.215.143.166_monitor_5000:29399 [ 2013/11/09_14:51:44 INFO: 54.215.143.166 is here ]
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=2) Library error
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: error: cpg_connection_destroy: Connection destroyed
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: error: cfg_connection_destroy: Connection destroyed
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: error: pcmk_child_exit: Child process attrd exited (pid=945, rc=1)
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
Nov 9 14:51:45 ip-10-50-3-251 pacemakerd[936]: notice: stop_child: Stopping crmd: Sent -15 to process 947
Nov 9 14:51:45 ip-10-50-3-251 kernel: dlm: closing connection to node 1
Nov 9 14:51:45 ip-10-50-3-251 kernel: dlm: closing connection to node 2
Nov 9 14:51:46 ip-10-50-3-251 cib[942]: error: send_ais_text: Sending message 26 via cpg: FAILED (rc=2): Library error: Connection timed out (110)
Nov 9 14:51:47 ip-10-50-3-251 crmd[947]: error: send_ais_text: Sending message 14 via cpg: FAILED (rc=2): Library error: Connection timed out (110)
Nov 9 14:51:47 ip-10-50-3-251 crmd[947]: error: cman_event_callback: Couldn't query cman cluster details: -1 112
Nov 9 14:51:47 ip-10-50-3-251 crmd[947]: error: pcmk_cman_dispatch: Connection to cman failed: -1
Nov 9 14:51:47 ip-10-50-3-251 crmd[947]: notice: crm_shutdown: Requesting shutdown, upper limit is 1200000ms
Nov 9 14:51:47 ip-10-50-3-251 crmd[947]: warning: do_log: FSA: Input I_SHUTDOWN from crm_shutdown() received in state S_TRANSITION_ENGINE
Nov 9 14:51:48 ip-10-50-3-251 cib[942]: error: send_ais_text: Sending message 27 via cpg: FAILED (rc=2): Library error: Connection timed out (110)
Nov 9 14:51:49 ip-10-50-3-251 crmd[947]: error: send_ais_text: Sending message 15 via cpg: FAILED (rc=2): Library error: Connection timed out (110)
Nov 9 14:51:49 ip-10-50-3-251 crmd[947]: error: do_log: FSA: Input I_ERROR from do_shutdown_req() received in state S_POLICY_ENGINE
Nov 9 14:51:49 ip-10-50-3-251 crmd[947]: warning: do_state_transition: State transition S_POLICY_ENGINE -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=do_shutdown_req ]
Nov 9 14:51:49 ip-10-50-3-251 crmd[947]: error: do_recover: Action A_RECOVER (0000000001000000) not supported
Nov 9 14:51:49 ip-10-50-3-251 crmd[947]: warning: do_election_vote: Not voting in election, we're in state S_RECOVERY
Nov 9 14:51:50 ip-10-50-3-251 cib[942]: error: send_ais_text: Sending message 28 via cpg: FAILED (rc=2): Library error: Connection timed out (110)
Nov 9 14:51:50 ip-10-50-3-251 cib[942]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
Nov 9 14:51:50 ip-10-50-3-251 cib[942]: error: cib_ais_destroy: Corosync connection lost! Exiting.
Nov 9 14:51:50 ip-10-50-3-251 pacemakerd[936]: error: pcmk_child_exit: Child process cib exited (pid=942, rc=64)
Nov 9 14:51:50 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Nov 9 14:51:50 ip-10-50-3-251 crmd[947]: error: internal_ipc_get_reply: Server disconnected client cib_shm while waiting for msg id 100
Nov 9 14:51:50 ip-10-50-3-251 crmd[947]: notice: crm_ipc_send: Connection to cib_shm closed: Transport endpoint is not connected (-107)
Nov 9 14:51:50 ip-10-50-3-251 crmd[947]: error: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Nov 9 14:51:51 ip-10-50-3-251 crmd[947]: notice: terminate_cs_connection: Disconnecting from Corosync
Nov 9 14:51:53 ip-10-50-3-251 crmd[947]: notice: crm_ipc_send: Connection to cib_shm closed
Nov 9 14:51:53 ip-10-50-3-251 crmd[947]: notice: crm_ipc_send: Connection to cib_shm closed
Nov 9 14:51:53 ip-10-50-3-251 crmd[947]: error: cib_native_perform_op_delegate: Couldn't perform cib_slave operation (timeout=120s): -107: Connection timed out (110)
Nov 9 14:51:53 ip-10-50-3-251 crmd[947]: error: cib_native_perform_op_delegate: CIB disconnected
Nov 9 14:51:53 ip-10-50-3-251 crmd[947]: error: do_exit: Could not recover from internal error
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: error: pcmk_child_exit: Child process crmd exited (pid=947, rc=2)
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: notice: stop_child: Stopping pengine: Sent -15 to process 946
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: notice: stop_child: Stopping lrmd: Sent -15 to process 944
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Nov 9 14:51:53 ip-10-50-3-251 pacemakerd[936]: notice: pcmk_shutdown_worker: Shutdown complete
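
The pengine entry at 14:51:39 records the input that produced the STONITH decision in /var/lib/pacemaker/pengine/pe-warn-5.bz2. As a minimal sketch (not taken from the log, and assuming the standard Pacemaker 1.1 command-line tools are installed on the node), that saved transition can be replayed offline with crm_simulate to see why ip-10-50-3-122 was scheduled for fencing:

    # Replay the saved policy-engine input referenced in the log above
    crm_simulate -S -x /var/lib/pacemaker/pengine/pe-warn-5.bz2

    # Same input, additionally printing resource placement scores
    crm_simulate -s -x /var/lib/pacemaker/pengine/pe-warn-5.bz2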