[ClusterLabs] GlusterFS all apache instances stopped after failover
lukas
lukas.kostyan at gmail.com
Thu May 7 11:23:50 UTC 2015
Hi,
have a config with 2 nodes where glusterfs is being mounted to root dir
of apache. After producing a kernel panic by echo c >
/proc/sysrq-trigger on node2, node 2 is turned off by stonith. ping
should work further on node1 but it stops since apache is stopped on
node1 as well. So why is apache stopped on node1, has someone any ideas.
Below the logs
following setup: pacemaker 1.1.7, corosync 1.4
Online: [ vm-1 vm-2 ]
Clone Set: cl_gluster_mnt [p_gluster_mnt]
Started: [ vm-1 vm-2 ]
Clone Set: cl_apache [p_apache]
Started: [ vm-1 vm-2 ]
Clone Set: cl_IP [IP] (unique)
IP:0 (ocf::heartbeat:IPaddr2): Started vm-1
IP:1 (ocf::heartbeat:IPaddr2): Started vm-2
p_fence_N1 (stonith:external/libvirt): Started vm-2
p_fence_N2 (stonith:external/libvirt): Started vm-1
root at vm-1:~# crm configure show
node vm-1 \
attributes standby="off"
node vm-2 \
attributes standby="off"
primitive IP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.200" nic="eth0"
clusterip_hash="sourceip-sourceport" \
op monitor interval="10s"
primitive p_apache ocf:heartbeat:apache \
params configfile="/etc/apache2/apache2.conf"
statusurl="http://localhost/server-status" \
op monitor interval="60" timeout="20" \
op start interval="0" timeout="40s" start-delay="0" \
meta is-managed="true"
primitive p_fence_N1 stonith:external/libvirt \
params hostlist="vm-1:N1"
hypervisor_uri="qemu+tcp://192.168.122.1/system" pcmk_reboot_action="off" \
op monitor interval="60" \
meta target-role="Started"
primitive p_fence_N2 stonith:external/libvirt \
params hostlist="vm-2:N2"
hypervisor_uri="qemu+tcp://192.168.122.1/system" pcmk_reboot_action="off" \
op monitor interval="60"
primitive p_gluster_mnt ocf:heartbeat:Filesystem \
params device="localhost:/gvolrep" directory="/var/www/html"
fstype="glusterfs" \
op monitor interval="10"
clone cl_IP IP \
meta globally-unique="true" clone-max="2" clone-node-max="2" \
params resource-stickiness="0"
clone cl_apache p_apache \
meta target-role="Started"
clone cl_gluster_mnt p_gluster_mnt \
meta target-role="Started"
location l_fence_N1 p_fence_N1 -inf: vm-1
location l_fence_N2 p_fence_N2 -inf: vm-2
colocation c_apache_gluster inf: cl_IP cl_gluster_mnt
colocation c_ip_apache inf: cl_apache cl_IP
order o_apache inf: cl_gluster_mnt cl_IP cl_apache
property $id="cib-bootstrap-options" \
dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="true" \
no-quorum-policy="ignore" \
last-lrm-refresh="1430996556"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
op_defaults $id="op-options" \
timeout="240s"
############################
root at vm-1:~# tail -f /var/log/syslog
May 7 13:19:58 vm-1 crmd: [3039]: info: te_rsc_command: Initiating
action 27: monitor p_apache:1_monitor_60000 on vm-2
May 7 13:19:58 vm-1 lrmd: [3036]: info: operation monitor[54] on
p_apache:0 for client 3039: pid 27310 exited with return code 0
May 7 13:19:58 vm-1 crmd: [3039]: info: process_lrm_event: LRM
operation p_apache:0_monitor_60000 (call=54, rc=0, cib-update=322,
confirmed=false) ok
May 7 13:19:58 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 30
(Complete=34, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-1906.bz2): Complete
May 7 13:19:58 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
May 7 13:20:12 vm-1 stonith-ng: [3035]: info: stonith_command:
Processed st_execute from lrmd: rc=-1
May 7 13:20:12 vm-1 external/libvirt[27429]: [27444]: notice:
qemu+tcp://192.168.122.1/system: Running hypervisor: QEMU 2.0.0
May 7 13:20:13 vm-1 stonith: [27422]: info: external/libvirt device OK.
May 7 13:20:13 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: Performing: stonith -t external/libvirt -S
May 7 13:20:13 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: success: 0
May 7 13:21:08 vm-1 corosync[2971]: [TOTEM ] A processor failed,
forming new configuration.
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 7024: memb=1,
new=0, lost=1
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] info: pcmk_peer_update:
memb: vm-1 2138745024
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] info: pcmk_peer_update:
lost: vm-2 930785472
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 7024: memb=1, new=0,
lost=0
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] info: pcmk_peer_update:
MEMB: vm-1 2138745024
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] info:
ais_mark_unseen_peer_dead: Node vm-2 was not seen in the previous transition
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] info: update_member:
Node 930785472/vm-2 is now: lost
May 7 13:21:12 vm-1 corosync[2971]: [pcmk ] info:
send_member_notification: Sending membership update 7024 to 2 children
May 7 13:21:12 vm-1 cib: [3034]: notice: ais_dispatch_message:
Membership 7024: quorum lost
May 7 13:21:12 vm-1 cib: [3034]: info: crm_update_peer: Node vm-2:
id=930785472 state=lost (new) addr=r(0) ip(192.168.122.183) votes=1
born=7020 seen=7020 proc=00000000000000000000000000111312
May 7 13:21:12 vm-1 crmd: [3039]: notice: ais_dispatch_message:
Membership 7024: quorum lost
May 7 13:21:12 vm-1 crmd: [3039]: info: ais_status_callback: status:
vm-2 is now lost (was member)
May 7 13:21:12 vm-1 corosync[2971]: [TOTEM ] A processor joined or
left the membership and a new membership was formed.
May 7 13:21:12 vm-1 crmd: [3039]: info: crm_update_peer: Node vm-2:
id=930785472 state=lost (new) addr=r(0) ip(192.168.122.183) votes=1
born=7020 seen=7020 proc=00000000000000000000000000111312
May 7 13:21:12 vm-1 corosync[2971]: [CPG ] chosen downlist: sender
r(0) ip(192.168.122.127) ; members(old:2 left:1)
May 7 13:21:12 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_modify for section nodes (origin=local/crmd/323,
version=0.502.55): ok (rc=0)
May 7 13:21:12 vm-1 corosync[2971]: [MAIN ] Completed service
synchronization, ready to provide service.
May 7 13:21:12 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_modify for section cib (origin=local/crmd/325,
version=0.502.57): ok (rc=0)
May 7 13:21:12 vm-1 crmd: [3039]: info: crmd_ais_dispatch: Setting
expected votes to 2
May 7 13:21:12 vm-1 crmd: [3039]: WARN: match_down_event: No match for
shutdown action on vm-2
May 7 13:21:12 vm-1 crmd: [3039]: info: te_update_diff:
Stonith/shutdown of vm-2 not matched
May 7 13:21:12 vm-1 crmd: [3039]: info: abort_transition_graph:
te_update_diff:234 - Triggered transition abort (complete=1,
tag=node_state, id=vm-2, magic=NA, cib=0.502.56) : Node failure
May 7 13:21:12 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
May 7 13:21:12 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_modify for section crm_config (origin=local/crmd/327,
version=0.502.58): ok (rc=0)
May 7 13:21:12 vm-1 pengine: [3038]: notice: unpack_config: On loss of
CCM Quorum: Ignore
May 7 13:21:12 vm-1 pengine: [3038]: WARN: pe_fence_node: Node vm-2
will be fenced because it is un-expectedly down
May 7 13:21:12 vm-1 pengine: [3038]: WARN: determine_online_status:
Node vm-2 is unclean
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action
p_gluster_mnt:1_stop_0 on vm-2 is unrunnable (offline)
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node
vm-2 unclean
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action
p_apache:1_stop_0 on vm-2 is unrunnable (offline)
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node
vm-2 unclean
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action
IP:1_stop_0 on vm-2 is unrunnable (offline)
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node
vm-2 unclean
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Action
p_fence_N1_stop_0 on vm-2 is unrunnable (offline)
May 7 13:21:12 vm-1 pengine: [3038]: WARN: custom_action: Marking node
vm-2 unclean
May 7 13:21:12 vm-1 pengine: [3038]: WARN: stage6: Scheduling Node vm-2
for STONITH
May 7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Stop
p_gluster_mnt:1#011(vm-2)
May 7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Restart
p_apache:0#011(Started vm-1)
May 7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Stop
p_apache:1#011(vm-2)
May 7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Move
IP:1#011(Started vm-2 -> vm-1)
May 7 13:21:12 vm-1 pengine: [3038]: notice: LogActions: Stop
p_fence_N1#011(vm-2)
May 7 13:21:12 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
May 7 13:21:12 vm-1 crmd: [3039]: info: do_te_invoke: Processing graph
31 (ref=pe_calc-dc-1430997672-228) derived from
/var/lib/pengine/pe-warn-661.bz2
May 7 13:21:12 vm-1 crmd: [3039]: notice: te_fence_node: Executing
reboot fencing operation (36) on vm-2 (timeout=60000)
May 7 13:21:12 vm-1 stonith-ng: [3035]: info:
initiate_remote_stonith_op: Initiating remote operation reboot for vm-2:
9a25d1de-b42d-44b1-9d49-368f36fa57fd
May 7 13:21:12 vm-1 pengine: [3038]: WARN: process_pe_message:
Transition 31: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-661.bz2
May 7 13:21:12 vm-1 pengine: [3038]: notice: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
May 7 13:21:12 vm-1 stonith-ng: [3035]: info:
can_fence_host_with_device: Refreshing port list for p_fence_N2
May 7 13:21:12 vm-1 stonith-ng: [3035]: info:
can_fence_host_with_device: p_fence_N2 can fence vm-2: dynamic-list
May 7 13:21:12 vm-1 stonith-ng: [3035]: info: call_remote_stonith:
Requesting that vm-1 perform op reboot vm-2
May 7 13:21:12 vm-1 stonith-ng: [3035]: info:
can_fence_host_with_device: p_fence_N2 can fence vm-2: dynamic-list
May 7 13:21:12 vm-1 stonith-ng: [3035]: info: stonith_fence: Found 1
matching devices for 'vm-2'
May 7 13:21:12 vm-1 stonith-ng: [3035]: info: stonith_command:
Processed st_fence from vm-1: rc=-1
May 7 13:21:12 vm-1 stonith-ng: [3035]: info: make_args: Substituting
action 'off' for requested operation 'reboot'
May 7 13:21:12 vm-1 external/libvirt[27944]: [27957]: notice: Domain N2
was stopped
May 7 13:21:13 vm-1 stonith-ng: [3035]: notice: log_operation:
Operation 'reboot' [27936] (call 0 from
c5982dd8-7639-49be-a9f9-c8488bea6091) for host 'vm-2' with device
'p_fence_N2' returned: 0
May 7 13:21:13 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: Performing: stonith -t external/libvirt -T off vm-2
May 7 13:21:13 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: success: vm-2 0
May 7 13:21:13 vm-1 stonith-ng: [3035]: notice: remote_op_done:
Operation reboot of vm-2 by vm-1 for
vm-1[c5982dd8-7639-49be-a9f9-c8488bea6091]: OK
May 7 13:21:13 vm-1 crmd: [3039]: info: tengine_stonith_callback:
StonithOp <st-reply st_origin="stonith_construct_async_reply"
t="stonith-ng" st_op="reboot"
st_remote_op="9a25d1de-b42d-44b1-9d49-368f36fa57fd"
st_clientid="c5982dd8-7639-49be-a9f9-c8488bea6091" st_target="vm-2"
st_device_action="st_fence" st_callid="0" st_callopt="0" st_rc="0"
st_output="Performing: stonith -t external/libvirt -T off
vm-2#012success: vm-2 0#012" src="vm-1" seq="16" state="2" />
May 7 13:21:13 vm-1 crmd: [3039]: info: erase_status_tag: Deleting
xpath: //node_state[@uname='vm-2']/lrm
May 7 13:21:13 vm-1 crmd: [3039]: info: erase_status_tag: Deleting
xpath: //node_state[@uname='vm-2']/transient_attributes
May 7 13:21:13 vm-1 crmd: [3039]: notice: crmd_peer_update: Status
update: Client vm-2/crmd now has status [offline] (DC=true)
May 7 13:21:13 vm-1 crmd: [3039]: notice: tengine_stonith_notify: Peer
vm-2 was terminated (reboot) by vm-1 for vm-1: OK
(ref=9a25d1de-b42d-44b1-9d49-368f36fa57fd)
May 7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_INTEGRATION [ input=I_NODE_JOIN
cause=C_FSA_INTERNAL origin=check_join_state ]
May 7 13:21:13 vm-1 crmd: [3039]: info: abort_transition_graph:
do_te_invoke:169 - Triggered transition abort (complete=0) : Peer Halt
May 7 13:21:13 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 31
(Complete=3, Pending=0, Fired=0, Skipped=15, Incomplete=5,
Source=/var/lib/pengine/pe-warn-661.bz2): Stopped
May 7 13:21:13 vm-1 crmd: [3039]: info: abort_transition_graph:
do_te_invoke:169 - Triggered transition abort (complete=1) : Peer Halt
May 7 13:21:13 vm-1 crmd: [3039]: info: join_make_offer: Making join
offers based on membership 7024
May 7 13:21:13 vm-1 crmd: [3039]: info: do_dc_join_offer_all: join-11:
Waiting on 1 outstanding join acks
May 7 13:21:13 vm-1 crmd: [3039]: info: update_dc: Set DC to vm-1 (3.0.6)
May 7 13:21:13 vm-1 crmd: [3039]: info: cib_fencing_updated: Fencing
update 329 for vm-2: complete
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_delete for section //node_state[@uname='vm-2']/lrm
(origin=local/crmd/330, version=0.502.60): ok (rc=0)
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_delete for section
//node_state[@uname='vm-2']/transient_attributes (origin=local/crmd/331,
version=0.502.61): ok (rc=0)
May 7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED
cause=C_FSA_INTERNAL origin=check_join_state ]
May 7 13:21:13 vm-1 crmd: [3039]: info: do_dc_join_finalize: join-11:
Syncing the CIB from vm-1 to the rest of the cluster
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_sync for section 'all' (origin=local/crmd/334,
version=0.502.62): ok (rc=0)
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_modify for section nodes (origin=local/crmd/335,
version=0.502.63): ok (rc=0)
May 7 13:21:13 vm-1 crmd: [3039]: info: do_dc_join_ack: join-11:
Updating node state to member for vm-1
May 7 13:21:13 vm-1 crmd: [3039]: info: erase_status_tag: Deleting
xpath: //node_state[@uname='vm-1']/lrm
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_delete for section //node_state[@uname='vm-1']/lrm
(origin=local/crmd/336, version=0.502.64): ok (rc=0)
May 7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED
cause=C_FSA_INTERNAL origin=check_join_state ]
May 7 13:21:13 vm-1 crmd: [3039]: info: abort_transition_graph:
do_te_invoke:162 - Triggered transition abort (complete=1) : Peer Cancelled
May 7 13:21:13 vm-1 attrd: [3037]: notice: attrd_local_callback:
Sending full refresh (origin=crmd)
May 7 13:21:13 vm-1 attrd: [3037]: notice: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-p_apache:0 (1430996422)
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_modify for section nodes (origin=local/crmd/338,
version=0.502.66): ok (rc=0)
May 7 13:21:13 vm-1 cib: [3034]: info: cib_process_request: Operation
complete: op cib_modify for section cib (origin=local/crmd/340,
version=0.502.68): ok (rc=0)
May 7 13:21:13 vm-1 attrd: [3037]: notice: attrd_trigger_update:
Sending flush op to all hosts for: probe_complete (true)
May 7 13:21:13 vm-1 pengine: [3038]: notice: unpack_config: On loss of
CCM Quorum: Ignore
May 7 13:21:13 vm-1 pengine: [3038]: notice: LogActions: Restart
p_apache:0#011(Started vm-1)
May 7 13:21:13 vm-1 pengine: [3038]: notice: LogActions: Start
IP:1#011(vm-1)
May 7 13:21:13 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
May 7 13:21:13 vm-1 crmd: [3039]: info: do_te_invoke: Processing graph
32 (ref=pe_calc-dc-1430997673-233) derived from
/var/lib/pengine/pe-input-1907.bz2
May 7 13:21:13 vm-1 crmd: [3039]: info: te_rsc_command: Initiating
action 14: stop p_apache:0_stop_0 on vm-1 (local)
May 7 13:21:13 vm-1 lrmd: [3036]: info: cancel_op: operation
monitor[54] on p_apache:0 for client 3039, its parameters:
CRM_meta_start_delay=[0] CRM_meta_timeout=[20000]
CRM_meta_name=[monitor] crm_feature_set=[3.0.6]
CRM_meta_clone_node_max=[1] configfile=[/etc/apache2/apache2.conf]
CRM_meta_clone=[0] CRM_meta_interval=[60000] CRM_meta_clone_max=[2]
CRM_meta_notify=[false] statusurl=[http://localhost/server-status]
CRM_meta_globally_unique=[false] cancelled
May 7 13:21:13 vm-1 lrmd: [3036]: info: rsc:p_apache:0 stop[55] (pid 27992)
May 7 13:21:13 vm-1 crmd: [3039]: info: te_rsc_command: Initiating
action 22: start IP:1_start_0 on vm-1 (local)
May 7 13:21:13 vm-1 lrmd: [3036]: info: rsc:IP:1 start[56] (pid 27993)
May 7 13:21:13 vm-1 crmd: [3039]: info: process_lrm_event: LRM
operation p_apache:0_monitor_60000 (call=54, status=1, cib-update=0,
confirmed=true) Cancelled
May 7 13:21:13 vm-1 pengine: [3038]: notice: process_pe_message:
Transition 32: PEngine Input stored in: /var/lib/pengine/pe-input-1907.bz2
May 7 13:21:13 vm-1 stonith-ng: [3035]: info: stonith_command:
Processed st_execute from lrmd: rc=-1
May 7 13:21:13 vm-1 external/libvirt[28069]: [28084]: notice:
qemu+tcp://192.168.122.1/system: Running hypervisor: QEMU 2.0.0
May 7 13:21:14 vm-1 IPaddr2[27993]: INFO: /usr/lib/heartbeat/send_arp
-i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.122.200 eth0
192.168.122.200 8bbd5f0b558f not_used not_used
May 7 13:21:14 vm-1 lrmd: [3036]: info: operation start[56] on IP:1 for
client 3039: pid 27993 exited with return code 0
May 7 13:21:14 vm-1 crmd: [3039]: info: process_lrm_event: LRM
operation IP:1_start_0 (call=56, rc=0, cib-update=342, confirmed=true) ok
May 7 13:21:14 vm-1 crmd: [3039]: info: te_rsc_command: Initiating
action 23: monitor IP:1_monitor_10000 on vm-1 (local)
May 7 13:21:14 vm-1 lrmd: [3036]: info: rsc:IP:1 monitor[57] (pid 28104)
May 7 13:21:14 vm-1 lrmd: [3036]: info: operation monitor[57] on IP:1
for client 3039: pid 28104 exited with return code 0
May 7 13:21:14 vm-1 crmd: [3039]: info: process_lrm_event: LRM
operation IP:1_monitor_10000 (call=57, rc=0, cib-update=343,
confirmed=false) ok
May 7 13:21:14 vm-1 lrmd: [3036]: info: RA output:
(p_apache:0:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/apache: 440:
kill:
May 7 13:21:14 vm-1 lrmd: [3036]: info: RA output:
(p_apache:0:stop:stderr) No such process
May 7 13:21:14 vm-1 lrmd: [3036]: info: RA output:
(p_apache:0:stop:stderr)
May 7 13:21:14 vm-1 apache[27992]: INFO: Killing apache PID 27235
May 7 13:21:14 vm-1 apache[27992]: INFO: apache stopped.
May 7 13:21:14 vm-1 lrmd: [3036]: info: operation stop[55] on
p_apache:0 for client 3039: pid 27992 exited with return code 0
May 7 13:21:14 vm-1 crmd: [3039]: info: process_lrm_event: LRM
operation p_apache:0_stop_0 (call=55, rc=0, cib-update=344,
confirmed=true) ok
May 7 13:21:14 vm-1 crmd: [3039]: info: te_rsc_command: Initiating
action 15: start p_apache:0_start_0 on vm-1 (local)
May 7 13:21:14 vm-1 lrmd: [3036]: info: rsc:p_apache:0 start[58] (pid
28166)
May 7 13:21:14 vm-1 stonith: [28062]: info: external/libvirt device OK.
May 7 13:21:14 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: Performing: stonith -t external/libvirt -S
May 7 13:21:14 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: success: 0
May 7 13:21:54 vm-1 lrmd: [3036]: WARN: p_apache:0:start process (PID
28166) timed out (try 1). Killing with signal SIGTERM (15).
May 7 13:21:54 vm-1 lrmd: [3036]: WARN: operation start[58] on
p_apache:0 for client 3039: pid 28166 timed out
May 7 13:21:54 vm-1 crmd: [3039]: ERROR: process_lrm_event: LRM
operation p_apache:0_start_0 (58) Timed Out (timeout=40000ms)
May 7 13:21:54 vm-1 crmd: [3039]: WARN: status_from_rc: Action 15
(p_apache:0_start_0) on vm-1 failed (target: 0 vs. rc: -2): Error
May 7 13:21:54 vm-1 crmd: [3039]: WARN: update_failcount: Updating
failcount for p_apache:0 on vm-1 after failed start: rc=-2
(update=INFINITY, time=1430997714)
May 7 13:21:54 vm-1 crmd: [3039]: info: abort_transition_graph:
match_graph_event:277 - Triggered transition abort (complete=0,
tag=lrm_rsc_op, id=p_apache:0_last_failure_0,
magic=2:-2;15:32:0:da4997d2-70ea-45a7-9617-42963ea29e42, cib=0.502.74) :
Event failed
May 7 13:21:54 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 32
(Complete=11, Pending=0, Fired=0, Skipped=1, Incomplete=0,
Source=/var/lib/pengine/pe-input-1907.bz2): Stopped
May 7 13:21:54 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=notify_crmd ]
May 7 13:21:54 vm-1 attrd: [3037]: notice: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-p_apache:0 (INFINITY)
May 7 13:21:54 vm-1 attrd: [3037]: notice: attrd_perform_update: Sent
update 94: fail-count-p_apache:0=INFINITY
May 7 13:21:54 vm-1 pengine: [3038]: notice: unpack_config: On loss of
CCM Quorum: Ignore
May 7 13:21:54 vm-1 pengine: [3038]: WARN: unpack_rsc_op: Processing
failed op p_apache:0_last_failure_0 on vm-1: unknown exec error (-2)
May 7 13:21:54 vm-1 attrd: [3037]: notice: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-p_apache:0 (1430997714)
May 7 13:21:54 vm-1 pengine: [3038]: notice: LogActions: Recover
p_apache:0#011(Started vm-1)
May 7 13:21:54 vm-1 attrd: [3037]: notice: attrd_perform_update: Sent
update 96: last-failure-p_apache:0=1430997714
May 7 13:21:54 vm-1 crmd: [3039]: info: abort_transition_graph:
te_update_diff:176 - Triggered transition abort (complete=1, tag=nvpair,
id=status-vm-1-fail-count-p_apache.0, name=fail-count-p_apache:0,
value=INFINITY, magic=NA, cib=0.502.75) : Transient attribute: update
May 7 13:21:54 vm-1 crmd: [3039]: info: handle_response: pe_calc
calculation pe_calc-dc-1430997714-238 is obsolete
May 7 13:21:54 vm-1 crmd: [3039]: info: abort_transition_graph:
te_update_diff:176 - Triggered transition abort (complete=1, tag=nvpair,
id=status-vm-1-last-failure-p_apache.0, name=last-failure-p_apache:0,
value=1430997714, magic=NA, cib=0.502.76) : Transient attribute: update
May 7 13:21:54 vm-1 pengine: [3038]: notice: process_pe_message:
Transition 33: PEngine Input stored in: /var/lib/pengine/pe-input-1908.bz2
May 7 13:21:54 vm-1 pengine: [3038]: notice: unpack_config: On loss of
CCM Quorum: Ignore
May 7 13:21:54 vm-1 pengine: [3038]: WARN: unpack_rsc_op: Processing
failed op p_apache:0_last_failure_0 on vm-1: unknown exec error (-2)
May 7 13:21:54 vm-1 pengine: [3038]: WARN: common_apply_stickiness:
Forcing cl_apache away from vm-1 after 1000000 failures (max=1000000)
May 7 13:21:54 vm-1 pengine: [3038]: WARN: common_apply_stickiness:
Forcing cl_apache away from vm-1 after 1000000 failures (max=1000000)
May 7 13:21:54 vm-1 pengine: [3038]: notice: LogActions: Stop
p_apache:0#011(vm-1)
May 7 13:21:54 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
May 7 13:21:54 vm-1 crmd: [3039]: info: do_te_invoke: Processing graph
34 (ref=pe_calc-dc-1430997714-239) derived from
/var/lib/pengine/pe-input-1909.bz2
May 7 13:21:54 vm-1 crmd: [3039]: info: te_rsc_command: Initiating
action 4: stop p_apache:0_stop_0 on vm-1 (local)
May 7 13:21:54 vm-1 lrmd: [3036]: info: rsc:p_apache:0 stop[59] (pid 28650)
May 7 13:21:54 vm-1 pengine: [3038]: notice: process_pe_message:
Transition 34: PEngine Input stored in: /var/lib/pengine/pe-input-1909.bz2
May 7 13:21:54 vm-1 apache[28650]: INFO: apache is not running.
May 7 13:21:54 vm-1 apache[28650]: INFO: apache children were signalled
(SIGTERM)
May 7 13:21:56 vm-1 apache[28650]: INFO: apache children were signalled
(SIGHUP)
May 7 13:21:57 vm-1 apache[28650]: INFO: apache children were signalled
(SIGKILL)
May 7 13:21:58 vm-1 lrmd: [3036]: info: operation stop[59] on
p_apache:0 for client 3039: pid 28650 exited with return code 0
May 7 13:21:58 vm-1 crmd: [3039]: info: process_lrm_event: LRM
operation p_apache:0_stop_0 (call=59, rc=0, cib-update=349,
confirmed=true) ok
May 7 13:21:58 vm-1 crmd: [3039]: notice: run_graph: ==== Transition 34
(Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-1909.bz2): Complete
May 7 13:21:58 vm-1 crmd: [3039]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
May 7 13:22:14 vm-1 stonith-ng: [3035]: info: stonith_command:
Processed st_execute from lrmd: rc=-1
May 7 13:22:15 vm-1 external/libvirt[28943]: [28958]: notice:
qemu+tcp://192.168.122.1/system: Running hypervisor: QEMU 2.0.0
May 7 13:22:16 vm-1 stonith: [28935]: info: external/libvirt device OK.
May 7 13:22:16 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: Performing: stonith -t external/libvirt -S
May 7 13:22:16 vm-1 stonith-ng: [3035]: info: log_operation:
p_fence_N2: success: 0
Online: [ vm-1 ]
OFFLINE: [ vm-2 ]
Clone Set: cl_gluster_mnt [p_gluster_mnt]
Started: [ vm-1 ]
Stopped: [ p_gluster_mnt:1 ]
Clone Set: cl_IP [IP] (unique)
IP:0 (ocf::heartbeat:IPaddr2): Started vm-1
IP:1 (ocf::heartbeat:IPaddr2): Started vm-1
p_fence_N2 (stonith:external/libvirt): Started vm-1
Failed actions:
p_apache:0_start_0 (node=vm-1, call=58, rc=-2, status=Timed Out):
unknown exec error
More information about the Users
mailing list