<div dir="ltr">This is my messages log.<br><div><br>Jul 27 08:02:46 vmx-occ-005 apache(WebSite)[32477]: INFO: apache not running<br>Jul 27 08:02:46 vmx-occ-005 crmd[31424]: notice: process_lrm_event: Operation WebSite_monitor_60000: not running (node=node1, call=11, rc=7, cib-update=15, confirmed=false)<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_cs_dispatch: Update relayed from node2<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-WebSite (1)<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_perform_update: Sent update 12: fail-count-WebSite=1<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_cs_dispatch: Update relayed from node2<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-WebSite (1437976962)<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_perform_update: Sent update 14: last-failure-WebSite=1437976962<br>Jul 27 08:02:46 vmx-occ-005 apache(WebSite)[32511]: INFO: apache is not running.<br>Jul 27 08:02:46 vmx-occ-005 crmd[31424]: notice: process_lrm_event: Operation WebSite_stop_0: ok (node=node1, call=14, rc=0, cib-update=16, confirmed=true)<br><br></div><div>this is my corosync log:<br><br><br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/crmd/15)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: --- 0.38.65 2<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: +++ 0.38.66 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: + /cib: @num_updates=66<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: + /cib/status/node_state[@id='node1']/lrm[@id='node1']/lrm_resources/lrm_resource[@id='WebSite']/lrm_rsc_op[@id='WebSite_last_failure_0']: @operation_key=WebSite_monitor_60000, 
@transition-key=9:119038:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @transition-magic=0:7;9:119038:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @call-id=11, @rc-code=7, @interval=60000, @last-rc-change=1437976966, @exec-time=0, @op-digest=eddc33bef3f1592ad847638ee4<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/15, version=0.38.66)<br>Jul 27 08:02:46 [31422] vmx-occ-005 attrd: notice: attrd_cs_dispatch: Update relayed from node2<br>Jul 27 08:02:46 [31422] vmx-occ-005 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-WebSite (1)<br>Jul 27 08:02:46 [31422] vmx-occ-005 attrd: notice: attrd_perform_update: Sent update 12: fail-count-WebSite=1<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/attrd/12)<br>Jul 27 08:02:46 [31422] vmx-occ-005 attrd: notice: attrd_cs_dispatch: Update relayed from node2<br>Jul 27 08:02:46 [31422] vmx-occ-005 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-WebSite (1437976962)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: --- 0.38.66 2<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: +++ 0.38.67 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: + /cib: @num_updates=67<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='node1']/transient_attributes[@id='node1']/instance_attributes[@id='status-node1']: <nvpair id="status-node1-fail-count-WebSite" name="fail-count-WebSite" value="1"/><br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=node1/attrd/12, version=0.38.67)<br>Jul 27 08:02:46 [31422] vmx-occ-005 attrd: notice: attrd_perform_update: Sent update 14: 
last-failure-WebSite=1437976962<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/attrd/14)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: --- 0.38.67 2<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: +++ 0.38.68 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: + /cib: @num_updates=68<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='node1']/transient_attributes[@id='node1']/instance_attributes[@id='status-node1']: <nvpair id="status-node1-last-failure-WebSite" name="last-failure-WebSite" value="1437976962"/><br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=node1/attrd/14, version=0.38.68)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=node2/attrd/404, version=0.38.68)<br>Jul 27 08:02:46 [31421] vmx-occ-005 lrmd: info: cancel_recurring_action: Cancelling operation WebSite_monitor_60000<br>Jul 27 08:02:46 [31424] vmx-occ-005 crmd: info: do_lrm_rsc_op: Performing key=3:119728:0:a5b747ee-4fbc-4f65-a690-29276791fd19 op=WebSite_stop_0<br>Jul 27 08:02:46 [31421] vmx-occ-005 lrmd: info: log_execute: executing - rsc:WebSite action:stop call_id:14<br>Jul 27 08:02:46 [31424] vmx-occ-005 crmd: info: process_lrm_event: Operation WebSite_monitor_60000: Cancelled (node=node1, call=11, confirmed=true)<br>apache(WebSite)[32511]: 2015/07/27_08:02:46 INFO: apache is not running.<br>Jul 27 08:02:46 [31421] vmx-occ-005 lrmd: info: log_finished: finished - rsc:WebSite action:stop call_id:14 pid:32511 exit-code:0 exec-time:167ms queue-time:0ms<br>Jul 27 08:02:46 [31424] vmx-occ-005 crmd: notice: process_lrm_event: Operation WebSite_stop_0: ok (node=node1, call=14, rc=0, cib-update=16, 
confirmed=true)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/crmd/16)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: --- 0.38.68 2<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: Diff: +++ 0.38.69 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: + /cib: @num_updates=69<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_perform_op: + /cib/status/node_state[@id='node1']/lrm[@id='node1']/lrm_resources/lrm_resource[@id='WebSite']/lrm_rsc_op[@id='WebSite_last_0']: @operation_key=WebSite_stop_0, @operation=stop, @transition-key=3:119728:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @transition-magic=0:0;3:119728:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @call-id=14, @last-run=1437976966, @last-rc-change=1437976966, @exec-time=167<br>Jul 27 08:02:46 [31419] vmx-occ-005 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/16, version=0.38.69)<br>Jul 27 08:02:51 [31419] vmx-occ-005 cib: info: cib_process_ping: Reporting our current digest to node2: 608e7e54d63c1f66c39c9b4162a189d3 for 0.38.69 (0x846320 0)<br><br></div><div>These are the logs from after I triggered the failure. Pacemaker doesn't restart the service automatically; even if I start the httpd service manually, the status I get is stopped on node1. If I restart the cluster, it works fine.<br><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jul 27, 2015 at 11:30 AM, Vijay Partha <span dir="ltr"><<a href="mailto:vijaysarathy94@gmail.com" target="_blank">vijaysarathy94@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Could you help me configure stonith properly? I am new to Pacemaker and have been working with it for only a few days. 
Which logs do you need?<br></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Mon, Jul 27, 2015 at 11:22 AM, Digimer <span dir="ltr"><<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div><div>On 27/07/15 01:35 AM, Vijay Partha wrote:<br>
> Hi,<br>
><br>
> My configuration file looks like this:<br>
><br>
> <cib crm_feature_set="3.0.9" validate-with="pacemaker-2.0" epoch="38"<br>
> num_updates="0" admin_epoch="0" cib-last-written="Fri Jul 24 15:57:06<br>
> 2015" have-quorum="1" dc-uuid="node2"><br>
> <configuration><br>
> <crm_config><br>
> <cluster_property_set id="cib-bootstrap-options"><br>
> <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"<br>
> value="1.1.11-97629de"/><br>
> <nvpair id="cib-bootstrap-options-cluster-infrastructure"<br>
> name="cluster-infrastructure" value="cman"/><br>
> <nvpair id="cib-bootstrap-options-stonith-enabled"<br>
> name="stonith-enabled" value="false"/><br>
> <nvpair id="cib-bootstrap-options-no-quorum-policy"<br>
> name="no-quorum-policy" value="ignore"/><br>
> <nvpair id="cib-bootstrap-options-cluster-recheck-interval"<br>
> name="cluster-recheck-interval" value="2s"/><br>
> </cluster_property_set><br>
> </crm_config><br>
> <nodes><br>
> <node id="node1" uname="node1"/><br>
> <node id="node2" uname="node2"/><br>
> </nodes><br>
> <resources><br>
> <primitive class="ocf" id="my_first_svc" provider="heartbeat"<br>
> type="Dummy"><br>
> <instance_attributes id="my_first_svc-instance_attributes"/><br>
> <operations><br>
> <op id="my_first_svc-start-timeout-20" interval="0s"<br>
> name="start" timeout="20"/><br>
> <op id="my_first_svc-stop-timeout-20" interval="0s"<br>
> name="stop" timeout="20"/><br>
> <op id="my_first_svc-monitor-interval-120s" interval="120s"<br>
> name="monitor"/><br>
> </operations><br>
> </primitive><br>
> <primitive class="ocf" id="WebSite" provider="heartbeat"<br>
> type="apache"><br>
> <instance_attributes id="WebSite-instance_attributes"><br>
> <nvpair id="WebSite-instance_attributes-configfile"<br>
> name="configfile" value="/etc/httpd/conf/httpd.conf"/><br>
> <nvpair id="WebSite-instance_attributes-statusurl"<br>
</div></div></div></div>> name="statusurl" value="<a href="http://localhost/server-status" rel="noreferrer" target="_blank">http://localhost/server-status</a>"/><div><div class="h5"><br>
<div><div>> </instance_attributes><br>
> <operations><br>
> <op id="WebSite-start-timeout-40s" interval="0s" name="start"<br>
> timeout="40s" on-fail="restart"/><br>
> <op id="WebSite-stop-timeout-60s" interval="0s" name="stop"<br>
> timeout="60s" on-fail="restart"/><br>
> <op id="WebSite-monitor-interval-1min" interval="1min"<br>
> name="monitor" on-fail="restart"/><br>
> </operations><br>
> <meta_attributes id="WebSite-meta_attributes"/><br>
> </primitive><br>
> </resources><br>
> <constraints><br>
> <rsc_location id="location-WebSite-node2-50" node="node2"<br>
> rsc="WebSite" score="50"/><br>
> </constraints><br>
> <rsc_defaults><br>
> <meta_attributes id="rsc_defaults-options"><br>
> <nvpair id="rsc_defaults-options-migration-threshold"<br>
> name="migration-threshold" value="1"/><br>
> </meta_attributes><br>
> </rsc_defaults><br>
> <op_defaults><br>
> <meta_attributes id="op_defaults-options"><br>
> <nvpair id="op_defaults-options-timeout" name="timeout"<br>
> value="240s"/><br>
> </meta_attributes><br>
> </op_defaults><br>
> </configuration><br>
> </cib><br>
><br>
> Once I stop the httpd service, Pacemaker does not restart it<br>
> automatically.<br>
<br>
</div></div>As mentioned, logs help a lot. Please send the logs from all nodes, starting before<br>
you trigger the failure and running until after the logs stop printing.<br>
<br>
Also, you must use stonith. Please configure and test it. Often problems<br>
go away when stonith is configured and working properly.<br>
</div></div><span><br>
--<br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca/w/" rel="noreferrer" target="_blank">https://alteeve.ca/w/</a><br>
</span><span class=""><span>What if the cure for cancer is trapped in the mind of a person without<br>
access to education?<br>
<br>
_______________________________________________<br>
</span></span>Users mailing list: <a href="mailto:Users@clusterlabs.org" target="_blank">Users@clusterlabs.org</a><span class=""><br>
<span><a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>
<br>
Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
</span>Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
</span></blockquote></div><span class="HOEnZb"><font color="#888888"><br><br clear="all"><br>-- <br><div><div dir="ltr"><div>With Regards<br></div>P.Vijay<br></div></div>
</font></span></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div>With Regards<br></div>P.Vijay<br></div></div>
</div>
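<div>A minimal sketch for cross-checking the two facts visible in this thread: the quoted configuration sets <tt>migration-threshold</tt> to 1 in <tt>rsc_defaults</tt>, and the attrd log lines report <tt>fail-count-WebSite=1</tt>. In Pacemaker, a resource whose fail count has reached its migration-threshold is not allowed to run on that node again until the fail count is cleared, so comparing the two values is a reasonable first check. The helper names (<tt>latest_fail_counts</tt>, <tt>migration_threshold</tt>, <tt>resources_at_threshold</tt>) are hypothetical and not part of any Pacemaker tool, and the inputs are short excerpts copied from the messages above rather than full files.</div>

```python
import re
import xml.etree.ElementTree as ET

# Excerpts copied from the thread; a real check would read the full log and CIB.
LOG_LINES = [
    "Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_perform_update: "
    "Sent update 12: fail-count-WebSite=1",
    "Jul 27 08:02:46 vmx-occ-005 attrd[31422]: notice: attrd_perform_update: "
    "Sent update 14: last-failure-WebSite=1437976962",
]

CIB_FRAGMENT = """
<rsc_defaults>
  <meta_attributes id="rsc_defaults-options">
    <nvpair id="rsc_defaults-options-migration-threshold"
            name="migration-threshold" value="1"/>
  </meta_attributes>
</rsc_defaults>
"""

# Matches "fail-count-<resource>=<number>" as printed by attrd in this thread.
FAIL_COUNT_RE = re.compile(r"fail-count-(?P<rsc>[\w-]+)=(?P<count>\d+)")


def latest_fail_counts(lines):
    """Return {resource: last fail count seen} from attrd 'Sent update' lines."""
    counts = {}
    for line in lines:
        if "attrd_perform_update" not in line:
            continue
        match = FAIL_COUNT_RE.search(line)
        if match:
            counts[match.group("rsc")] = int(match.group("count"))
    return counts


def migration_threshold(xml_text):
    """Return the numeric migration-threshold nvpair value, or None if absent."""
    root = ET.fromstring(xml_text)
    for nvpair in root.iter("nvpair"):
        if nvpair.get("name") == "migration-threshold":
            try:
                return int(nvpair.get("value"))
            except (TypeError, ValueError):
                return None  # non-numeric values such as INFINITY are not handled here
    return None


def resources_at_threshold(lines, xml_text):
    """List resources whose logged fail count has reached the threshold."""
    threshold = migration_threshold(xml_text)
    if threshold is None:
        return []
    counts = latest_fail_counts(lines)
    return sorted(rsc for rsc, count in counts.items() if count >= threshold)


if __name__ == "__main__":
    print(resources_at_threshold(LOG_LINES, CIB_FRAGMENT))
```

<div>If a resource shows up in that list, clearing its failure history (for example with <tt>crm_resource --cleanup --resource WebSite</tt>) and checking the cluster status again is worth trying. This only inspects one node's log, so the same check would need to be run against the logs from node2 as well.</div>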