<div dir="ltr">This is my messages log.<br><div><br>Jul 27 08:02:46 vmx-occ-005 apache(WebSite)[32477]: INFO: apache not running<br>Jul 27 08:02:46 vmx-occ-005 crmd[31424]:   notice: process_lrm_event: Operation WebSite_monitor_60000: not running (node=node1, call=11, rc=7, cib-update=15, confirmed=false)<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_cs_dispatch: Update relayed from node2<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-WebSite (1)<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_perform_update: Sent update 12: fail-count-WebSite=1<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_cs_dispatch: Update relayed from node2<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-WebSite (1437976962)<br>Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_perform_update: Sent update 14: last-failure-WebSite=1437976962<br>Jul 27 08:02:46 vmx-occ-005 apache(WebSite)[32511]: INFO: apache is not running.<br>Jul 27 08:02:46 vmx-occ-005 crmd[31424]:   notice: process_lrm_event: Operation WebSite_stop_0: ok (node=node1, call=14, rc=0, cib-update=16, confirmed=true)<br><br></div><div>this is my corosync log:<br><br><br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Forwarding cib_modify operation for section status to master (origin=local/crmd/15)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: --- 0.38.65 2<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: +++ 0.38.66 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       +  /cib:  @num_updates=66<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       +  
/cib/status/node_state[@id='node1']/lrm[@id='node1']/lrm_resources/lrm_resource[@id='WebSite']/lrm_rsc_op[@id='WebSite_last_failure_0']:  @operation_key=WebSite_monitor_60000, @transition-key=9:119038:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @transition-magic=0:7;9:119038:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @call-id=11, @rc-code=7, @interval=60000, @last-rc-change=1437976966, @exec-time=0, @op-digest=eddc33bef3f1592ad847638ee4<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/15, version=0.38.66)<br>Jul 27 08:02:46 [31422] vmx-occ-005      attrd:   notice: attrd_cs_dispatch:    Update relayed from node2<br>Jul 27 08:02:46 [31422] vmx-occ-005      attrd:   notice: attrd_trigger_update:         Sending flush op to all hosts for: fail-count-WebSite (1)<br>Jul 27 08:02:46 [31422] vmx-occ-005      attrd:   notice: attrd_perform_update:         Sent update 12: fail-count-WebSite=1<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Forwarding cib_modify operation for section status to master (origin=local/attrd/12)<br>Jul 27 08:02:46 [31422] vmx-occ-005      attrd:   notice: attrd_cs_dispatch:    Update relayed from node2<br>Jul 27 08:02:46 [31422] vmx-occ-005      attrd:   notice: attrd_trigger_update:         Sending flush op to all hosts for: last-failure-WebSite (1437976962)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: --- 0.38.66 2<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: +++ 0.38.67 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       +  /cib:  @num_updates=67<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       ++ /cib/status/node_state[@id='node1']/transient_attributes[@id='node1']/instance_attributes[@id='status-node1']:  <nvpair 
id="status-node1-fail-count-WebSite" name="fail-count-WebSite" value="1"/><br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0, origin=node1/attrd/12, version=0.38.67)<br>Jul 27 08:02:46 [31422] vmx-occ-005      attrd:   notice: attrd_perform_update:         Sent update 14: last-failure-WebSite=1437976962<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Forwarding cib_modify operation for section status to master (origin=local/attrd/14)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: --- 0.38.67 2<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: +++ 0.38.68 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       +  /cib:  @num_updates=68<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       ++ /cib/status/node_state[@id='node1']/transient_attributes[@id='node1']/instance_attributes[@id='status-node1']:  <nvpair id="status-node1-last-failure-WebSite" name="last-failure-WebSite" value="1437976962"/><br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0, origin=node1/attrd/14, version=0.38.68)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0, origin=node2/attrd/404, version=0.38.68)<br>Jul 27 08:02:46 [31421] vmx-occ-005       lrmd:     info: cancel_recurring_action:      Cancelling operation WebSite_monitor_60000<br>Jul 27 08:02:46 [31424] vmx-occ-005       crmd:     info: do_lrm_rsc_op:        Performing key=3:119728:0:a5b747ee-4fbc-4f65-a690-29276791fd19 op=WebSite_stop_0<br>Jul 27 08:02:46 [31421] vmx-occ-005       lrmd:     info: log_execute:  executing - rsc:WebSite action:stop call_id:14<br>Jul 27 08:02:46 [31424] 
vmx-occ-005       crmd:     info: process_lrm_event:    Operation WebSite_monitor_60000: Cancelled (node=node1, call=11, confirmed=true)<br>apache(WebSite)[32511]: 2015/07/27_08:02:46 INFO: apache is not running.<br>Jul 27 08:02:46 [31421] vmx-occ-005       lrmd:     info: log_finished:         finished - rsc:WebSite action:stop call_id:14 pid:32511 exit-code:0 exec-time:167ms queue-time:0ms<br>Jul 27 08:02:46 [31424] vmx-occ-005       crmd:   notice: process_lrm_event:    Operation WebSite_stop_0: ok (node=node1, call=14, rc=0, cib-update=16, confirmed=true)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Forwarding cib_modify operation for section status to master (origin=local/crmd/16)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: --- 0.38.68 2<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       Diff: +++ 0.38.69 (null)<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       +  /cib:  @num_updates=69<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_perform_op:       +  /cib/status/node_state[@id='node1']/lrm[@id='node1']/lrm_resources/lrm_resource[@id='WebSite']/lrm_rsc_op[@id='WebSite_last_0']:  @operation_key=WebSite_stop_0, @operation=stop, @transition-key=3:119728:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @transition-magic=0:0;3:119728:0:a5b747ee-4fbc-4f65-a690-29276791fd19, @call-id=14, @last-run=1437976966, @last-rc-change=1437976966, @exec-time=167<br>Jul 27 08:02:46 [31419] vmx-occ-005        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0, origin=node1/crmd/16, version=0.38.69)<br>Jul 27 08:02:51 [31419] vmx-occ-005        cib:     info: cib_process_ping:     Reporting our current digest to node2: 608e7e54d63c1f66c39c9b4162a189d3 for 0.38.69 (0x846320 0)<br><br></div><div>These are the logs after i have triggered the failure. 
Pacemaker does not restart the service automatically. Even if I start the httpd service manually, the resource status is still reported as stopped on node1. If I restart the cluster, it works fine.<br><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jul 27, 2015 at 11:30 AM, Vijay Partha <span dir="ltr"><<a href="mailto:vijaysarathy94@gmail.com" target="_blank">vijaysarathy94@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Could you help me configure stonith properly? I am new to Pacemaker and have only been working with it for a few days. Which logs do you require?<br></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Mon, Jul 27, 2015 at 11:22 AM, Digimer <span dir="ltr"><<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div><div>On 27/07/15 01:35 AM, Vijay Partha wrote:<br>
> Hi,<br>
><br>
> My configuration file looks like this:<br>
><br>
> <cib crm_feature_set="3.0.9" validate-with="pacemaker-2.0" epoch="38"<br>
> num_updates="0" admin_epoch="0" cib-last-written="Fri Jul 24 15:57:06<br>
> 2015" have-quorum="1" dc-uuid="node2"><br>
>   <configuration><br>
>     <crm_config><br>
>       <cluster_property_set id="cib-bootstrap-options"><br>
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"<br>
> value="1.1.11-97629de"/><br>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure"<br>
> name="cluster-infrastructure" value="cman"/><br>
>         <nvpair id="cib-bootstrap-options-stonith-enabled"<br>
> name="stonith-enabled" value="false"/><br>
>         <nvpair id="cib-bootstrap-options-no-quorum-policy"<br>
> name="no-quorum-policy" value="ignore"/><br>
>         <nvpair id="cib-bootstrap-options-cluster-recheck-interval"<br>
> name="cluster-recheck-interval" value="2s"/><br>
>       </cluster_property_set><br>
>     </crm_config><br>
>     <nodes><br>
>       <node id="node1" uname="node1"/><br>
>       <node id="node2" uname="node2"/><br>
>     </nodes><br>
>     <resources><br>
>       <primitive class="ocf" id="my_first_svc" provider="heartbeat"<br>
> type="Dummy"><br>
>         <instance_attributes id="my_first_svc-instance_attributes"/><br>
>         <operations><br>
>           <op id="my_first_svc-start-timeout-20" interval="0s"<br>
> name="start" timeout="20"/><br>
>           <op id="my_first_svc-stop-timeout-20" interval="0s"<br>
> name="stop" timeout="20"/><br>
>           <op id="my_first_svc-monitor-interval-120s" interval="120s"<br>
> name="monitor"/><br>
>         </operations><br>
>       </primitive><br>
>       <primitive class="ocf" id="WebSite" provider="heartbeat"<br>
> type="apache"><br>
>         <instance_attributes id="WebSite-instance_attributes"><br>
>           <nvpair id="WebSite-instance_attributes-configfile"<br>
> name="configfile" value="/etc/httpd/conf/httpd.conf"/><br>
>           <nvpair id="WebSite-instance_attributes-statusurl"<br>
</div></div></div></div>> name="statusurl" value="<a href="http://localhost/server-status" rel="noreferrer" target="_blank">http://localhost/server-status</a>"/><div><div class="h5"><br>
<div><div>>         </instance_attributes><br>
>         <operations><br>
>           <op id="WebSite-start-timeout-40s" interval="0s" name="start"<br>
> timeout="40s" on-fail="restart"/><br>
>           <op id="WebSite-stop-timeout-60s" interval="0s" name="stop"<br>
> timeout="60s" on-fail="restart"/><br>
>           <op id="WebSite-monitor-interval-1min" interval="1min"<br>
> name="monitor" on-fail="restart"/><br>
>         </operations><br>
>         <meta_attributes id="WebSite-meta_attributes"/><br>
>       </primitive><br>
>     </resources><br>
>     <constraints><br>
>       <rsc_location id="location-WebSite-node2-50" node="node2"<br>
> rsc="WebSite" score="50"/><br>
>     </constraints><br>
>     <rsc_defaults><br>
>       <meta_attributes id="rsc_defaults-options"><br>
>         <nvpair id="rsc_defaults-options-migration-threshold"<br>
> name="migration-threshold" value="1"/><br>
>       </meta_attributes><br>
>     </rsc_defaults><br>
>     <op_defaults><br>
>       <meta_attributes id="op_defaults-options"><br>
>         <nvpair id="op_defaults-options-timeout" name="timeout"<br>
> value="240s"/><br>
>       </meta_attributes><br>
>     </op_defaults><br>
>   </configuration><br>
> </cib><br>
><br>
> Once I stop the httpd service, Pacemaker does not restart it<br>
> automatically.<br>
<br>
</div></div>As mentioned, logs help a lot. Please post the logs from all nodes, starting before<br>
you trigger the failure and continuing until the logs stop printing.<br>
<br>
Also, you must use stonith. Please configure and test it. Often problems<br>
go away when stonith is configured and working properly.<br>
</div></div><span><br>
--<br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca/w/" rel="noreferrer" target="_blank">https://alteeve.ca/w/</a><br>
</span><span class=""><span>What if the cure for cancer is trapped in the mind of a person without<br>
access to education?<br>
<br>
_______________________________________________<br>
</span></span>Users mailing list: <a href="mailto:Users@clusterlabs.org" target="_blank">Users@clusterlabs.org</a><span class=""><br>
<span><a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>
<br>
Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
</span>Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
</span></blockquote></div><span class="HOEnZb"><font color="#888888"><br><br clear="all"><br>-- <br><div><div dir="ltr"><div>With Regards<br></div>P.Vijay<br></div></div>
</font></span></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div>With Regards<br></div>P.Vijay<br></div></div>
</div>
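For anyone comparing the logs against the configuration in this thread, here is a minimal Python sketch, not a diagnosis. It pulls the last fail-count that attrd reported for each resource out of the log text and reads the migration-threshold from the rsc_defaults section of the CIB, so the two can be compared. The sample strings are trimmed copies of the log lines and CIB quoted above. The function names (latest_fail_counts, default_migration_threshold) are made up for this illustration and are not part of Pacemaker. The sketch assumes a numeric threshold and returns None for anything else, such as INFINITY.

```python
import re
import xml.etree.ElementTree as ET

# Sample attrd lines copied from the messages log in this thread.
LOG = """\
Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_perform_update: Sent update 12: fail-count-WebSite=1
Jul 27 08:02:46 vmx-occ-005 attrd[31422]:   notice: attrd_perform_update: Sent update 14: last-failure-WebSite=1437976962
"""

# Trimmed copy of the rsc_defaults section from the quoted CIB.
CIB = """\
<cib>
  <configuration>
    <rsc_defaults>
      <meta_attributes id="rsc_defaults-options">
        <nvpair id="rsc_defaults-options-migration-threshold"
                name="migration-threshold" value="1"/>
      </meta_attributes>
    </rsc_defaults>
  </configuration>
</cib>
"""

# Matches e.g. "fail-count-WebSite=1" and captures the resource id and count.
FAIL_COUNT = re.compile(r"fail-count-(?P<rsc>[\w-]+)=(?P<count>\d+)")


def latest_fail_counts(log_text):
    """Return {resource: last fail-count seen} from attrd_perform_update lines."""
    counts = {}
    for line in log_text.splitlines():
        if "attrd_perform_update" not in line:
            continue
        match = FAIL_COUNT.search(line)
        if match:
            counts[match.group("rsc")] = int(match.group("count"))
    return counts


def default_migration_threshold(cib_xml):
    """Return the numeric rsc_defaults migration-threshold, or None if absent."""
    root = ET.fromstring(cib_xml)
    rsc_defaults = root.find("./configuration/rsc_defaults")
    if rsc_defaults is None:
        return None
    for nvpair in rsc_defaults.iter("nvpair"):
        if nvpair.get("name") == "migration-threshold":
            value = nvpair.get("value", "")
            # Non-numeric values such as INFINITY are not handled in this sketch.
            return int(value) if value.isdigit() else None
    return None


if __name__ == "__main__":
    threshold = default_migration_threshold(CIB)
    for rsc, count in latest_fail_counts(LOG).items():
        reached = threshold is not None and count >= threshold
        print(f"{rsc}: fail-count={count}, "
              f"migration-threshold={threshold}, reached={reached}")
```

If the fail-count has reached the threshold, Pacemaker treats that node as ineligible to run the resource until the count is cleared, which may be worth ruling out here. One way to clear it is crm_resource --cleanup --resource WebSite. This is separate from the stonith advice in the quoted reply and does not replace it.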