<div dir="ltr"><div>Hi Klaus</div><div><br></div>Running service mycustomprog status in a shell returns fine, that is, with no errors. It does not hang.<div><br></div><div>Suresh</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 1, 2017 at 9:43 PM, Klaus Wenninger <span dir="ltr"><<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Suresh!<br>
<br>
Have you tried lsb-status in a shell?<br>
Does it show anything interesting or is it hanging?<br>
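<br>
A minimal sketch of such a check, since the cluster's monitor for an lsb: resource judges the service by the exit status of the script's status action rather than by the text it prints. It assumes the init script follows the LSB convention (exit status 0 for running, 3 for not running); the helper names status_rc and lsb_meaning are made up for illustration:<br>

```shell
#!/bin/sh
# Hypothetical helpers, not part of pacemaker or the init script.

# Run a status command, discard its output, and print only its exit code.
status_rc() {
    "$@" >/dev/null 2>&1
    echo "rc=$?"
}

# Translate the two LSB status codes that matter most here.
lsb_meaning() {
    case "$1" in
        0) echo "running" ;;
        3) echo "not running" ;;
        *) echo "other (see the LSB init script spec)" ;;
    esac
}

# Usage on the cluster node (not executed here):
#   status_rc service mycustomprog status
```

If the printed code is ever non-zero while the text still says the program is running, that mismatch would be worth looking at first.<br>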
<br>
Regards,<br>
Klaus<br>
<br>
On 12/30/2016 08:45 AM, Suresh Rajagopalan wrote:<br>
> The cluster is running CentOS 6.8 with pacemaker/corosync. This config was<br>
> running well for quite some time. All of a sudden we see regular<br>
> restarts of the monitored process, where corosync thinks it has<br>
> failed (even though it really has not failed). I am showing the<br>
> relevant logs and config below. Any pointers appreciated, as it is not<br>
> clear why this would occur.<br>
><br>
> Thanks<br>
> Suresh<br>
><br>
> Dec 28 13:18:20 [2198] a.b.com    pengine:     info:<br>
> LogActions:  Leave   mycustomprog       (Started a.b.com)<br>
> Dec 28 13:22:03 [2199] a.b.com       crmd:     info:<br>
> process_lrm_event:   Operation mycustomprog_monitor_10000: not running<br>
> (node=a.b.com, call=29, rc=7, cib-update=1427,<br>
> confirmed=false)<br>
> Dec 28 13:22:03 [2199] a.b.com       crmd:   notice:<br>
> process_lrm_event:   a.b.com-mycustomprog_monitor_10000:29 [<br>
> mycustomprogram (pid  15657) is running...\n ]<br>
> Dec 28 13:22:03 [2194] a.b.com        cib:     info:<br>
> cib_perform_op:      ++ /cib/status/node_state[@id='a.b.com']/lrm[@id='a.b.com']/lrm_resources/lrm_resource[@id='mycustomprog']:<br>
>  <lrm_rsc_op id="mycustomprog_last_failure_0"<br>
> operation_key="mycustomprog_monitor_10000" operation="monitor"<br>
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"<br>
> transition-key="7:462:0:a9dbbd47-975b-4aee-8b4a-de56e0a8e7a7"<br>
> transition-magic="0<br>
> Dec 28 13:22:03 [2199] a.b.com       crmd:     info:<br>
> abort_transition_graph:      Transition aborted by<br>
> mycustomprog_monitor_10000 'create' on a.b.com: Old<br>
> event (magic=0:7;7:462:0:a9dbbd47-975b-4aee-8b4a-de56e0a8e7a7,<br>
> cib=0.48.2038786, source=process_graph_event:605, 1)<br>
> Dec 28 13:22:03 [2199] a.b.com       crmd:     info:<br>
> update_failcount:    Updating failcount for mycustomprog on a.b.com<br>
> after failed monitor: rc=7 (update=value++,<br>
> time=1482931323)<br>
> Dec 28 13:22:03 [2199] a.b.com       crmd:     info:<br>
> process_graph_event: Detected action (462.7)<br>
> mycustomprog_monitor_10000.29=not running: failed<br>
> Dec 28 13:22:03 [2197] a.b.com      attrd:   notice:<br>
> attrd_trigger_update:        Sending flush op to all hosts for:<br>
> fail-count-mycustomprog (1)<br>
> Dec 28 13:22:03 [2197] a.b.com      attrd:   notice:<br>
> attrd_perform_update:        Sent update 18: fail-count-mycustomprog=1<br>
> Dec 28 13:22:03 [2194] a.b.com        cib:     info:<br>
> cib_perform_op:      ++ /cib/status/node_state[@id='a.b.com']/transient_attributes[@id='a.b.com']/instance_attributes[@id='status-a.b.com']:  <nvpair<br>
> id="status-a.b.com-fail-count-mycustomprog"<br>
> name="fail-count-mycustomprog" value="1"/><br>
> Dec 28 13:22:03 [2197] a.b.com      attrd:   notice:<br>
> attrd_trigger_update:        Sending flush op to all hosts for:<br>
> last-failure-mycustomprog (1482931323)<br>
> Dec 28 13:22:03 [2197] a.b.com      attrd:   notice:<br>
> attrd_perform_update:        Sent update 20:<br>
> last-failure-mycustomprog=1482931323<br>
> Dec 28 13:22:03 [2197] a.b.com      attrd:   notice:<br>
> attrd_perform_update:        Sent update 20:<br>
> last-failure-mycustomprog=1482931323<br>
> Dec 28 13:22:03 [2194] a.b.com        cib:     info:<br>
> cib_perform_op:      ++ /cib/status/node_state[@id='a.b.com']/transient_attributes[@id='a.b.com']/instance_attributes[@id='status-a.b.com']:  <nvpair<br>
> id="status-a.b.com-last-failure-mycustomprog"<br>
> name="last-failure-mycustomprog" value="1482931323"/><br>
> Dec 28 13:22:04 [2199] a.b.com       crmd:     info:<br>
> abort_transition_graph:      Transition aborted by<br>
> status-a.b.com-fail-count-mycustomprog, fail-count-mycustomprog=1:<br>
> Transient attribute change (create cib=0.48.2038787,<br>
> source=abort_unless_down:329, path=/cib/status/node_state[@id='a.b.com']/transient_attributes[@id='a.b.com']/instance_attributes[@id='status-a.b.com<br>
> Dec 28 13:22:04 [2199] a.b.com       crmd:     info:<br>
> abort_transition_graph:      Transition aborted by<br>
> status-a.b.com-last-failure-mycustomprog,<br>
> last-failure-mycustomprog=1482931323: Transient attribute change<br>
> (create cib=0.48.2038788, source=abort_unless_down:329,<br>
> path=/cib/status/node_state[@id='a.b.com']/transient_attributes[@id='a.b.com']/instance_attributes[@id='status-macshii00002-hva.gs.r11.<br>
> Dec 28 13:22:04 [2198] a.b.com    pengine:  warning:<br>
> unpack_rsc_op_failure:       Processing failed op monitor for<br>
> mycustomprog on a.b.com: not running (7)<br>
> Dec 28 13:22:04 [2198] a.b.com    pengine:     info:<br>
> native_print:        mycustomprog       (lsb:mycustomprog):<br>
>  FAILED a.b.com<br>
> Dec 28 13:22:04 [2198] a.b.com    pengine:     info:<br>
> get_failcount_full:  mycustomprog has failed 1 times on a.b.com<br>
> Dec 28 13:22:04 [2198] a.b.com    pengine:     info:<br>
> common_apply_stickiness:     mycustomprog can fail 999999 more times<br>
> on a.b.com before being forced off<br>
> Dec 28 13:22:04 [2198] a.b.com    pengine:     info:<br>
> RecurringOp:  Start recurring monitor (10s) for mycustomprog on<br>
> a.b.com<br>
> Dec 28 13:22:04 [2198] a.b.com    pengine:   notice:<br>
> LogActions:  Recover mycustomprog       (Started a.b.com)<br>
> Dec 28 13:22:04 [2199] a.b.com       crmd:   notice:<br>
> te_rsc_command:      Initiating action 5: stop mycustomprog_stop_0 on<br>
> a.b.com (local)<br>
> Dec 28 13:22:04 [2196] a.b.com       lrmd:     info:<br>
> cancel_recurring_action:     Cancelling lsb operation<br>
> mycustomprog_status_10000<br>
><br>
><br>
><br>
> pcs config:<br>
><br>
> source settings.rc<br>
><br>
> pcs property set stonith-enabled=false<br>
> pcs property set no-quorum-policy=ignore<br>
> pcs resource create ClusterIP2 IPaddr2 ip=$MYVIP cidr_netmask=$NETMASKVIP1<br>
> pcs resource create ClusterIP3 IPaddr2 ip=$MYVIP2 cidr_netmask=$NETMASKVIP2<br>
> pcs resource create mycustomprog lsb:mycustomprog op monitor interval="10s"<br>
> pcs constraint colocation add ClusterIP3 with ClusterIP2 INFINITY<br>
> pcs constraint colocation add mycustomprog with ClusterIP2 INFINITY<br>
> pcs property set start-failure-is-fatal=false<br>
> pcs resource defaults resource-stickiness=100<br>
> pcs constraint colocation add chkhealth with ClusterIP2 INFINITY<br>
><br>
><br>
><br>
> ______________________________<wbr>_________________<br>
> Users mailing list: <a href="mailto:Users@clusterlabs.org" target="_blank">Users@clusterlabs.org</a><br>
> <a href="http://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.clusterlabs.org/m<wbr>ailman/listinfo/users</a><br>
><br>
> Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc<wbr>/Cluster_from_Scratch.pdf</a><br>
> Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
<br>
<br>
<br>
</blockquote></div><br></div></div></div>