<div dir="ltr"><div>Hi Dejan,</div><div><br></div>The trace command prints the output below. I then ran <i>crm resource restart sc_vip</i> as well, but I could not find any trace. Is there anything else I need to do?<div><br></div><div><div>root@sc-node-1:/var/lib/heartbeat# crm resource trace sc_vip stop</div><div>INFO: restart sc_vip to get the trace</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 29, 2015 at 2:10 PM, Dejan Muhamedagic <span dir="ltr"><<a href="mailto:dejanmm@fastmail.fm" target="_blank">dejanmm@fastmail.fm</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>

<span class=""><br>

On Thu, Oct 29, 2015 at 10:40:18AM +0530, Pritam Kharat wrote:<br>

> Thank you very much for the reply, Ken. I will try the steps you suggested.<br>

<br>

</span>If you cannot figure out from the logs why the stop operation<br>

times out, you can also try to trace the resource agent:<br>

<br>

# crm resource help trace<br>

# crm resource trace vip stop<br>

<br>

Then take a look at the trace or post it somewhere.<br>
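<br>
A minimal sketch for locating the trace afterwards, assuming the agent's shell library writes traces under $HA_VARLIB/trace_ra/ in a per-agent directory, with files named after the resource and the action. That layout, the default of /var/lib/heartbeat, and the helper names trace_dir and list_traces are assumptions and are not taken from this thread; check them against the installed resource-agents version.<br>
<pre>
```shell
#!/bin/sh
# Assumption: traces are written to
#   $HA_VARLIB/trace_ra/AGENT_TYPE/RESOURCE.ACTION.TIMESTAMP
HA_VARLIB="${HA_VARLIB:-/var/lib/heartbeat}"

# Print the directory where traces for a given agent type are expected.
trace_dir() {
    printf '%s/trace_ra/%s\n' "$HA_VARLIB" "$1"
}

# List trace files for one resource and action, newest first.
# Arguments: agent type, resource id, action.
list_traces() {
    dir=$(trace_dir "$1")
    ls -1t "$dir"/"$2"."$3".* 2>/dev/null
}

# Example call (hypothetical resource name from this thread):
#   list_traces IPaddr2 sc_vip stop
```
</pre>
If nothing is listed after the restart, it may be worth checking that the directory exists on the node where the stop actually ran.<br>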

<br>

Thanks,<br>

<br>

Dejan<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

> On Wed, Oct 28, 2015 at 11:23 PM, Ken Gaillot <<a href="mailto:kgaillot@redhat.com">kgaillot@redhat.com</a>> wrote:<br>

><br>

> > On 10/28/2015 03:51 AM, Pritam Kharat wrote:<br>

> > > Hi All,<br>

> > ><br>

> > > I am facing an issue in my two-node HA cluster. When I stop pacemaker on the ACTIVE<br>

> > > node, it takes a long time to stop, and during this time the migration of the VIP and<br>

> > > the other resources to the STANDBY node fails. (I have seen the same issue when<br>

> > > the ACTIVE node is rebooted.)<br>

> ><br>

> > I assume STANDBY in this case is just a description of the node's<br>

> > purpose, and does not mean that you placed the node in pacemaker's<br>

> > standby mode. If the node really is in standby mode, it can't run any<br>

> > resources.<br>

> ><br>

> > > Last change: Wed Oct 28 02:52:57 2015 via cibadmin on node-1<br>

> > > Stack: corosync<br>

> > > Current DC: node-1 (1) - partition with quorum<br>

> > > Version: 1.1.10-42f2063<br>

> > > 2 Nodes configured<br>

> > > 2 Resources configured<br>

> > ><br>

> > ><br>

> > > Online: [ node-1 node-2 ]<br>

> > ><br>

> > > Full list of resources:<br>

> > ><br>

> > >  resource (upstart:resource): Stopped<br>

> > >  vip (ocf::heartbeat:IPaddr2): Started node-2 (unmanaged) FAILED<br>

> > ><br>

> > > Migration summary:<br>

> > > * Node node-1:<br>

> > > * Node node-2:<br>

> > ><br>

> > > Failed actions:<br>

> > >     vip_stop_0 (node=node-2, call=-1, rc=1, status=Timed Out,<br>

> > > last-rc-change=Wed Oct 28 03:05:24 2015<br>

> > > , queued=0ms, exec=0ms<br>

> > > ): unknown error<br>

> > ><br>

> > > The VIP monitor is failing here with a Timed Out error. What is the usual<br>

> > > reason for a timeout? I have set default-action-timeout=180secs, which<br>

> > > should be enough for monitoring.<br>

> ><br>

> > 180s should be far more than enough, so something must be going wrong.<br>

> > Notice that it is the stop operation on the active node that is failing.<br>

> > Normally in such a case, pacemaker would fence that node to be sure that<br>

> > it is safe to bring it up elsewhere, but you have disabled stonith.<br>

> ><br>

> > Fencing is important in failure recovery such as this, so it would be a<br>

> > good idea to try to get it implemented.<br>

> ><br>

> > > I have added an ordering constraint so that the other resources start only<br>

> > > after vip has started.<br>

> > > Any clue how to solve this problem? Most of the time this VIP monitoring is<br>

> > > failing with a Timed Out error.<br>

> ><br>

> > The "stop" in "vip_stop_0" means that the stop operation is what failed.<br>

> > Have you seen timeouts on any other operations?<br>

> ><br>

> > Look through the logs around the time of the failure, and try to see if<br>

> > there are any indications as to why the stop failed.<br>

> ><br>

> > If you can set aside some time for testing or have a test cluster that<br>

> > exhibits the same issue, you can try unmanaging the resource in<br>

> > pacemaker, then:<br>

> ><br>

> > 1. Try adding/removing the IP via normal system commands, and make sure<br>

> > that works.<br>

> ><br>

> > 2. Try running the resource agent manually (with any verbose option) to<br>

> > start/stop/monitor the IP to see if you can reproduce the problem and<br>

> > get more messages.<br>
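<br>
The two steps above can be sketched as a small dry-run script. The address 192.0.2.10/24, the interface eth0, the OCF root /usr/lib/ocf and the helper names run, step1 and step2 are placeholders chosen for illustration, not values from this cluster. By default the script only prints the commands; setting DRY_RUN=0 would execute them, which needs root and should only be done while the resource is unmanaged.<br>
<pre>
```shell
#!/bin/sh
# Hypothetical values; substitute the address and interface of your VIP.
VIP="${VIP:-192.0.2.10}"
PREFIX="${PREFIX:-24}"
NIC="${NIC:-eth0}"
OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"

# Print a command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Step 1: add and remove the address with plain iproute2 commands.
step1() {
    run ip addr add "$VIP/$PREFIX" dev "$NIC"
    run ip addr show dev "$NIC"
    run ip addr del "$VIP/$PREFIX" dev "$NIC"
}

# Step 2: call the IPaddr2 agent directly with the parameters the
# cluster would normally pass in the environment.
step2() {
    for action in start monitor stop; do
        run env OCF_ROOT="$OCF_ROOT" OCF_RESKEY_ip="$VIP" \
            OCF_RESKEY_cidr_netmask="$PREFIX" OCF_RESKEY_nic="$NIC" \
            "$OCF_ROOT/resource.d/heartbeat/IPaddr2" "$action"
    done
}

step1
step2
```
</pre>
Timing each agent call (for example with the shell's time keyword) may show whether the stop itself is slow or whether the delay comes from elsewhere.<br>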

> ><br>

> > _______________________________________________<br>

> > Users mailing list: <a href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><br>

> > <a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>

> ><br>

> > Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>

> > Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>

> > Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>

> ><br>

><br>

><br>

><br>

> --<br>

> Thanks and Regards,<br>

> Pritam Kharat.<br>

<br>


</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Thanks and Regards,<br>Pritam Kharat.<br></div>

</div>