<div dir="ltr"><div>Hi Ken,</div><div><br></div>I ran ocf-tester to test the IPaddr2 agent:<div><br></div><div><i>ocf-tester -n sc_vip -o ip=192.168.20.188 -o cidr_netmask=24 -o nic=eth0 /usr/lib/ocf/resource.d/heartbeat/IPaddr2</i><br></div><div><br></div><div>I got this error in <i>test_command monitor</i>: <i>ERROR: Setup problem: couldn't find command: ip</i>. I verified that the ip command is present, but the error still appears. What might be the reason for this error? Is this okay?</div><div><br></div><div><div>Running: export export OCF_RESOURCE_INSTANCE=sc_vip OCF_RESKEY_ip='192.168.20.188' OCF_RESKEY_cidr_netmask='24' OCF_RESKEY_nic='eth0'; bash /usr/lib/ocf/resource.d/heartbeat/IPaddr2 monitor 2>&1 > /dev/null</div><div><br></div><div>command_output: + : /usr/lib/ocf/lib/heartbeat + . /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs ++ unset LC_ALL ++ export LC_ALL ++ unset LANGUAGE ++ export LANGUAGE +++ basename /usr/lib/ocf/resource.d/heartbeat/IPaddr2 ++ __SCRIPT_NAME=IPaddr2 ++ '[' -z /usr/lib/ocf ']' ++ '[' /usr/lib/ocf/lib/heartbeat = /usr/lib/ocf/resource.d/heartbeat ']' ++ : /usr/lib/ocf/lib/heartbeat ++ . /usr/lib/ocf/lib/heartbeat/ocf-binaries +++ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/sbin:/bin:/usr/sbin:/usr/bin +++ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/sbin:/bin:/usr/sbin:/usr/bin:/usr/ucb +++ export PATH +++ : mawk +++ : /bin/grep -E +++ : +++ : mail +++ : /bin/ping +++ : /bin/bash +++ : /usr/bin/test +++ : /usr/bin/test +++ : basename +++ : blockdev +++ : cat +++ : fsck +++ : fuser +++ : getent +++ : grep +++ : ifconfig +++ : iptables +++ : ip +++ : mdadm +++ : modprobe +++ : mount +++ : msgfmt +++ : netstat +++ : perl +++ : python +++ : raidstart +++ : raidstop +++ : route +++ : umount +++ : reboot +++ : poweroff +++ : wget +++ : whoami +++ : strings +++ : scp +++ : ssh +++ : swig +++ : gzip +++ : tar +++ : md5 +++ : drbdadm +++ : drbdsetup ++ . 
/usr/lib/ocf/lib/heartbeat/ocf-returncodes +++ OCF_SUCCESS=0 +++ OCF_ERR_GENERIC=1 +++ OCF_ERR_ARGS=2 +++ OCF_ERR_UNIMPLEMENTED=3 +++ OCF_ERR_PERM=4 +++ OCF_ERR_INSTALLED=5 +++ OCF_ERR_CONFIGURED=6 +++ OCF_NOT_RUNNING=7 +++ OCF_RUNNING_MASTER=8 +++ OCF_FAILED_MASTER=9 ++ . /usr/lib/ocf/lib/heartbeat/ocf-directories +++ prefix=/usr +++ exec_prefix=/usr +++ : /etc/init.d +++ : /etc/ha.d +++ : /etc/ha.d/rc.d +++ : /etc/ha.d/conf +++ : /etc/ha.d/<a href="http://ha.cf">ha.cf</a> +++ : /var/lib/heartbeat +++ : /var/run/resource-agents +++ : /var/run/heartbeat/rsctmp +++ : /var/lib/heartbeat/fifo +++ : /usr/lib/heartbeat +++ : /usr/sbin +++ : %Y/%m/%d_%T +++ : /dev/null +++ : /etc/ha.d/resource.d +++ : /usr/share/doc/heartbeat +++ : IPaddr2 +++ : /var/run/ +++ : /var/lock/subsys/ ++ . /usr/lib/ocf/lib/heartbeat/ocf-rarun ++ : 0 ++ __ocf_set_defaults monitor ++ __OCF_ACTION=monitor ++ unset LANG ++ LC_ALL=C ++ export LC_ALL ++ '[' -z '' ']' ++ : 0 ++ '[' '!' -d /usr/lib/ocf ']' ++ '[' -z '' ']' ++ : IPaddr2 ++ '[' -z '' ']' ++ : We are being invoked as an init script. ++ : Fill in some things with reasonable values. 
++ : sc_vip ++ return 0 + OCF_RESKEY_lvs_support_default=false + OCF_RESKEY_clusterip_hash_default=sourceip-sourceport + OCF_RESKEY_unique_clone_address_default=false + OCF_RESKEY_arp_interval_default=200 + OCF_RESKEY_arp_count_default=5 + OCF_RESKEY_arp_bg_default=true + OCF_RESKEY_arp_mac_default=ffffffffffff + : false + : sourceip-sourceport + : false + : 200 + : 5 + : true + : ffffffffffff + SENDARP=/usr/lib/heartbeat/send_arp + FINDIF=/usr/lib/heartbeat/findif + VLDIR=/var/run/resource-agents + SENDARPPIDDIR=/var/run/resource-agents + CIP_lockfile=/var/run/resource-agents/IPaddr2-CIP-192.168.20.188 + ocf_is_true false + case "$1" in + false + case $__OCF_ACTION in + ip_validate + check_binary ip + have_binary ip + '[' 1 = 1 ']' + false + '[' 7 = 7 ']' + ocf_log err 'Setup problem: couldn'\''t find command: ip' + '[' 2 -lt 2 ']' + __OCF_PRIO=err + shift + __OCF_MSG='Setup problem: couldn'\''t find command: ip' + case "${__OCF_PRIO}" in + __OCF_PRIO=ERROR + '[' ERROR = DEBUG ']' + ha_log 'ERROR: Setup problem: couldn'\''t find command: ip' + local loglevel + '[' none = '' ']' + tty + '[' x = x0 -a x = xdebug ']' + '[' '' ']' + echo 'ERROR: Setup problem: couldn'\''t find command: ip' ERROR: Setup problem: couldn't find command: ip + return 0 + exit 5</div><div><br></div></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 29, 2015 at 2:32 PM, Pritam Kharat <span dir="ltr"><<a href="mailto:pritam.kharat@oneconvergence.com" target="_blank">pritam.kharat@oneconvergence.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi Dejan</div><div><br></div>It is giving following info. Then I tried <i>crm resource restart sc_vip</i> too but no trace found. 
Is there anything more I need to do apart from this?<div><br></div><div><div>root@sc-node-1:/var/lib/heartbeat# crm resource trace sc_vip stop</div><div>INFO: restart sc_vip to get the trace</div></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 29, 2015 at 2:10 PM, Dejan Muhamedagic <span dir="ltr"><<a href="mailto:dejanmm@fastmail.fm" target="_blank">dejanmm@fastmail.fm</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<span><br>
On Thu, Oct 29, 2015 at 10:40:18AM +0530, Pritam Kharat wrote:<br>
> Thank you very much, Ken, for the reply. I will try your suggested steps.<br>
<br>
</span>If you cannot figure out from the logs why the stop operation<br>
times out, you can also try to trace the resource agent:<br>
<br>
# crm resource help trace<br>
# crm resource trace vip stop<br>
<br>
Then take a look at the trace or post it somewhere.<br>
<br>
Thanks,<br>
<br>
Dejan<br>
<div><div><br>
><br>
> On Wed, Oct 28, 2015 at 11:23 PM, Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>> wrote:<br>
><br>
> > On 10/28/2015 03:51 AM, Pritam Kharat wrote:<br>
> > > Hi All,<br>
> > ><br>
> > > I am facing an issue in my two-node HA cluster. When I stop pacemaker on the<br>
> > > ACTIVE node, it takes a long time to stop, and meanwhile migration of the VIP<br>
> > > and the other resources to the STANDBY node fails. (I have seen the same issue<br>
> > > when the ACTIVE node is rebooted.)<br>
> ><br>
> > I assume STANDBY in this case is just a description of the node's<br>
> > purpose, and does not mean that you placed the node in pacemaker's<br>
> > standby mode. If the node really is in standby mode, it can't run any<br>
> > resources.<br>
> ><br>
> > > Last change: Wed Oct 28 02:52:57 2015 via cibadmin on node-1<br>
> > > Stack: corosync<br>
> > > Current DC: node-1 (1) - partition with quorum<br>
> > > Version: 1.1.10-42f2063<br>
> > > 2 Nodes configured<br>
> > > 2 Resources configured<br>
> > ><br>
> > ><br>
> > > Online: [ node-1 node-2 ]<br>
> > ><br>
> > > Full list of resources:<br>
> > ><br>
> > > resource (upstart:resource): Stopped<br>
> > > vip (ocf::heartbeat:IPaddr2): Started node-2 (unmanaged) FAILED<br>
> > ><br>
> > > Migration summary:<br>
> > > * Node node-1:<br>
> > > * Node node-2:<br>
> > ><br>
> > > Failed actions:<br>
> > > vip_stop_0 (node=node-2, call=-1, rc=1, status=Timed Out,<br>
> > > last-rc-change=Wed Oct 28 03:05:24 2015<br>
> > > , queued=0ms, exec=0ms<br>
> > > ): unknown error<br>
> > ><br>
> > > The VIP monitor is failing here with a Timed Out error. What is the usual<br>
> > > reason for a timeout? I have set default-action-timeout=180secs, which<br>
> > > should be enough for monitoring.<br>
> ><br>
> > 180s should be far more than enough, so something must be going wrong.<br>
> > Notice that it is the stop operation on the active node that is failing.<br>
> > Normally in such a case, pacemaker would fence that node to be sure that<br>
> > it is safe to bring it up elsewhere, but you have disabled stonith.<br>
> ><br>
> > Fencing is important in failure recovery such as this, so it would be a<br>
> > good idea to try to get it implemented.<br>
> ><br>
> > > I have added an order property so that the other resources start only<br>
> > > after vip has started.<br>
> > > Any clue how to solve this problem? Most of the time this VIP monitoring<br>
> > > fails with a Timed Out error.<br>
> ><br>
> > The "stop" in "vip_stop_0" means that the stop operation is what failed.<br>
> > Have you seen timeouts on any other operations?<br>
> ><br>
> > Look through the logs around the time of the failure, and try to see if<br>
> > there are any indications as to why the stop failed.<br>
> ><br>
> > If you can set aside some time for testing or have a test cluster that<br>
> > exhibits the same issue, you can try unmanaging the resource in<br>
> > pacemaker, then:<br>
> ><br>
> > 1. Try adding/removing the IP via normal system commands, and make sure<br>
> > that works.<br>
> ><br>
> > 2. Try running the resource agent manually (with any verbose option) to<br>
> > start/stop/monitor the IP to see if you can reproduce the problem and<br>
> > get more messages.<br>
> ><br>
> > _______________________________________________<br>
> > Users mailing list: <a href="mailto:Users@clusterlabs.org" target="_blank">Users@clusterlabs.org</a><br>
> > <a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>
> ><br>
> > Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
> > Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
> > Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
> ><br>
><br>
><br>
><br>
> --<br>
> Thanks and Regards,<br>
> Pritam Kharat.<br>
<br>
<br>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div>Thanks and Regards,<br>Pritam Kharat.<br></div>
</div>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Thanks and Regards,<br>Pritam Kharat.<br></div>
</div>
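<div>Ken's second suggestion (running the resource agent by hand) and the numeric exit codes in the trace above can be sketched roughly as follows. This is only an illustration, not part of ocf-tester or the resource-agents package: <i>ocf_rc_name</i> and <i>run_agent</i> are hypothetical helper names, the code-to-name table is copied from the ocf-returncodes lines in the trace, the parameters come from the ocf-tester command in this thread, and OCF_ROOT=/usr/lib/ocf is an assumption inferred from the trace.</div>

```shell
#!/bin/sh
# Sketch only: translate an OCF exit code into its symbolic name.
# The values are copied from the ocf-returncodes lines in the trace.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo "UNKNOWN($1)" ;;
    esac
}

# Hypothetical helper: run the agent by hand for one action (start, stop
# or monitor) with shell tracing enabled. The parameters are the ones from
# the ocf-tester command in this thread; OCF_ROOT is an assumption.
# Needs root and a host where the IPaddr2 agent is installed.
run_agent() {
    action="$1"
    rc=0
    OCF_ROOT=/usr/lib/ocf \
    OCF_RESOURCE_INSTANCE=sc_vip \
    OCF_RESKEY_ip=192.168.20.188 \
    OCF_RESKEY_cidr_netmask=24 \
    OCF_RESKEY_nic=eth0 \
    bash -x /usr/lib/ocf/resource.d/heartbeat/IPaddr2 "$action" || rc=$?
    echo "$action exited with $rc ($(ocf_rc_name "$rc"))"
    return "$rc"
}

# Nothing is executed automatically. On a test node, as root, one might run:
#   run_agent monitor
#   run_agent stop
```

<div>With this table, the <i>exit 5</i> at the end of the trace reads as OCF_ERR_INSTALLED. Unmanage the resource in pacemaker before calling <i>run_agent</i>, as Ken suggests, so the cluster does not react to the manual start and stop.</div>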