[Pacemaker] Occasional nonsensical resource agent errors, redux

Mon Nov 3 00:46:00 EST 2014

On Mon, 3 Nov 2014 13:32:45 +1100
Andrew Beekhof <andrew at beekhof.net> wrote:

> 
> > On 1 Nov 2014, at 11:03 pm, Patrick Kane <pmk at wawd.com> wrote:
> > 
> > Hi all:
> > 
> > In July, list member Ken Gaillot reported occasional nonsensical resource agent errors using Pacemaker (http://oss.clusterlabs.org/pipermail/pacemaker/2014-July/022231.html).
> > 
> > We're seeing similar issues with our install.  We have a two-node corosync/pacemaker failover configuration that uses the ocf:heartbeat:IPaddr2 resource agent extensively.  About once a week, we get an error like this, out of the blue:
> > 
> >   Nov  1 05:23:57 lb02 IPaddr2(anon_ip)[32312]: ERROR: Setup problem: couldn't find command: ip
> > 
> > It goes without saying that the ip command hasn't gone anywhere and all the paths are configured correctly.
> > 
> > We're currently running 1.1.10-14.el6_5.3-368c726 under CentOS 6 x86_64 inside a Xen container.
> > 
> > Any thoughts from folks on what might be happening or how we can get additional debug information to help figure out what's triggering this?
> 
> It's pretty much in the hands of the agent.

Actually, the message seems to be output by the check_binary() function,
which is part of the framework.

> You could perhaps find the call that looks for ip and wrap it in a set -x/set +x block.
> That way you'd know exactly why it thinks the binary is missing.
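
A minimal sketch of that set -x/set +x wrapping, for illustration only. The
find_binary and traced_find_binary functions are hypothetical stand-ins, not
the framework's real check_binary(), and the lookup via "command -v" is an
assumption about how such a check might work. The idea is to record the PATH
the agent actually sees and trace the lookup itself, so that a failure leaves
evidence in the log:

```shell
#!/bin/sh
# Hypothetical stand-in for the framework's lookup; NOT the real check_binary().
find_binary() {
    # Succeeds only if "$1" resolves to a command in the current PATH.
    command -v "$1" >/dev/null 2>&1
}

# Trace a single lookup: print the PATH in effect, then run the lookup with
# shell tracing enabled so each expanded command is written to stderr.
traced_find_binary() {
    echo "PATH=$PATH" >&2
    set -x
    find_binary "$1"
    rc=$?
    set +x
    return $rc
}

# Example: trace output goes to stderr, the result message to stdout.
if traced_find_binary sh; then
    echo "found sh"
else
    echo "sh not found"
fi
```

In a real agent the same two lines (set -x before the lookup, set +x after)
would go around the existing call, with stderr ending up wherever the cluster
logs agent output.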