[Pacemaker] unbound resource agent

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Mar 15 04:41:58 EDT 2012


Hi,

On Wed, Mar 14, 2012 at 10:35:01PM +0100, Arnold Krille wrote:
> On Wednesday 14 March 2012 17:52:21 Dejan Muhamedagic wrote:
> > On Wed, Mar 14, 2012 at 02:48:11PM +0100, Benjamin Kiessling wrote:
> > > Hi,
> > > 
> > > On 2012.03.14 14:24:10 +0100, Dejan Muhamedagic wrote:
> > > > > dnsCache_start_0 (node=router1, call=56, rc=-2, status=Timed Out):
> > > > > unknown exec error
> > > > > dnsCache_monitor_1000 (node=router2, call=24, rc=1,
> > > > > status=complete): unknown error
> > 
> > This one exited with a generic error. Didn't notice that. The RA
> > should've logged the reason.
> > 
> > > > > dnsCache_start_0 (node=router2, call=81, rc=-2, status=Timed Out):
> > > > > unknown exec error
> > > > 
> > > > These operations timed out, i.e. didn't finish in the given time
> > > > frame which is by default 20 seconds.
> > > 
> > > It says the return code is -2 which isn't a return code specified in the
> > > OCF standard. unbound usually starts fast and I can't see anything in
> > > the logs indicating an error during initialization.
> > 
> > Negative exit codes are special and cannot be produced by a
> > script.
> 
> Negative exit-codes are "special" in that they commonly denote an error while 
> positive exit-codes might be regular results of the app/script running.
> And there is no difference between a script and a "real" program when it comes 
> to returning exit-codes.
> 
> You might mean that either the RA-script or the cluster-software itself can't 
> return negative exit-codes...

I meant that an RA (should've said that instead of "script")
cannot return a negative exit code.
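
To illustrate why, here is a minimal sketch in Python (not taken from
any RA or from lrmd; the helper name child_exit_status is made up for
this example). It assumes a POSIX system, where the parent only sees
the low 8 bits of a child's exit status, so a process that asks to
exit with -2 is reported as 254:

```python
import subprocess
import sys


def child_exit_status(code: int) -> int:
    """Start a child Python process that calls sys.exit(code) and
    return the exit status as observed by the parent."""
    result = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({code})"]
    )
    return result.returncode


if __name__ == "__main__":
    # 0, 1 and 7 come back unchanged; -2 is truncated to 8 bits (POSIX).
    for requested in (0, 1, 7, -2):
        print(f"requested {requested:>3} -> observed {child_exit_status(requested)}")
```

So whatever a resource agent passes to exit, the caller receives a
value between 0 and 255.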

> > Hmm, I've always thought that "Timed Out" in that
> > message above is unequivocal.
> 
> "Timed out" is one of the errors. And when you have some positive exit-codes 
> for "the script went well but the state of the resource is <bla>", it's 
> perfectly okay to use negative exit-codes to signal things like "the RA script 
> didn't execute" or "the RA script took too long to execute"...

Eh?

A positive exit code always comes from the RA. A negative one
comes from the lrmd, which means that for whatever reason the RA
instance couldn't run or didn't finish.
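
A rough sketch of that split, in Python. This is not lrmd code: the
names run_op and origin are hypothetical, the value -2 for "Timed
Out" is simply copied from the log quoted earlier in this thread, and
the 20 second default is the timeout mentioned there. The table
lists the standard OCF return codes for reference.

```python
import subprocess
import sys

# Return codes an RA may exit with, per the OCF resource agent API.
OCF_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Assumption: mirrors "rc=-2, status=Timed Out" from the log above.
TIMED_OUT = -2


def run_op(argv, timeout_s=20.0):
    """Run one operation and return (rc, status).

    If the command finishes, rc is its own exit status. If it does not
    finish within timeout_s, the caller (standing in for lrmd here)
    kills it and reports a negative rc instead.
    Note: Python itself reports a signal death as a negative
    returncode; this sketch does not treat that case specially.
    """
    try:
        proc = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return TIMED_OUT, "Timed Out"
    return proc.returncode, "complete"


def origin(rc):
    """Say who produced the code: the RA itself or the executor."""
    return "lrmd" if rc < 0 else "RA"


if __name__ == "__main__":
    # A command that exits with 1 by itself: the code comes from the "RA".
    rc, status = run_op([sys.executable, "-c", "import sys; sys.exit(1)"])
    print(rc, status, origin(rc), OCF_CODES.get(rc, "unknown"))

    # A command that runs longer than the timeout: the code comes from
    # the caller.
    rc, status = run_op(
        [sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5
    )
    print(rc, status, origin(rc))
```

Read this way, rc=1 with status=complete in the log is something the
RA reported, while rc=-2 with status=Timed Out was produced on behalf
of an RA that never got to report anything.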

Thanks,

Dejan

> Arnold


