[Pacemaker] PE ignores monitor failure of stonith:external/rackpdu

Dejan Muhamedagic dejanmm at fastmail.fm
Tue Nov 2 10:19:47 EDT 2010


On Tue, Nov 02, 2010 at 01:28:09PM +0100, Pavlos Parissis wrote:
> On 2 November 2010 13:18, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> 
> > Hi,
> >
> > On Tue, Nov 02, 2010 at 01:09:02PM +0100, Pavlos Parissis wrote:
> > > On 2 November 2010 13:02, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > > [...snip...]
> > >
> > > >
> > > > > > > Definitely not. If you do the monitor action from the command
> > > > > > > line, does that also return the unexpected exit code?
> > > > > >
> > > > >
> > > > > > From the code I pasted, you can see it returned 1.
> > > >
> > > > There is a difference. stonith-ng (stonithd) is a daemon that
> > > > runs a perl script (fencing_legacy) which invokes stonith which
> > > > then invokes the plugin. A problem can occur in any of these
> > > > components. It's important to find out where.
> > > >
> > > > > > # stonith -t external/rackpdu community="empisteftiko"
> > > > > > names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" ... -lS
> > > > > >
> > > > > > Which pacemaker release do you run? I couldn't reproduce this
> > > > > > with a recent Pacemaker.
> > > > > >
> > > > >
> > > > > > That was on 1.1.3; I now run 1.0.9.
> > > > > Do you want me to run the test on 1.0.9?
> > > >
> > > > Yes, please. 1.0.9 is still running the old, and well tested,
> > > > stonithd, so the result could be different.
> > > >
> > > >
> > > I have the PDU switched off because it stopped working, so the
> > > resource is stopped.
> > > But I ran the test anyway, and I see that even though rackpdu returns 1
> > > on status, stonithd reports 256.
> >
> > Ah, I understand what's going on now. It's a bug in the interface
> > to external plugins which was exposed by stonith-ng. It was
> > fixed in August. The fix is here (in hg.linux-ha.org/glue):
> >
> > changeset:   2427:b7df127fc09e
> > user:        Dejan Muhamedagic <dejan at hello-penguin.com>
> > date:        Thu Aug 12 14:01:10 2010 +0200
> > summary:     High: stonith: external: interpret properly exit codes from
> > external stonith plugins (bnc#630357)
> >
> > There hasn't been a glue release since then, but there should be
> > one fairly soon. Note that this affects only Pacemaker 1.1.
> >
> > Thanks,
> >
> > Dejan
> >
> >
> >
> >
> Does this bug have anything to do with the PE ignoring the monitor failure?

The PE isn't ignoring the failure; it never sees it. The exit
code 256 is actually encoded as 0, so as far as the crmd and PE
are concerned, everything is OK.
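
To illustrate how 256 can end up as 0, here is a minimal Python sketch
(POSIX only). It assumes that the 256 reported by stonithd is an
undecoded wait status for plugin exit code 1, and that the value is
later squeezed into an 8-bit exit code. The thread suggests both but
does not spell them out, and this is not the actual glue code.

```python
import os

# Value stonithd reported in this thread. ASSUMPTION: it is a raw POSIX
# wait status that was never decoded, not a real exit code.
raw_status = 256

# Proper decoding: on POSIX the exit code sits in the high byte of the
# wait status, so 256 corresponds to exit code 1.
decoded = os.WEXITSTATUS(raw_status)

# An exit code is only 8 bits wide. ASSUMPTION: the raw status is passed
# on as if it were an exit code, so only the low byte survives.
truncated = raw_status & 0xFF

print("decoded exit code:", decoded)
print("truncated exit code:", truncated)
```

Under these assumptions the decoded value is the plugin's real exit
code, while the truncated one is what the upper layers would see.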

Thanks,

Dejan

> Pavlos

> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
