[Pacemaker] PE ignores monitor failure of stonith:external/rackpdu

Pavlos Parissis pavlos.parissis at gmail.com
Tue Nov 2 08:28:09 EDT 2010


On 2 November 2010 13:18, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:

> Hi,
>
> On Tue, Nov 02, 2010 at 01:09:02PM +0100, Pavlos Parissis wrote:
> > On 2 November 2010 13:02, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > [...snip...]
> >
> > >
> > > > > Definitely not. If you do the monitor action from the command
> > > > > line does that also return the unexpected exit code:
> > > > >
> > > >
> > > > From the code I pasted, you can see that it returned 1.
> > >
> > > There is a difference. stonith-ng (stonithd) is a daemon that
> > > runs a perl script (fencing_legacy) which invokes stonith which
> > > then invokes the plugin. A problem can occur in any of these
> > > components. It's important to find out where.
> > >
> > > > > # stonith -t external/rackpdu community="empisteftiko"
> > > > > names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" ... -lS
> > > > >
> > > > > Which pacemaker release do you run? I couldn't reproduce this
> > > > > with a recent Pacemaker.
> > > > >
> > > >
> > > > That was on 1.1.3; I now run 1.0.9.
> > > > Do you want me to run the test on 1.0.9?
> > >
> > > Yes, please. 1.0.9 is still running the old, and well tested,
> > > stonithd, so the result could be different.
> > >
> > >
> > I have the PDU switched off because it stopped working. As a result,
> > the resource is stopped.
> > But I ran the test, and I see that even though rackpdu returns 1 on
> > status, stonithd reports 256.
>
> Ah, I understand what's going on now. It's a bug in the interface
> to external plugins which was exposed by stonith-ng. It has been
> fixed in August. The fix is here (in hg.linux-ha.org/glue):
>
> changeset:   2427:b7df127fc09e
> user:        Dejan Muhamedagic <dejan at hello-penguin.com>
> date:        Thu Aug 12 14:01:10 2010 +0200
> summary:     High: stonith: external: interpret properly exit codes from
> external stonith plugins (bnc#630357)
>
> There hasn't been a glue release since then, but there should be
> one fairly soon. Note that this affects only Pacemaker 1.1.
>
> Thanks,
>
> Dejan
>
>
>
>
Does this bug have anything to do with the PE ignoring the monitor failure?
Pavlos
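
For reference, the 1 versus 256 mismatch quoted above is consistent with a
raw wait() status being reported instead of the decoded exit code, since on
POSIX systems a normal exit code is stored in the high byte of the status
(1 << 8 == 256). Below is a minimal Python sketch of that decoding. It
assumes this is the mechanism involved (the changeset summary only says that
exit codes from external plugins were not interpreted properly), and the
function names are hypothetical; they are not part of stonith or glue.

```python
import os


def decode_wait_status(raw_status):
    """Turn a raw POSIX wait() status into the exit code the child returned.

    Hypothetical helper for illustration only.
    """
    if os.WIFEXITED(raw_status):
        # Normal termination: the exit code sits in the high byte.
        return os.WEXITSTATUS(raw_status)
    if os.WIFSIGNALED(raw_status):
        # Killed by a signal: report it as a negative number.
        return -os.WTERMSIG(raw_status)
    raise ValueError("unexpected wait status: %d" % raw_status)


def run_shell(command):
    """Run a shell command and return (raw_status, decoded_exit_code).

    On POSIX, os.system() returns the raw wait() status, not the exit code.
    """
    raw_status = os.system(command)
    return raw_status, decode_wait_status(raw_status)


if __name__ == "__main__":
    # "exit 1" stands in for a plugin whose status check fails with code 1.
    raw, code = run_shell("exit 1")
    print("raw status:", raw, "decoded exit code:", code)
```

A caller that compares the raw status against expected exit codes without
decoding it first would see 256 where the plugin actually returned 1.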