[ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

Fri May 20 18:04:43 CEST 2016

Klaus Wenninger <kwenning at redhat.com> wrote:
> On 05/20/2016 08:39 AM, Ulrich Windl wrote:
> >>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on 19.05.2016 at 21:29 in
> > message <20160519212947.6cc0fd7b at firost>:
> > [...]
> >> I was thinking of a use case where a graceful demote or stop action failed
> >> multiple times, to give the RA a chance to choose another method to stop
> >> the resource before it requires a migration. For instance, PostgreSQL has 3
> >> different kinds of stop, the last one being not graceful, but still better
> >> than a kill -9.
> >
> > For example, the Xen RA tries a clean shutdown with a timeout of
> > about 2/3 of the stop timeout; if it fails, it shuts the VM down the
> > hard way.
> >
> > I don't know Postgres in detail, but I could imagine a three-step approach:
> > 1) Shut down after current operations have finished
> > 2) Shut down regardless of pending operations (doing rollbacks)
> > 3) Shut down the hard way, requiring recovery on the next start (I think
> >    in Oracle this is called a "shutdown abort")
> >
> > Depending on the scenario one may start at step 2)
> >
> > [...]
> > I think RAs should not rely on "stop" being called multiple times for a resource to be stopped.
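
To make the single-call escalation described above concrete, here is a
minimal sketch in Python. It is not taken from the Xen or PostgreSQL RAs.
The function name, the (name, share, stop_fn) step format and the return
code constants are assumptions made for illustration. The step names in
the comment follow the modes of PostgreSQL's "pg_ctl stop -m", which
roughly match the three steps listed above.

```python
import time

# Return codes following the OCF convention (0 = success, 1 = generic error).
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1


def escalating_stop(steps, total_timeout, clock=time.monotonic):
    """Try increasingly harsh stop methods within ONE stop operation.

    steps: list of (name, share, stop_fn), ordered gentlest first, e.g.
           "smart", "fast", "immediate". `share` is the fraction of
           total_timeout a step may use; stop_fn(budget) is a hypothetical
           callable returning True once the resource is stopped.
    Returns (return_code, names_of_steps_tried).
    """
    deadline = clock() + total_timeout
    tried = []
    for index, (name, share, stop_fn) in enumerate(steps):
        remaining = deadline - clock()
        if remaining <= 0:
            break  # out of time; report failure to the cluster manager
        is_last = index == len(steps) - 1
        # The last (harshest) step gets whatever time is left.
        budget = remaining if is_last else min(remaining, total_timeout * share)
        tried.append(name)
        if stop_fn(budget):
            return OCF_SUCCESS, tried
    return OCF_ERR_GENERIC, tried
```

With two steps and a share of 2/3 for the first one, this behaves like the
Xen description above: a clean shutdown for about 2/3 of the timeout, then
the hard way for the rest. The RA never depends on "stop" being called a
second time.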

Well, this would be a major architectural change.  Currently if
stop fails once, the node gets fenced - period.  So if we changed
this, there would presumably be quite a bit of scope for making the
new design address whatever concerns you have about relying on "stop"
*sometimes* needing to be called multiple times.  For the sake of
backwards compatibility with existing RAs, I think we'd have to ensure
the current semantics still work.  But maybe there could be a new
option where RAs are allowed to return OCF_RETRY_STOP to indicate that
they want to escalate, or something.  However it's not clear how that
would be distinguished from an old RA returning the same value as
whatever we chose for OCF_RETRY_STOP.
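
One way to read the "new option" idea is as an explicit per-resource
opt-in, which would also sidestep the ambiguity with old RAs. The sketch
below shows only that decision logic. Everything in it is hypothetical:
OCF_RETRY_STOP does not exist today, the value 99 is an arbitrary
placeholder, and allow_stop_retry / max_stop_attempts are invented names,
not Pacemaker options.

```python
from dataclasses import dataclass

OCF_SUCCESS = 0
# Hypothetical return code; the value is an arbitrary placeholder.
OCF_RETRY_STOP = 99


@dataclass
class StopPolicy:
    # Hypothetical per-resource settings; off by default so that existing
    # RAs keep today's semantics (a failed stop leads to fencing).
    allow_stop_retry: bool = False
    max_stop_attempts: int = 3


def next_action(rc, attempt, policy):
    """Decide what the cluster manager does after stop attempt `attempt` (1-based)."""
    if rc == OCF_SUCCESS:
        return "stopped"
    if (
        policy.allow_stop_retry
        and rc == OCF_RETRY_STOP
        and attempt < policy.max_stop_attempts
    ):
        return "retry-stop"
    # Any other failure, an RA that never opted in, or an exhausted retry
    # budget falls back to the current behaviour.
    return "fence"
```

An old RA that happens to return the same number is still fenced, because
its resource never set the opt-in.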

> I see a couple of positive points in having something inside pacemaker
> that helps the RAs escalate their stop strategy:
> 
> - this way you have the same logging for all RAs; done within the RA it
>   would look different with each of them
> - timeout-retry stuff is prone to not being implemented properly; like
>   this you have a proven implementation within pacemaker
> - keeps logic within the RA simpler and guides implementation in a certain
>   direction that makes RAs look more similar to each other, making it
>   easier to understand an RA you haven't seen before

Yes, all good points which I agree with.

> Of course there are basically two approaches to achieve this:
> 
> - give some global or per-resource view of pacemaker to the RA and leave
>   it to the RA to act in a responsible manner (like telling the RA that
>   there are x stop-retries to come)
> - handle the escalation within pacemaker and tell the RA directly what
>   you expect it to do, like requesting a graceful / hard / emergency
>   (or however you would call it) stop

I'd probably prefer the former, to avoid hardcoding any assumptions
about the different levels of escalation the RA might want to take.
That would almost certainly vary per RA.
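
As a rough illustration of the former approach, the sketch below assumes
the cluster manager passes the RA a single number: how many stop retries
are still to come. How that number would reach the RA is left open, and
the function and parameter names are made up. The RA alone decides how its
own escalation levels map onto it.

```python
def choose_stop_mode(modes, retries_left):
    """Pick a stop mode from the RA's own escalation ladder.

    modes: the RA's stop methods, ordered gentlest to harshest; the number
           and meaning of the levels are entirely up to the RA.
    retries_left: stop attempts still to come after this one (hypothetical
                  information supplied by the cluster manager).
    """
    if not modes:
        raise ValueError("an RA needs at least one stop mode")
    if retries_left < 0:
        raise ValueError("retries_left must not be negative")
    # With N retries still to come we can afford to stay N levels below
    # the harshest mode; on the final attempt we use the harshest one.
    index = max(0, len(modes) - 1 - retries_left)
    return modes[index]
```

A PostgreSQL-like RA could pass ["smart", "fast", "immediate"], while a
Xen-like RA could pass ["clean", "hard"]. Neither needs the cluster
manager to know what those levels mean.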

However, we're slightly off-topic for this thread at this point ;-)