[ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
Klaus Wenninger
kwenning at redhat.com
Fri May 20 08:58:04 UTC 2016
On 05/20/2016 08:39 AM, Ulrich Windl wrote:
>>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on 19.05.2016 at 21:29 in
> message <20160519212947.6cc0fd7b at firost>:
> [...]
>> I was thinking of a use case where a graceful demote or stop action failed
>> multiple times, to give the RA a chance to choose another method to stop
>> the resource before it requires a migration. For instance, PostgreSQL has
>> three different kinds of stop, the last one being not graceful, but still
>> better than a kill -9.
> For example, the Xen RA tries a clean shutdown, waiting about 2/3 of the stop timeout; if that fails, it shuts the VM down the hard way.
>
> I don't know Postgres in detail, but I could imagine a three-step approach:
> 1) Shut down after current operations have finished
> 2) Shut down regardless of pending operations (doing rollbacks)
> 3) Shut down the hard way, requiring recovery on the next start (I think in Oracle this is called a "shutdown abort")
>
> Depending on the scenario, one may start at step 2).
>
> [...]
> I think RAs should not rely on "stop" being called multiple times for a resource to be stopped.
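To make the quoted three-step idea concrete: a rough sketch of an RA-side stop that escalates inside a single stop call, splitting the stop timeout between the steps instead of waiting for pacemaker to call stop again. This is only an illustration under assumptions; `escalating_stop`, the step labels, and the budget shares are made up here and do not come from any existing RA. The triggers would be whatever the resource offers (for PostgreSQL presumably the smart / fast / immediate shutdown modes of `pg_ctl stop -m`).

```python
import time
from typing import Callable, Optional, Sequence, Tuple

# (label, trigger, share of the total stop timeout) -- all hypothetical names
Step = Tuple[str, Callable[[], None], float]


def escalating_stop(
    steps: Sequence[Step],
    is_stopped: Callable[[], bool],
    total_timeout: float,
    poll_interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try stop methods from gentlest to harshest within ONE stop action.

    Returns the label of the step after which the resource was seen
    stopped, or None if every step used up its share of the timeout.
    """
    for label, trigger, share in steps:
        deadline = clock() + total_timeout * share
        trigger()  # ask the resource to stop using this method
        while True:
            if is_stopped():
                return label
            if clock() >= deadline:
                break  # this method's budget is spent, escalate
            sleep(poll_interval)
    return None
```

An RA would pass something like `[("smart", stop_smart, 0.5), ("fast", stop_fast, 0.3), ("immediate", stop_immediate, 0.2)]` and report a failed stop when the function returns None. The shares should add up to less than 1 so that the RA still answers before pacemaker's own stop timeout hits.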
I see a couple of positive points in having something inside pacemaker
that helps the RAs escalate their stop strategy:
- this way you have the same logging for all RAs; done within the RA, it
  would look different with each of them
- timeout-retry stuff is prone to not being implemented properly; like
  this you have a proven implementation within pacemaker
- it keeps the logic within the RA simpler and guides implementation in a
  certain direction that makes RAs look more similar to each other, making
  it easier to understand an RA you haven't seen before
Of course there are basically two approaches to achieve this:
- give some global or per-resource view of pacemaker to the RA and leave
  it to the RA to act in a responsible manner (like telling the RA that
  there are x stop-retries to come)
- handle the escalation within pacemaker and already tell the RA what
  you expect it to do, like requesting a graceful / hard / emergency (or
  however you would call it) stop
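Just to illustrate the difference between the two approaches from the RA's point of view, a minimal sketch. Nothing here is an existing pacemaker interface: `StopMode`, `mode_from_retries_left` and `mode_from_request` are invented names, and how pacemaker would actually hand the retry count or the requested mode to the RA is left open.

```python
from enum import Enum


class StopMode(Enum):
    # Hypothetical mode names, following the wording above.
    GRACEFUL = "graceful"
    HARD = "hard"
    EMERGENCY = "emergency"


def mode_from_retries_left(retries_left: int) -> StopMode:
    """Approach 1: pacemaker only reports how many stop retries are
    still to come; the RA decides for itself how hard to stop."""
    if retries_left >= 2:
        return StopMode.GRACEFUL
    if retries_left == 1:
        return StopMode.HARD
    return StopMode.EMERGENCY  # last attempt, nothing left to lose


def mode_from_request(requested: str) -> StopMode:
    """Approach 2: pacemaker owns the escalation and names the mode;
    the RA only validates the request and obeys."""
    return StopMode(requested)  # raises ValueError on an unknown mode
```

With the first variant, every RA carries its own policy (the thresholds above are arbitrary). With the second, the policy lives in one place and the RA just maps the mode to a command.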
>
> Regards,
> Ulrich