[ClusterLabs] Gracefully delaying (cloned) resource startup
Andrew Beekhof
andrew at beekhof.net
Wed Feb 25 21:41:09 UTC 2015
> On 25 Feb 2015, at 7:34 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>
> 25.02.2015 10:20, Andrei Borzenkov wrote:
>> Consider a replicated resource that is represented as master/slave. When
>> the local RA starts and finds the local resource in "primary" state, it
>> cannot automatically assume the resource should be master - it is possible
>> to have both ends in "primary" state after failover (e.g. after node
>> failure). Consider this scenario:
>>
>> - node A runs primary (master)
>> - node A fails over to node B
>> - both nodes have to be switched off (power outage, maintenance work, ...)
>> - after switching on, only node A comes up for whatever reason
>>
>> At this point the local resource on node A is still in "primary" state,
>> but with stale content. So we need to wait until node B is actually
>> available to check the state of the resource on node B before we can take
>> any action. One possible action is to freeze until manual administrator
>> intervention ...
>>
>> I could not find how to implement this in pacemaker. What we can do is
>>
>> 1) pretend the resource is started (by going to "slave") and actually
>> initiate resource startup in the monitor script later.
>>
>> 2) fail the startup request
>>
>> The former means reduced visibility - from the user's point of view the
>> resource is started while it actually is not. The latter means that at
>> some point we exceed the failure threshold and manual administrator
>> intervention will be needed.
>>
>> What I'd actually like is the ability to say "delay startup until all
>> nodes are available" with some option to manually "force master" if
>> necessary. Maybe I am missing something obvious here, but I could not
>> find how it can be done.
>>
>> Thank you for any hints!
>
> Maybe the wait_for_all option in the corosync2 votequorum plugin and no-quorum-policy=freeze would help?
Yep, wait_for_all would work in this scenario.
It prevents the cluster from having quorum (and therefore from starting resources) until all nodes have been seen.
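
For reference, a minimal sketch of what enabling this might look like in the quorum section of corosync.conf. The two-node values (expected_votes, two_node) are assumptions based on the A/B scenario above, not taken from a real configuration, so adjust them to the actual cluster:

```
quorum {
    provider: corosync_votequorum
    # assumption: a two-node cluster (nodes A and B) as in the scenario above
    expected_votes: 2
    two_node: 1
    # after a full cluster start, do not become quorate until all nodes
    # have been seen at least once at the same time
    wait_for_all: 1
}
```

If I remember votequorum(5) correctly, two_node: 1 already turns wait_for_all on by default, so the explicit line mainly documents the intent. Note that this only applies to the first time the cluster becomes quorate after startup. Once all nodes have been seen, a node that later drops out is handled by the normal quorum rules.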
>
>>
>> -andrei
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>