[ClusterLabs Developers] problem with master score limited to 1000000

Andrew Beekhof andrew at beekhof.net
Wed May 20 19:42:19 EDT 2015

> On 20 May 2015, at 6:14 pm, Lars Ellenberg <Lars.Ellenberg at linbit.com> wrote:
> On Wed, May 20, 2015 at 09:27:41AM +1000, Andrew Beekhof wrote:
>>>> Well, we were looking for a solution with the current implementation of
>>>> Pacemaker anyway :)
>>>> If it's not possible to gently tell the CRM that it should call pre-promote
>>>> again, then roughly breaking the transition is fine enough for us.
>>> We tried to complete the whole election process in only one call of
>>> pre-promote. During the call of pre-promote, the node-to-be-promoted is in
>>> charge of connecting to all other postgresql instances to check if there is a
>>> better candidate. If it finds a better one, it changes the scores by calling
>>> crm_master.
>>> It kinda worked, but not as fast as we hoped. This PoC showed that the
>>> transition was broken AFTER the first promotion, not after the pre-promote
>>> actions were all collected.
>> Sounds about right.
>> That’s why I wasn’t suggesting this as a foolproof approach - because you don’t get precise control over where processing stops.
>>> Thus, slave1 being the lagging slave and slave2 the
>>> best candidate, we had:
>>> * slave1 promoted
>>> * slave1 demoted
>>> * slave2 promoted
>>> This is actually a really bad scenario for us. We might still have the log
>>> files and transition files.
>>> Is it because the crm_master was called from the designated
>>> node-to-be-promoted ?
>> Nope. It’s because there is scope for lots of things to happen between the update being sent and noticed.
>> It’s also possible that the transition is hard-wired to run action X if the action X_pre_notifys were invoked.
>>> Is it possible to make sure the transition breakage happens as soon as the
>>> score changes?
>> The only way to guarantee it is to allow notifications to fail.
> Why not fail the promotion instead?  I mean, do best effort to "help"
> pacemaker decide what you would like it to, but double check during
> promotion, and fail the promotion if you don't like it.
> Promotion failure is not a hard failure.
> But it will trigger a transition abort and pengine run.

You’ll get some additional recovery on the “failed” master though.
At least a demote I’d imagine, possibly a restart.
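
For illustration, the double check Lars describes could look roughly like
the sketch below. It is not taken from any real resource agent:
promote_decision and the lag map are hypothetical names, and how the lag
values get collected from the other instances is left out entirely. Only
the two exit codes follow the OCF convention.

```python
# Sketch only: the exit codes follow the OCF resource agent convention;
# the function name, node names and lag values are hypothetical.
OCF_SUCCESS = 0      # action succeeded
OCF_ERR_GENERIC = 1  # generic error


def promote_decision(local_node, lag_by_node):
    """Return the exit code a promote action could use.

    lag_by_node maps node name -> replication lag (lower is better).
    The promotion is refused when another node lags less than we do.
    """
    best_lag = min(lag_by_node.values())
    if lag_by_node[local_node] <= best_lag:
        return OCF_SUCCESS
    return OCF_ERR_GENERIC


if __name__ == "__main__":
    lags = {"slave1": 120, "slave2": 5}
    for node in lags:
        print(node, promote_decision(node, lags))
```

With the slave1/slave2 case from earlier in the thread, the lagging slave1
would refuse its own promotion and slave2 would accept, at the price of the
recovery on slave1 mentioned above.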

> Ok, you don't get sub-second takeover to the "least lagging async replication slave".
> But if you need that, you need synchronous replication anyways.
> And if you have synchronous replication, you can tell pacemaker,
> and it will try to promote the synchronous instance first.
> -- 
> : Lars Ellenberg
> : http://www.LINBIT.com | Your Way to High Availability
> : DRBD, Linux-HA  and  Pacemaker support and consulting
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
