[ClusterLabs Developers] problem with master score limited to 1000000

Tue Apr 28 05:23:19 EDT 2015

On Tue, 28 Apr 2015 13:37:05 +1000
Andrew Beekhof <andrew at beekhof.net> wrote:

> > On 27 Apr 2015, at 11:10 pm, Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
> > wrote:
> > 
> >>> A solution we were discussing with my colleague was to be able to break
> >>> the current transition during the pre-promote and make sure a new
> >>> transition is computed where pre-promote is called again.
> 
> Realistically, this is not going to happen in the next few years.
> 
> Regardless of the idea’s merits, it’s a major change to one of our core
> assumptions. Beyond the initial implementation, the fallout will last for
> months and I just don’t have that kind of bandwidth.

Well, we were looking for a solution with the current implementation of
Pacemaker anyway :)

If it's not possible to gently tell the CRM that it should call pre-promote
again, then forcibly aborting the transition is good enough for us.

If I understand "crmd/te_callbacks.c:te_legacy_update_diff()" correctly,
there are two situations where a transition is restarted: "Transient attribute:
update" and "Transient attribute: removal".

If one of these conditions is true, the following function is called:

   abort_transition(INFINITY, tg_restart, <cause here>, attr);

So, what exactly is a transient attribute? How could we create or set such an
attribute? Is it possible?

> The idea is that by doing it in the monitor[1] op, you ensure you’re always
> in a position to do a promotion.
> By all means query attrd from the promote and/or pre-promote operations to
> ensure that the chosen node is still the correct one though.

We are unsure about the difference between querying/setting an attribute using
crm_attribute and querying/setting an attribute with attrd.

What is the difference? How can we make sure all the nodes have updated their
attribute before taking a decision? How do we set/query an attribute in attrd?
With attrd_updater?

> Give the pre-promote a decent timeout and it can also act as your “waiting
> for writes to come in and all LSNs to be updated” buffer.
> 
> 
> [1] Strictly speaking, it could be any action name you dream up and tell the
> cluster to call on a recurring basis. Given that monitor is already defined
> and being called repeatedly, most people take the path of least resistance
> and use that (one less thing for an admin to mess up).

Our main goal was to keep the promotion negotiation going, without interruption,
for as long as the slaves do not agree with each other about who the new master
is, and without waiting for another round of monitor.
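
To make that goal a bit more concrete, here is a rough sketch of the kind of
check we have in mind for each negotiation round. It is plain Python, not
Pacemaker or resource agent code, and everything in it is an assumption made
for illustration: the node names, the "lsns" mapping (node name to last
reported LSN, or None when the node has not reported yet) and the
pick_master() helper are all hypothetical.

```python
from typing import Dict, Optional


def pick_master(lsns: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the node to promote, or None if negotiation must continue.

    `lsns` is a hypothetical data model: it maps each node name to the
    last LSN that node reported, or None if it has not reported yet.
    """
    # Keep negotiating until every node has published its LSN.
    if not lsns or any(lsn is None for lsn in lsns.values()):
        return None
    # Highest LSN wins. Iterating over sorted names means a tie goes to
    # the alphabetically first node, so every node computes the same answer.
    return max(sorted(lsns), key=lambda node: lsns[node])


if __name__ == "__main__":
    print(pick_master({"node1": 100, "node2": None}))  # None: node2 missing
    print(pick_master({"node1": 100, "node2": 250}))   # node2
```

As long as this returns None, we would like pre-promote to be called again
instead of letting the promotion proceed.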

-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com