[ClusterLabs Developers] problem with master score limited to 1000000

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Mon Apr 27 08:56:58 UTC 2015


Hi Andrew,

On Mon, 27 Apr 2015 07:06:36 +1000
Andrew Beekhof <andrew at beekhof.net> wrote:

> > On 25 Apr 2015, at 1:33 am, Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
> > wrote:
> > 
> > We are writing a new resource agent for PostgreSQL (I am open to discuss why
> > offlist to keep the thread clean) and are experiencing some limitation
> > regarding to the master scoring in Pacemaker.
> > 
> > The only way in PostgreSQL to define which node should be promoted is to
> > compare their location in their transaction log (called LSN). This LSN is
> > expressed as a size that is obviously growing quickly.
> 
> We can look at bumping infinity, but what value would be acceptable?

I suppose most platforms support a value of 2^31-1 (~2 billion) as a simple
4-byte signed integer. But I can see two issues with this:

  * could it break compatibility with other RAs expecting "inf" to be
    1,000,000?
  * it just moves the limit further out, but it doesn't solve the real problem.


In our situation, 2GB would probably be enough in most cases, but consider
this scenario:

  * monitor interval is 10 sec
  * a table of 10GB is created on the master and streamed asynchronously to the
    slaves
  * the master crashes

If at least 2GB has been streamed to the slaves, they will all have the same
"inf" value.
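
To make the scenario concrete, here is a minimal Python sketch of the clamping
effect. It is only a model and not Pacemaker code: clamp_score, the node names
and the byte counts are made up for illustration, and I assume the score is
simply the amount of transaction log received, in bytes.

```python
# Illustrative model of score clamping; this is NOT Pacemaker's implementation.
INFINITY = 1_000_000      # current cap discussed in this thread
INT32_MAX = 2**31 - 1     # larger cap suggested above (~2 billion)
GB = 1024**3

def clamp_score(raw, cap):
    """Limit a raw score to the range [-cap, cap]."""
    return max(-cap, min(raw, cap))

# Hypothetical amount of transaction log each slave had received
# when the master crashed (assumed values).
received = {"slave1": 3 * GB, "slave2": 7 * GB}

def scores(cap):
    """Clamped score of every slave for a given cap."""
    return {node: clamp_score(lsn, cap) for node, lsn in received.items()}

for cap in (INFINITY, INT32_MAX):
    result = scores(cap)
    verdict = "tie" if len(set(result.values())) == 1 else "distinct"
    print(cap, result, verdict)
```

With these assumed numbers both caps end in a tie, even though slave2 is 4GB
ahead of slave1.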

> Would using "seconds since X" be an option instead?

I don't understand what you mean. Does it apply to my problem or to the "inf"
consideration? Could you elaborate?


A solution my colleague and I were discussing would be to break the
current transition during the pre-promote and make sure a new transition is
computed where pre-promote is called again. This would allow an RA needing a
complex election to get as many pre-promote calls as it needs to make a
decision, without waiting for a "monitor" action to move the election
process forward.

I noticed that a transient attribute update already breaks a transition, like
crm_master does if I understand it correctly. But I'm not sure how to create
a custom transient attribute that would reliably break the pre-promote and
re-trigger it. Could we create a "promote-step" attribute which would be
incremented as long as the slaves are not happy with their election,
re-triggering the pre-promote each time?

-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com



