[ClusterLabs] PostgreSQL Automatic Failover (PAF) v2.2.0

Thu Oct 5 21:36:34 UTC 2017

On Thu, 5 Oct 2017 21:24:36 +0200
Valentin Vidic <Valentin.Vidic at CARNet.hr> wrote:

> On Thu, Oct 05, 2017 at 08:55:59PM +0200, Jehan-Guillaume de Rorthais wrote:
> > It doesn't seem impossible, but I'm not sure about the complexity around
> > this.
> > 
> > You would have to either hack PAF to detect a failover/migration, or create
> > a new RA that would always take part in any transition involving your PAF
> > RA, so it can determine whether the master is moving elsewhere or not.
> > 
> > It feels like the complexity is quite high and would require some expert
> > advice about Pacemaker internals to avoid wrong or unrelated behaviors or
> > race conditions.
> > 
> > But before going further, you need to realize a failover will never be
> > transparent, especially one that would trigger randomly, outside of your
> > control.
> 
> Yes, I was thinking more about manual failover, for example to upgrade
> the postgresql master.  RA for pgbouncer would wait for all active
> queries to finish and queue all new queries.  Once there is nothing
> running on the master anymore, another slave is activated and pgbouncer
> would then resume queries there.

OK. Then for a manual and controlled switchover, I suppose the best option is
to keep things simple and add two more steps to your blueprint:

* one to pause the client connections before the "pcs resource move --master
  --wait <rsc_id> [node]"
* one to resume them as soon as the "pcs resource move" has finished.

Obviously, this could be scripted to make the checks and actions faster.
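
As a rough illustration, the three steps could be wrapped in a small shell
function like the one below. This is only a sketch under several assumptions
that are not from this thread: the pooler is pgbouncer, its admin console is
reachable with psql at the PGB_HOST/PGB_PORT/PGB_USER values shown (made-up
defaults), and pgbouncer reaches the master through an address that follows
it after the move. The function names pgb_admin and switchover are
hypothetical.

```shell
#!/bin/sh
# Sketch of a controlled switchover: pause clients, move the master, resume.
# Connection settings are assumptions; adjust them to your setup.
PGB_HOST="${PGB_HOST:-127.0.0.1}"
PGB_PORT="${PGB_PORT:-6432}"
PGB_USER="${PGB_USER:-pgbouncer}"

pgb_admin() {
    # Send one command to the pgbouncer admin console
    # (the virtual database named "pgbouncer").
    psql -X -h "$PGB_HOST" -p "$PGB_PORT" -U "$PGB_USER" -d pgbouncer -c "$1"
}

switchover() {
    rsc_id="$1"   # id of the master/slave resource managed by PAF
    node="$2"     # optional destination node

    # Step 1: PAUSE returns once running queries are done; new ones queue up.
    pgb_admin "PAUSE;" || return 1

    # Step 2: move the master role; --wait blocks until the cluster settles.
    if [ -n "$node" ]; then
        pcs resource move --master --wait "$rsc_id" "$node"
    else
        pcs resource move --master --wait "$rsc_id"
    fi
    rc=$?

    # Step 3: always resume, so clients are not left hanging if the move failed.
    pgb_admin "RESUME;" || return 1
    return "$rc"
}
```

It would be called as, for example, "switchover <rsc_id> [node]". Resuming
even when the move fails is a design choice of this sketch: clients go back to
whichever node is master at that point instead of staying queued.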