[ClusterLabs] PAF fails to promote slave: Can not get current node LSN location

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Tue Jul 9 10:21:11 EDT 2019


On Tue, 9 Jul 2019 13:22:06 +0200
Tiemen Ruiten <t.ruiten at tech-lab.io> wrote:

> On Mon, Jul 8, 2019 at 10:01 PM Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
...
> > I dig in xlog.c today. Maybe I can write a small extension to get the
> > timeline from shared memory directly and make pgsqlms use it if it detects
> > it. So people can decide if they feel like it is too invasive or really
> > needed for their usecase. Maybe in next release. What do you think? Would it
> > be useful to you?
> >  
> 
> Yes, that would be a really useful addition IMO. I would definitely use it.
> If we can avoid taking a checkpoint that will save precious minutes during
> a failover and the risk of timeouts would be drastically reduced. Would be
> happy to test it if you want!

OK, thanks. Not sure when I'll have time to work on this. But I'll stay in
touch with you then.

I have to work on the v12 support as well :/

> > > I managed to improve the average time checkpoints are taking already from
> > > what I mentioned in that thread, mainly by decreasing checkpoint_timeout
> > > and setting full_page_writes = off; ostensibly not necessary on ZFS.  
> >
> > Setting "full_page_writes" to off helps lower the amount of WAL produced,
> > not the amount of writes to sync during the checkpoint. But I am sure it
> > helps your performance :)
> 
> If I'm saturating the IO capacity of my system during a forced checkpoint
> and full_page_writes = off reduces IO by reducing the amount of WAL, then
> it should help in an indirect way?

The master is supposed to be gone during a failover, serving neither reads nor
writes. The checkpoint occurs on each standby to force a sync of its
controldata. The checkpoint itself does not write to WALs or read them. Am I
forgetting something obvious?

Maybe you can have some writes if the standby needs to sync the last received
WALs, and some reads if the standby was lagging on replay... But it shouldn't
be much...
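
To make the controldata point concrete: the timeline and checkpoint location
are read from the control file, which is why each standby is asked to
checkpoint first. Below is a minimal Python sketch that parses the text output
of pg_controldata. It is only an illustration, not how pgsqlms does it; the
sample output is trimmed and its values are made up, and the function names
are hypothetical.

```python
# Hypothetical, trimmed sample of `pg_controldata` output (values are made up).
SAMPLE = """\
Database cluster state:               in archive recovery
Latest checkpoint location:           0/3000060
Latest checkpoint's REDO location:    0/3000028
Latest checkpoint's TimeLineID:       2
Minimum recovery ending location:     0/30000D0
"""


def parse_controldata(text):
    """Map each 'label: value' line of pg_controldata output to its value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def checkpoint_timeline(text):
    """Timeline ID recorded by the latest checkpoint in the control file."""
    return int(parse_controldata(text)["Latest checkpoint's TimeLineID"])


def lsn_to_int(lsn):
    """Convert an LSN written as 'hi/lo' (hexadecimal) to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


if __name__ == "__main__":
    fields = parse_controldata(SAMPLE)
    print("timeline:", checkpoint_timeline(SAMPLE))
    print("checkpoint LSN:", lsn_to_int(fields["Latest checkpoint location"]))
```

The values parsed this way are only as fresh as the last checkpoint written to
the control file, which is the reason for forcing one before comparing nodes.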

-- 
Jehan-Guillaume de Rorthais
Dalibo

