[ClusterLabs] PAF fails to promote slave: Can not get current node LSN location

Tiemen Ruiten t.ruiten at tech-lab.io
Mon Jul 8 13:27:00 EDT 2019


On Mon, Jul 8, 2019 at 4:59 PM Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
wrote:

> On Mon, 8 Jul 2019 13:56:49 +0200
> Tiemen Ruiten <t.ruiten at tech-lab.io> wrote:
>
> > Thank you for the clear explanation and advice.
> >
> > Hardware is adequate: 8x SSD and 20 cores per node, but I should note that
> > the filesystem is ZFS (stripe of mirrors) and there seems to be evidence
> > that the way the WAL writer allocates space and ZFS' Copy-on-Write nature
> > don't play nice. A patch that adds several GUCs to improve the situation
>
> Wait, how will better WAL write performance help you there? Checkpoints
> do not write to the WAL; they sync data from shared buffers to the data
> files (heap, toast, index, internal stuff, etc). WAL write performance is
> related to the number of xacts you can achieve per second (if you have
> synchronous_commit >= local), not to your checkpoint writes.
>

Wow, I completely misunderstood how that works then. This makes much more
sense (obviously..).


>
> > (at least it's worth trying, there was some disagreement on the
> > pgsql-general list over whether it would be helpful in my situation)
>
> Do you have a link to this thread?
>

https://www.postgresql.org/message-id/flat/CAEkBuzeno6ztiM1g4WdzKRJFgL8b2nfePNU%3Dq3sBiEZUm-D-sQ%40mail.gmail.com

I have already managed to reduce the average checkpoint duration compared
to what I mentioned in that thread, mainly by decreasing checkpoint_timeout
and setting full_page_writes = off, since full-page writes are ostensibly
not necessary on ZFS.
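
For anyone who wants to check the same thing on their own cluster, here is a
minimal sketch of how the average checkpoint duration could be measured. It
assumes log_checkpoints = on and the usual "checkpoint complete" log line
layout of PostgreSQL 10 and later. The sample lines, the log prefix and the
function names are made up for illustration; they are not taken from my
servers.

```python
import re

# Timing fields of a PostgreSQL "checkpoint complete" log line
# (assumes log_checkpoints = on; the log prefix does not matter).
CHECKPOINT_RE = re.compile(
    r"checkpoint complete:.*?write=(?P<write>[\d.]+) s, "
    r"sync=(?P<sync>[\d.]+) s, total=(?P<total>[\d.]+) s"
)


def checkpoint_totals(lines):
    """Return the total= duration, in seconds, of every checkpoint found."""
    totals = []
    for line in lines:
        match = CHECKPOINT_RE.search(line)
        if match:
            totals.append(float(match.group("total")))
    return totals


def average_checkpoint_seconds(lines):
    """Average total checkpoint duration, or None if no checkpoint is logged."""
    totals = checkpoint_totals(lines)
    return sum(totals) / len(totals) if totals else None


# Hypothetical log excerpt, invented for this example.
SAMPLE_LOG = [
    "2019-07-08 12:00:30 UTC [1234] LOG:  checkpoint complete: wrote 1000 "
    "buffers (0.8%); 0 WAL file(s) added, 0 removed, 1 recycled; "
    "write=29.5 s, sync=0.5 s, total=30.0 s; sync files=10, "
    "longest=0.1 s, average=0.05 s; distance=1000 kB, estimate=1000 kB",
    "2019-07-08 12:01:00 UTC [1234] LOG:  checkpoint starting: time",
    "2019-07-08 12:01:10 UTC [1234] LOG:  checkpoint complete: wrote 200 "
    "buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; "
    "write=9.8 s, sync=0.2 s, total=10.0 s; sync files=4, "
    "longest=0.1 s, average=0.05 s; distance=200 kB, estimate=900 kB",
]

if __name__ == "__main__":
    print(average_checkpoint_seconds(SAMPLE_LOG))
```

Comparing this average before and after a change to checkpoint_timeout should
show whether the change had any effect.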


>
> > has recently been merged: https://postgrespro.com/list/thread-id/2393057 but
> > it won't be available in the 11.x release. So while I'm waiting until I can
> > upgrade to PostgreSQL 12, I'll increase the notify timeout.
>
> Do not hold your breath until you upgrade to 12; I'm not convinced (but I
> might be missing something) that this patch is useful to you.
>

Yes, since I have synchronous_commit = off everywhere, I no longer see how
it could help me either.


>
> > A larger RTO is much preferable to manual actions in the middle of
> > the night!
>
> Sure... but it depends on the use case :)
>
> > Thanks again!
>
> You're welcome and good luck!
>
> --
> Jehan-Guillaume de Rorthais
> Dalibo
>