<div dir="ltr"><div dir="ltr">On Mon, Jul 8, 2019 at 4:59 PM Jehan-Guillaume de Rorthais <<a href="mailto:jgdr@dalibo.com">jgdr@dalibo.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, 8 Jul 2019 13:56:49 +0200<br>
Tiemen Ruiten <<a href="mailto:t.ruiten@tech-lab.io" target="_blank">t.ruiten@tech-lab.io</a>> wrote:<br>
<br>
> Thank you for the clear explanation and advice.<br>
> <br>
> Hardware is adequate: 8x SSD and 20 cores per node, but I should note that<br>
> the filesystem is ZFS (stripe of mirrors) and there seems to be evidence<br>
> that the way the WAL writer allocates space and ZFS' Copy-on-Write nature<br>
> don't play nice. A patch that adds several GUCs to improve the situation<br>
<br>
Wait, how better performances on WAL writes will help you there? Checkpoints<br>
does not writes to WAL, it actually sync data from shared buffers to data<br>
files (heap, toast, index, internal stuffs, etc). Write performances to WAL is<br>
related to the number of xact you can achieve per seconds (if you have<br>
synchronous_commit >= local), not your checkpoint writes.<br></blockquote><div><br></div><div>Wow, I completely misunderstood how that works then. This makes much more sense (obviously..).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> (at least it's worth trying, there was some disagreement on the<br>
> pgsql-general list over whether it would be helpful in my situation)<br>
<br>
Do you have a link to this thread ?<br></blockquote><div><br></div><div><a href="https://www.postgresql.org/message-id/flat/CAEkBuzeno6ztiM1g4WdzKRJFgL8b2nfePNU%3Dq3sBiEZUm-D-sQ%40mail.gmail.com">https://www.postgresql.org/message-id/flat/CAEkBuzeno6ztiM1g4WdzKRJFgL8b2nfePNU%3Dq3sBiEZUm-D-sQ%40mail.gmail.com</a></div><div><br></div><div>I managed to improve the average time checkpoints are taking already from what I mentioned in that thread, mainly by decreasing checkpoint_timeout and setting full_page_writes = off; ostensibly not necessary on ZFS.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> has recently been merged: <a href="https://postgrespro.com/list/thread-id/2393057" rel="noreferrer" target="_blank">https://postgrespro.com/list/thread-id/2393057</a> but<br>
> it won't be available in the 11.x release . So while I'm waiting until I can<br>
> upgrade to PostgreSQL 12, I'll increase the notify timeout. <br>
<br>
Do not hold your breath until you upgrade to 12, I'm not convinced (but I<br>
might miss something) this patch is useful to you.<br></blockquote><div><br></div><div>Yes, since I have synchronous_commit = off; everywhere, I also don't see anymore how it could help me.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> A larger RTO is much more preferrable over manual actions in the middle of<br>
> the night!<br>
<br>
sure...but it depend on the usecase :)<br>
<br>
> Thanks again!<br>
<br>
You're welcome and good luck!<br><br>
-- <br>
Jehan-Guillaume de Rorthais<br>
Dalibo<br>
</blockquote></div></div>