<div dir="ltr"><div dir="ltr">On Mon, Jul 8, 2019 at 4:59 PM Jehan-Guillaume de Rorthais <<a href="mailto:jgdr@dalibo.com">jgdr@dalibo.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, 8 Jul 2019 13:56:49 +0200<br>

Tiemen Ruiten <<a href="mailto:t.ruiten@tech-lab.io" target="_blank">t.ruiten@tech-lab.io</a>> wrote:<br>

<br>

> Thank you for the clear explanation and advice.<br>

> <br>

> Hardware is adequate: 8x SSD and 20 cores per node, but I should note that<br>

> the filesystem is ZFS (stripe of mirrors) and there seems to be evidence<br>

> that the way the WAL writer allocates space and ZFS' Copy-on-Write nature<br>

> don't play nice. A patch that adds several GUCs to improve the situation<br>

<br>

Wait, how better performances on WAL writes will help you there? Checkpoints<br>

does not writes to WAL, it actually sync data from shared buffers to data<br>

files (heap, toast, index, internal stuffs, etc). Write performances to WAL is<br>

related to the number of xact you can achieve per seconds (if you have<br>

synchronous_commit >= local), not your checkpoint writes.<br></blockquote><div><br></div><div>Wow, I completely misunderstood how that works then. This makes much more sense (obviously..).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> (at least it's worth trying, there was some disagreement on the<br>

> pgsql-general list over whether it would be helpful in my situation)<br>

<br>

Do you have a link to this thread ?<br></blockquote><div><br></div><div><a href="https://www.postgresql.org/message-id/flat/CAEkBuzeno6ztiM1g4WdzKRJFgL8b2nfePNU%3Dq3sBiEZUm-D-sQ%40mail.gmail.com">https://www.postgresql.org/message-id/flat/CAEkBuzeno6ztiM1g4WdzKRJFgL8b2nfePNU%3Dq3sBiEZUm-D-sQ%40mail.gmail.com</a></div><div><br></div><div>I managed to improve the average time checkpoints are taking already from what I mentioned in that thread, mainly by decreasing checkpoint_timeout and setting full_page_writes = off; ostensibly not necessary on ZFS.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> has recently been merged: <a href="https://postgrespro.com/list/thread-id/2393057" rel="noreferrer" target="_blank">https://postgrespro.com/list/thread-id/2393057</a> but<br>

> it won't be available in the 11.x release . So while I'm waiting until I can<br>

> upgrade to PostgreSQL 12, I'll increase the notify timeout. <br>

<br>

Do not hold your breath until you upgrade to 12, I'm not convinced (but I<br>

might miss something) this patch is useful to you.<br></blockquote><div><br></div><div>Yes, since I have synchronous_commit = off; everywhere, I also don't see anymore how it could help me.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> A larger RTO is much more preferrable over manual actions in the middle of<br>

> the night!<br>

<br>

sure...but it depend on the usecase :)<br>

<br>

> Thanks again!<br>

<br>

You're welcome and good luck!<br><br>

-- <br>

Jehan-Guillaume de Rorthais<br>

Dalibo<br>

</blockquote></div></div>