[ClusterLabs] PostgreSQL PAF failover issue

Fri Jun 14 10:04:31 EDT 2019

I've crossposted the question about checkpoints taking a long time to
pgsql-general as well :)
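
For reference, a rough back-of-the-envelope sketch of why the settings
quoted below could plausibly miss a 60s timeout. This is only an
estimate under assumptions, not a measurement: it assumes the worst case
where all of shared_buffers (8GB) is dirty, and that the checkpoint
issued at stop/demote time is written out immediately rather than being
spread out by checkpoint_completion_target. The function names are made
up for illustration.

```python
# Rough estimate only: compares the write rate of a spread (timed) checkpoint
# with the rate needed to flush the same data inside a 60s operation timeout.
# Assumption: worst case, every page in shared_buffers is dirty.

def spread_window_s(checkpoint_timeout_s: float, completion_target: float) -> float:
    """Time over which a timed checkpoint spreads its writes."""
    return checkpoint_timeout_s * completion_target


def required_rate_mb_s(dirty_mb: float, deadline_s: float) -> float:
    """Average write rate needed to flush dirty_mb within deadline_s."""
    return dirty_mb / deadline_s


# Values taken from the settings quoted in this thread.
shared_buffers_mb = 8 * 1024        # shared_buffers = 8GB
checkpoint_timeout_s = 60 * 60      # checkpoint_timeout = 60min
completion_target = 0.9             # checkpoint_completion_target = 0.9
operation_timeout_s = 60            # the 60s timeout mentioned in the thread

window_s = spread_window_s(checkpoint_timeout_s, completion_target)
timed_rate = required_rate_mb_s(shared_buffers_mb, window_s)
urgent_rate = required_rate_mb_s(shared_buffers_mb, operation_timeout_s)

print(f"timed checkpoint window: {window_s / 60:.0f} min")
print(f"rate if spread over that window: {timed_rate:.1f} MB/s")
print(f"rate needed to finish in {operation_timeout_s}s: {urgent_rate:.1f} MB/s")
```

With a 60min checkpoint_timeout a lot of dirty data can pile up between
checkpoints, so the gap between the two rates is large. Lowering
checkpoint_timeout would reduce how much there is to flush at once, at
the cost of more frequent checkpoints.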

On Fri, 14 Jun 2019 at 15:05, Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:

> Current size of the database is around 600GB uncompressed (LZ4 compression
> is enabled on the ZFS dataset).
>
> On Fri, 14 Jun 2019 at 14:59, Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:
>
>> Hi, yes, I'm also puzzled by this. The cluster is certainly not
>> underpowered: it runs on bare metal with 8x SSD in a ZFS stripe of
>> mirrors and 128 GB RAM, and shared_buffers is set to 8GB.
>>
>> Other related settings:
>>
>> wal_buffers = 128MB
>> checkpoint_timeout = 60min
>> max_wal_size = 8GB
>> min_wal_size = 1GB
>> checkpoint_completion_target = 0.9
>>
>> I wonder if checkpoint_timeout should be lowered?
>>
>>
>> On Fri, 14 Jun 2019 at 14:49, Adrien Nayrat <adrien.nayrat at anayrat.info>
>> wrote:
>>
>>> On 6/14/19 12:27 PM, Tiemen Ruiten wrote:
>>> > This took longer than the configured timeout of 60s (checkpoint hadn't
>>> > completed yet) and the node was fenced.
>>>
>>>
>>> It's surprising that the checkpoint took longer than 60s. What is the
>>> size of your shared_buffers? What kind of hardware do you use (bare
>>> metal, virtualized...)?
>>>
>>>
>>> --
>>> Adrien
>>>
>>>
>>
>> --
>> Tiemen Ruiten
>> Systems Engineer
>> R&D Media
>>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>

-- 
Tiemen Ruiten
Systems Engineer
R&D Media