[ClusterLabs] PostgreSQL PAF failover issue

Fri Jun 14 07:18:09 EDT 2019

Thank you, useful advice!

Logs are attached; they cover the period from when I set
maintenance-mode=false until after the node was fenced.

On Fri, 14 Jun 2019 at 12:48, Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
wrote:

> Hi,
>
> On Fri, 14 Jun 2019 12:27:12 +0200
> Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:
> > I set up a new 3-node PostgreSQL cluster with HA managed by PAF. Nodes are
> > named ph-sql-03, ph-sql-04, ph-sql-05. Archive mode is on, and archive
> > files are written with pgBackRest to an NFS share that's mounted on all
> > nodes.
> >
> > What I did:
> > - Create a pacemaker cluster, cib.xml is attached.
> > - Set maintenance-mode=true in pacemaker
>
> This is not required. Just build your PgSQL replication, shut down the
> instances, then add the PAF resource to the cluster.
>
> But it's not very important here.
>
> > - Bring up ph-sql-03 with pg_ctl start
> > - Take a pg_basebackup on ph-sql-04 and ph-sql-05
> > - Create a recovery.conf on ph-sql-04 and ph-sql-05:
> >
> > standby_mode = 'on'
> > primary_conninfo = 'user=replication password=XXXXXXXXXXXXXXXX
> > application_name=ph-sql-0x host=10.100.130.20 port=5432 sslmode=prefer
> > sslcompression=0 krbsrvname=postgres target_session_attrs=any'
> > recovery_target_timeline = 'latest'
> > restore_command = 'pgbackrest --stanza=pgdb2 archive-get %f "%p"'
>
> Sounds fine.
>
> > - Bring up ph-sql-04 and ph-sql-05 and let recovery finish
> > - Set maintenance-mode=false in pacemaker
> > - Cluster is now running with ph-sql-03 as master and ph-sql-04/5 as
> >   slaves
> > At this point I tried a manual failover:
> > - pcs resource move --wait --master pgsql-ha ph-sql-04
> > Contrary to my expectations, pacemaker attempted to stop pgsqld on
> > ph-sql-03.
>
> Indeed. PostgreSQL doesn't support hot-demote. It has to be shut down and
> restarted as a standby.
>
> > This took longer than the configured timeout of 60s (checkpoint
> > hadn't completed yet) and the node was fenced.
>
> 60s of checkpoint during a maintenance window? That's significant indeed. I
> would recommend doing a manual checkpoint before triggering the
> move/switchover.
>
> > Then I ended up with
> > ph-sql-04 and ph-sql-05 both in slave mode and ph-sql-03 rebooting.
> >
> >  Master: pgsql-ha
> >   Meta Attrs: notify=true
> >   Resource: pgsqld (class=ocf provider=heartbeat type=pgsqlms)
> >    Attributes: bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/11/data
> > recovery_template=/var/lib/pgsql/recovery.conf.pcmk
> >    Operations: demote interval=0s timeout=30s (pgsqld-demote-interval-0s)
> >                methods interval=0s timeout=5 (pgsqld-methods-interval-0s)
> >                monitor interval=15s role=Master timeout=10s
> > (pgsqld-monitor-interval-15s)
> >                monitor interval=16s role=Slave timeout=10s
> > (pgsqld-monitor-interval-16s)
> >                notify interval=0s timeout=60s (pgsqld-notify-interval-0s)
> >                promote interval=0s timeout=30s
> > (pgsqld-promote-interval-0s)
> >                reload interval=0s timeout=20 (pgsqld-reload-interval-0s)
> >                start interval=0s timeout=60s (pgsqld-start-interval-0s)
> >                stop interval=0s timeout=60s (pgsqld-stop-interval-0s)
> >
> > I understand I should at least increase the timeout of the stop operation
> > for pgsqld, though I'm not sure how much. Checkpoints can take up to 15
> > minutes to complete on this cluster. So is 20 minutes reasonable?
>
> 20 minutes is not reasonable for HA. 2 minutes is for manual procedure.
> Timeouts are there so the cluster knows how to react during an unexpected
> failure, not during maintenance.
>
> As I wrote, just add a manual checkpoint in your switchover procedure
> before the actual move.
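
A minimal sketch of that procedure, for illustration only. It assumes it is
run on the current master (ph-sql-03 here) by a user that can connect locally
as the PostgreSQL superuser. The "-U postgres -d postgres" options, the run
and switchover helper names and the DRY_RUN switch are assumptions made for
this example, not part of PAF or pcs. The shutdown checkpoint still happens
during the stop, but it should be much shorter right after a manual one. The
final "pcs resource clear" removes the location constraint that
"pcs resource move" leaves behind. With DRY_RUN=1 (the default below) the
commands are only printed.

```shell
#!/bin/sh
# Sketch only: manual checkpoint, then the switchover from the original steps.
# DRY_RUN=1 prints the commands instead of running them (hypothetical switch).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is the node to promote, e.g. ph-sql-04.
switchover() {
    target="$1"
    # Flush dirty buffers now so the checkpoint at shutdown has little to do.
    run psql -U postgres -d postgres -c 'CHECKPOINT' || return 1
    # Same move command as in the original procedure.
    run pcs resource move --wait --master pgsql-ha "$target" || return 1
    # Remove the location constraint created by the move.
    run pcs resource clear pgsql-ha
}
```

Sourcing the file and calling "switchover ph-sql-04" prints the three commands
in order; setting DRY_RUN=0 executes them instead.
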
>
> > Any other operations I should increase the timeouts for?
> >
> > Why didn't pacemaker elect and promote one of the other nodes?
>
> Do you have logs of all nodes during this time period?
>
>

-- 
Tiemen Ruiten
Systems Engineer
R&D Media
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync_ph-sql-05.log
Type: application/octet-stream
Size: 261534 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20190614/bc51f7b3/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync_ph-sql-03.log
Type: application/octet-stream
Size: 329276 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20190614/bc51f7b3/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync_ph-sql-04.log
Type: application/octet-stream
Size: 209649 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20190614/bc51f7b3/attachment-0005.obj>