[ClusterLabs] mayhem when exiting maintenance mode

Tue Feb 4 09:39:04 EST 2020

We have a three-node Postgres cluster running on Ubuntu 14.04, currently at
Postgres 9.5 with Corosync 2.4.2 and Pacemaker 1.1.18.

I'm trying to automate upgrading the database to 11.4.  (Our product is a
network appliance, so the upgrade needs to be automated for our customers.)

I first put the cluster into maintenance mode, perform the upgrade, update
the resource paths in the crm config to point to the new db instance, and
then restore the db from the old version (required by Postgres for major
version upgrades).  At the end of all these steps everything looks good.
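
For reference, the sequence above can be sketched roughly as follows. This
is only an illustration in Python, not our actual automation. The
`crm configure property maintenance-mode=...` commands are the usual crmsh
way to toggle maintenance mode; the three script names are hypothetical
placeholders for the upgrade, config-update, and restore steps. The sketch
defaults to a dry run that only prints the commands.

```python
import subprocess


def maintenance_cmd(enabled):
    """Build the crmsh command that sets the cluster-wide maintenance-mode property."""
    value = "true" if enabled else "false"
    return ["crm", "configure", "property", f"maintenance-mode={value}"]


def upgrade_plan(upgrade_steps):
    """Wrap the given steps between entering and leaving maintenance mode."""
    return [maintenance_cmd(True), *upgrade_steps, maintenance_cmd(False)]


def run_plan(plan, dry_run=True):
    """Print each command in order; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)


# Hypothetical placeholder scripts; the real steps are site-specific.
steps = [
    ["./upgrade_postgres.sh"],         # hypothetical: install and init the new Postgres version
    ["./update_resource_paths.sh"],    # hypothetical: point the crm resources at the new instance
    ["./restore_from_old_version.sh"], # hypothetical: restore the old database into the new one
]

plan = upgrade_plan(steps)
run_plan(plan)  # dry run: prints the five commands without executing them
```

The last command in the plan is the one that triggers the behaviour
described below.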

But when I turn off maintenance mode all of my db nodes suddenly go down
and all three appear to be in slave mode, with no master.  If I wait a few
minutes it appears that node 2 takes over as master, but it has an empty
database, because apparently it wasn't able to replicate the restored db
from the original master yet.  Can anyone tell me what is causing this?

Derek Viljoen
derekv at infinite.io