[ClusterLabs] PostgreSQL cluster with Pacemaker+PAF problems
Aleksandra C
aleksandra29c at gmail.com
Thu Mar 5 06:21:14 EST 2020
Hello community,
I would appreciate some help from you.
I have configured a PostgreSQL cluster with Pacemaker+PAF. The Pacemaker
configuration is the following (from
https://clusterlabs.github.io/PAF/Quick_Start-CentOS-7.html):
# pgsqld
pcs -f cluster1.xml resource create pgsqld ocf:heartbeat:pgsqlms \
bindir=/usr/pgsql-9.6/bin pgdata=/var/lib/pgsql/9.6/data \
op start timeout=60s \
op stop timeout=60s \
op promote timeout=30s \
op demote timeout=120s \
op monitor interval=15s timeout=10s role="Master" \
op monitor interval=16s timeout=10s role="Slave" \
op notify timeout=60s
# pgsql-ha
pcs -f cluster1.xml resource master pgsql-ha pgsqld notify=true
pcs -f cluster1.xml resource create pgsql-master-ip ocf:heartbeat:IPaddr2 \
ip=192.168.122.50 cidr_netmask=24 op monitor interval=10s
pcs -f cluster1.xml constraint colocation add pgsql-master-ip with \
master pgsql-ha INFINITY
pcs -f cluster1.xml constraint order promote pgsql-ha then start \
pgsql-master-ip symmetrical=false kind=Mandatory
pcs -f cluster1.xml constraint order demote pgsql-ha then stop \
pgsql-master-ip symmetrical=false kind=Mandatory
I use the fence_xvm fencing agent, with the following configuration:
pcs -f cluster1.xml stonith create fence1 fence_xvm \
pcmk_host_check="static-list" pcmk_host_list="srv1" port="srv-m1" \
multicast_address=224.0.0.2
pcs -f cluster1.xml stonith create fence2 fence_xvm \
pcmk_host_check="static-list" pcmk_host_list="srv2" port="srv-m2" \
multicast_address=224.0.0.2
pcs -f cluster1.xml constraint location fence1 avoids srv1=INFINITY
pcs -f cluster1.xml constraint location fence2 avoids srv2=INFINITY
The cluster is behaving strangely. When I manually fence the master
node (or shut it down ungracefully) and then unfence/start it again, the
node has status Failed/blocked and is constantly fenced (restarted) by the
fencing agent. Should fencing recover the cluster as Master/Slave
without problems? The error log says that the demote action on the node
has failed:
warning: Action 10 (pgsqld_demote_0) on server1 failed (target: 0 vs. rc: 1): Error
warning: Processing failed op demote for pgsqld:1 on server1: unknown error (1)
warning: Forcing pgsqld:1 to stop after a failed demote action
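To make the failure easier to read, the "Action ... failed" warning can be reduced to its key fields. The filter below is only a sketch: the function name failed_actions is made up for illustration, and it assumes each warning sits on a single, unwrapped line in exactly the format quoted here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker or PAF): read log lines on
# stdin and print one summary line per "Action N (...) on NODE failed" warning.
# Assumption: the operation key has the form <resource>_<operation>_<interval>.
failed_actions() {
    sed -n 's/.*Action [0-9]* (\([A-Za-z0-9_-]*\)_\([a-z]*\)_[0-9]*) on \([^ ]*\) failed (target: \([0-9]*\) vs\. rc: \([0-9]*\)).*/resource=\1 op=\2 node=\3 expected=\4 got=\5/p'
}

# Example: feed it the first warning from the log excerpt.
printf '%s\n' \
  'warning: Action 10 (pgsqld_demote_0) on server1 failed (target: 0 vs. rc: 1): Error' \
  | failed_actions
```

In OCF terms, the expected value 0 is OCF_SUCCESS and the returned value 1 is OCF_ERR_GENERIC, which matches the "unknown error (1)" text in the second warning.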
Is this a cluster misconfiguration? Any idea would be greatly appreciated.
Thank you in advance,
Aleksandra