[ClusterLabs] Postgresql+Pacemaker+Corosync unexpected behavior
Sergey Cherukhin
sergey.cherukhin at gmail.com
Mon Oct 2 00:43:21 EDT 2023
Hello!
I have configured PostgreSQL+Pacemaker+Corosync on 3 nodes: 2 of them
form the PostgreSQL HA cluster and the third acts as a witness.
3 nodes configured
4 resource instances configured
Online: [ witness wizard1 wizard2 ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started wizard1
Master/Slave Set: mspgsql [pgsql]
Masters: [ wizard1 ]
Slaves: [ wizard2 ]
ExternalIP (ocf::heartbeat:IPaddr2): Started wizard1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
aptitude versions pacemaker corosync postgresql
Package corosync:
i A 2.4.2-3+deb9u1    stable    900
Package pacemaker:
i   1.1.24-0+deb9u1   stable    900
Package postgresql:
i A 9.6+200astra8     stable    900
After rebooting the slave, it rejoins the cluster in this state:
Node Attributes:
* Node witness:
* Node wizard1:
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 00000000070028D8
+ pgsql-status : PRI
* Node wizard2:
+ master-pgsql : -INFINITY
+ pgsql-data-status : STREAMING|ASYNC
+ pgsql-status : HS:async
although at the same time pg_stat_replication reports the replication as
synchronous:
postgres=# SELECT pid,usename,application_name,state,sync_state FROM
pg_stat_replication;
pid | usename | application_name | state | sync_state
------+----------+------------------+-----------+------------
6569 | postgres | wizard2 | streaming | sync
If I run the command "sudo pcs resource cleanup" on the slave, the cluster
goes into this state:
Node Attributes:
* Node witness:
* Node wizard1:
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 00000000070028D8
+ pgsql-status : PRI
* Node wizard2:
+ master-pgsql : 100
+ pgsql-data-status : STREAMING|SYNC
+ pgsql-status : HS:sync
Sometimes, after running "pcs resource cleanup", the value of master-pgsql
remains -INFINITY. In that case, running "pcs resource cleanup" a second
time sets master-pgsql to 100.
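
For reference, a minimal sketch of how the master-pgsql score could be
pulled out of a "Node Attributes" listing like the ones in this message, so
that the check can be repeated after each cleanup. The function name
master_score is made up for illustration, and the parsing assumes the
listing format shown here; it has not been checked against other Pacemaker
versions.

```shell
#!/bin/sh
# Hypothetical helper: print the master-pgsql score of one node.
# Reads a "Node Attributes" listing from standard input.
# Assumes lines of the form "* Node NAME:" and "+ master-pgsql : VALUE".
master_score() {
    awk -v node="$1" '
        # Remember which node the following attribute lines belong to.
        $1 == "*" && $2 == "Node" { sub(/:$/, "", $3); current = $3 }
        # Print the score (last field) when it belongs to the requested node.
        $1 == "+" && $2 == "master-pgsql" && current == node { print $NF }
    '
}
```

On a live cluster the listing could be fed in with something like
"crm_mon -A -1 | master_score wizard2" (one-shot output with node
attributes); for the first state above this would print -INFINITY.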
What could be the cause of this behavior, and how serious is it in terms of
data safety? PostgreSQL claims that synchronous replication is running.
May I ignore this behavior?
And the second issue: when the slave is rebooted, the following entry
appears in the PostgreSQL log:
postgres@template1 FATAL: the database system is starting up
What does this mean?
Best regards,
Sergey Cherukhin