[ClusterLabs] Postgres never promoted
Alexandre
alxgomz at gmail.com
Fri Feb 20 19:59:15 UTC 2015
Hi list,
I am facing a very strange issue.
I have set up a PostgreSQL cluster (with streaming replication).
The replication works ok when started manually but the RA seems to never
promote any host where the resource is started.
I am running Pacemaker 1.12 on CentOS 6.6 (and I added crmsh from an
openSUSE repo, as I am used to it).
My config is below:
node pp-obm-sgbd.upond.fr
node pp-obm-sgbd2.upond.fr \
attributes pri_pgsql-data-status=DISCONNECT
primitive pri_obm-locator lsb:obm-locator \
params \
op start interval=0s timeout=60s \
op stop interval=0s timeout=60s \
op monitor interval=10s timeout=20s
primitive pri_pgsql pgsql \
params pgctl="/usr/pgsql-9.1/bin/pg_ctl" psql="/usr/pgsql-9.1/bin/psql" \
pgdata="/var/lib/pgsql/9.1/data/" \
node_list="pp-obm-sgbd.upond.fr pp-obm-sgbd2.upond.fr" \
repuser=replication rep_mode=sync restart_on_promote=true \
restore_command="cp /var/lib/pgsql/replication/%f %p" \
primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
master_ip=193.50.151.200 \
op start interval=0 on-fail=restart timeout=120s \
op monitor interval=20s on-fail=restart timeout=60s \
op monitor interval=15s on-fail=restart role=Master timeout=60s \
op promote interval=0 on-fail=restart timeout=120s \
op demote interval=0 on-fail=stop timeout=120s \
op notify interval=0s timeout=60s \
op stop interval=0 on-fail=block timeout=120s
primitive pri_vip IPaddr2 \
params ip=193.50.151.200 nic=eth1 cidr_netmask=32 \
op start interval=0s timeout=60s \
op monitor interval=10s timeout=60s \
op stop interval=0s timeout=60s
ms ms_pgsql pri_pgsql \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1
colocation clc_vip-ms_pgsql inf: pri_vip:Started ms_pgsql:Master
order ord_dm_pgsql-vip 0: ms_pgsql:demote pri_vip:stop
order ord_pm_pgsql-vip 0: ms_pgsql:promote pri_vip:start symmetrical=false
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure=cman \
last-lrm-refresh=1424459378 \
no-quorum-policy=ignore \
stonith-enabled=false \
maintenance-mode=false
rsc_defaults rsc_defaults-options: \
resource-stickiness=1000 \
migration-threshold=5
crm_mon shows both hosts as slaves, and neither is ever promoted:
Master/Slave Set: ms_pgsql [pri_pgsql]
Slaves: [ pp-obm-sgbd.upond.fr pp-obm-sgbd2.upond.fr ]
Node Attributes:
* Node pp-obm-sgbd.upond.fr:
+ master-pri_pgsql : 1000
+ pri_pgsql-status : HS:alone
+ pri_pgsql-xlog-loc : 000000002D000078
* Node pp-obm-sgbd2.upond.fr:
+ master-pri_pgsql : -INFINITY
+ pri_pgsql-data-status : DISCONNECT
+ pri_pgsql-status : HS:alone
+ pri_pgsql-xlog-loc : 000000002D000000
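To compare the two nodes side by side, here is a minimal Python sketch that parses the attribute dump above into a dictionary. The function names are hypothetical, the input is the text exactly as pasted in this mail (real crm_mon output is indented and may differ between versions), and reading pri_pgsql-xlog-loc as a hexadecimal number is my assumption.

```python
import re

# Attribute dump copied from the crm_mon output in this mail.
CRM_MON_ATTRS = """\
* Node pp-obm-sgbd.upond.fr:
+ master-pri_pgsql : 1000
+ pri_pgsql-status : HS:alone
+ pri_pgsql-xlog-loc : 000000002D000078
* Node pp-obm-sgbd2.upond.fr:
+ master-pri_pgsql : -INFINITY
+ pri_pgsql-data-status : DISCONNECT
+ pri_pgsql-status : HS:alone
+ pri_pgsql-xlog-loc : 000000002D000000
"""


def parse_node_attributes(text):
    """Return {node: {attribute: value}} from a 'Node Attributes' dump."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        node_match = re.match(r"\* Node (\S+):$", line)
        if node_match:
            current = nodes.setdefault(node_match.group(1), {})
            continue
        attr_match = re.match(r"\+ (\S+)\s*:\s*(.*)$", line)
        if attr_match and current is not None:
            current[attr_match.group(1)] = attr_match.group(2).strip()
    return nodes


def summarize(nodes):
    """Pick out the attributes that matter for promotion, one row per node."""
    rows = []
    for name, attrs in nodes.items():
        rows.append({
            "node": name,
            "master_score": attrs.get("master-pri_pgsql"),
            # Empty string when the attribute is not set on that node.
            "data_status": attrs.get("pri_pgsql-data-status", ""),
            # Assumption: the xlog location is a plain hexadecimal number.
            "xlog_loc": int(attrs.get("pri_pgsql-xlog-loc", "0"), 16),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(parse_node_attributes(CRM_MON_ATTRS)):
        print(row)
```

Read this way, pp-obm-sgbd.upond.fr has the higher xlog location and a master score of 1000 but no pri_pgsql-data-status attribute at all, while pp-obm-sgbd2.upond.fr has a score of -INFINITY and a data status of DISCONNECT.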
On the host where I am expecting promotion, I see the following when doing cleanups:
Feb 20 20:15:07 pp-obm-sgbd pgsql(pri_pgsql)[30994]: INFO: Master does not exist.
Feb 20 20:15:07 pp-obm-sgbd pgsql(pri_pgsql)[30994]: INFO: My data status=.
And on the other node I see the following logs, which look interesting:
Feb 20 20:16:10 pp-obm-sgbd2 crmd[19626]: notice: print_synapse: [Action 18]: Pending pseudo op ms_pgsql_promoted_0 on N/A (priority: 1000000, waiting: 11)
Feb 20 20:16:10 pp-obm-sgbd2 crmd[19626]: notice: print_synapse: [Action 17]: Pending pseudo op ms_pgsql_promote_0 on N/A (priority: 0, waiting: 21)
The N/A part seems to tell me that the cluster doesn't know where to promote the
resource, but I can't understand why.
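To pull the pending actions out of a longer log, here is a small Python sketch that extracts the fields from the two print_synapse entries, with the wrapped lines rejoined. The regular expression is fitted only to these two lines, so treat the field names as my own labels rather than Pacemaker's. Both entries are marked as pseudo ops, and I have not verified whether N/A is normal for that kind of entry.

```python
import re

# The two crmd lines from this mail, with the mail wrapping undone.
LOG = """\
Feb 20 20:16:10 pp-obm-sgbd2 crmd[19626]: notice: print_synapse: [Action 18]: Pending pseudo op ms_pgsql_promoted_0 on N/A (priority: 1000000, waiting: 11)
Feb 20 20:16:10 pp-obm-sgbd2 crmd[19626]: notice: print_synapse: [Action 17]: Pending pseudo op ms_pgsql_promote_0 on N/A (priority: 0, waiting: 21)
"""

# Pattern derived from the lines above only; other variants may not match.
SYNAPSE_RE = re.compile(
    r"print_synapse:\s+\[Action\s+(?P<id>\d+)\]:\s+(?P<state>\w+)\s+"
    r"(?P<kind>\w+)\s+op\s+(?P<op>\S+)\s+on\s+(?P<node>\S+)\s+"
    r"\(priority:\s*(?P<priority>-?\d+),\s*waiting:\s*(?P<waiting>[^)]*)\)"
)


def parse_synapses(text):
    """Return one dict per print_synapse line found in the text."""
    entries = []
    for line in text.splitlines():
        match = SYNAPSE_RE.search(line)
        if not match:
            continue
        entry = match.groupdict()
        entry["id"] = int(entry["id"])
        entry["priority"] = int(entry["priority"])
        # Kept as a raw string: I am not sure how this field is encoded.
        entry["waiting"] = entry["waiting"].strip()
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in parse_synapses(LOG):
        print(entry["id"], entry["state"], entry["kind"], entry["op"],
              "on", entry["node"], "waiting:", entry["waiting"])
```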
Below are my constraint rules:
pcs constraint show
Location Constraints:
Ordering Constraints:
demote ms_pgsql then stop pri_vip (score:0)
promote ms_pgsql then start pri_vip (score:0) (non-symmetrical)
Colocation Constraints:
pri_vip with ms_pgsql (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
I am now out of ideas so any help is very much appreciated.
Regards.