[ClusterLabs] Postgres streaming VIP-REP not coming up on slave
Wynand Jansen van Vuuren
esawyja at gmail.com
Mon Mar 16 08:18:59 UTC 2015
Hi Nakahira,
Thanks so much for the info. This setting was what the wiki page suggested;
do you suggest that I take it out, or should I look at the problem of
cl2_lb1 not being promoted?
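
To check why cl2_lb1 is not promoted, a minimal sketch (assuming the
crm_simulate and crm_mon tools shipped with Pacemaker 1.1.9):

cl2_lb1:~ # crm_simulate -sL | grep -i promotion
cl2_lb1:~ # crm_mon -1 -Af

crm_simulate -sL prints the promotion scores the cluster calculated; a
-INFINITY score for pgsql on cl2_lb1 would mean the pgsql resource agent
itself is blocking the promote, rather than the constraints.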
Regards
On Mon, Mar 16, 2015 at 10:15 AM, NAKAHIRA Kazutomo <
nakahira_kazutomo_b1 at lab.ntt.co.jp> wrote:
> Hi,
>
> > Notice there are no VIPs; it looks like the VIPs depend on some other
> > resource to start first?
>
> The following constraint means that "master-group" cannot start
> without a master of the msPostgresql resource.
>
> colocation rsc_colocation-1 inf: master-group msPostgresql:Master
>
> After you power off cl1_lb1, msPostgresql on cl2_lb1 is not promoted,
> so no master exists in your cluster.
>
> This means that "master-group" cannot run anywhere.
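>
> For reference, the full dependency chain in your configuration is:
>
>   colocation rsc_colocation-1 inf: master-group msPostgresql:Master
>   order rsc_order-1 0: msPostgresql:promote master-group:start symmetrical=false
>
> so vip-master, vip-rep and the rest of "master-group" can start only
> after msPostgresql has been promoted somewhere.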
>
> Best regards,
> Kazutomo NAKAHIRA
>
>
> On 2015/03/16 16:48, Wynand Jansen van Vuuren wrote:
>
>> Hi
>> When I start out, cl1_lb1 (Cluster 1 load balancer 1) is the master,
>> as below:
>> cl1_lb1:~ # crm_mon -1 -Af
>> Last updated: Mon Mar 16 09:44:44 2015
>> Last change: Mon Mar 16 08:06:26 2015 by root via crm_attribute on cl1_lb1
>> Stack: classic openais (with plugin)
>> Current DC: cl2_lb1 - partition with quorum
>> Version: 1.1.9-2db99f1
>> 2 Nodes configured, 2 expected votes
>> 6 Resources configured.
>>
>>
>> Online: [ cl1_lb1 cl2_lb1 ]
>>
>> Resource Group: master-group
>> vip-master (ocf::heartbeat:IPaddr2): Started cl1_lb1
>> vip-rep (ocf::heartbeat:IPaddr2): Started cl1_lb1
>> CBC_instance (ocf::heartbeat:cbc): Started cl1_lb1
>> failover_MailTo (ocf::heartbeat:MailTo): Started cl1_lb1
>> Master/Slave Set: msPostgresql [pgsql]
>> Masters: [ cl1_lb1 ]
>> Slaves: [ cl2_lb1 ]
>>
>> Node Attributes:
>> * Node cl1_lb1:
>> + master-pgsql : 1000
>> + pgsql-data-status : LATEST
>> + pgsql-master-baseline : 00000008B90061F0
>> + pgsql-status : PRI
>> * Node cl2_lb1:
>> + master-pgsql : 100
>> + pgsql-data-status : STREAMING|SYNC
>> + pgsql-status : HS:sync
>>
>> Migration summary:
>> * Node cl2_lb1:
>> * Node cl1_lb1:
>> cl1_lb1:~ #
>>
>> If I then do a power off on cl1_lb1 (the master), Postgres moves to
>> cl2_lb1 (Cluster 2 load balancer 1), but VIP-MASTER and VIP-REP are not
>> pingable from the NEW master (cl2_lb1); it stays like this:
>> cl2_lb1:~ # crm_mon -1 -Af
>> Last updated: Mon Mar 16 07:32:07 2015
>> Last change: Mon Mar 16 07:28:53 2015 by root via crm_attribute on cl1_lb1
>> Stack: classic openais (with plugin)
>> Current DC: cl2_lb1 - partition WITHOUT quorum
>> Version: 1.1.9-2db99f1
>> 2 Nodes configured, 2 expected votes
>> 6 Resources configured.
>>
>>
>> Online: [ cl2_lb1 ]
>> OFFLINE: [ cl1_lb1 ]
>>
>> Master/Slave Set: msPostgresql [pgsql]
>> Slaves: [ cl2_lb1 ]
>> Stopped: [ pgsql:1 ]
>>
>> Node Attributes:
>> * Node cl2_lb1:
>> + master-pgsql : -INFINITY
>> + pgsql-data-status : DISCONNECT
>> + pgsql-status : HS:alone
>>
>> Migration summary:
>> * Node cl2_lb1:
>> cl2_lb1:~ #
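>>
>> As a cross-check, the attributes driving this can be queried directly
>> (a sketch, assuming the stock Pacemaker CLI tools on the node):
>>
>> cl2_lb1:~ # crm_attribute -l forever -N cl2_lb1 -n pgsql-data-status -G
>> cl2_lb1:~ # crm_attribute -l reboot -N cl2_lb1 -n master-pgsql -G
>>
>> DISCONNECT and -INFINITY here mean the pgsql resource agent has marked
>> cl2_lb1 as unsafe to promote until it sees a master again.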
>>
>> Notice there are no VIPs; it looks like the VIPs depend on some other
>> resource to start first?
>> Thanks for the reply!
>>
>>
>> On Mon, Mar 16, 2015 at 9:42 AM, NAKAHIRA Kazutomo <
>> nakahira_kazutomo_b1 at lab.ntt.co.jp> wrote:
>>
>>> Hi,
>>>
>>>> fine, cl2_lb1 takes over and acts as a slave, but the VIPs do not come
>>>> up.
>>>
>>> cl2_lb1 acts as a slave? It is not a master?
>>> The VIPs come up with the master msPostgresql resource.
>>>
>>> If the promote action failed on cl2_lb1, then please send the ha-log
>>> and PostgreSQL's log.
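>>>
>>> As a sketch (assuming the default ha-log location on your distribution,
>>> plus the logfile="/var/log/OCF.log" configured on the pgsql resource):
>>>
>>> grep -E "pgsql.*(promote|ERROR)" /var/log/ha-log
>>> tail -n 200 /var/log/OCF.log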
>>>
>>> Best regards,
>>> Kazutomo NAKAHIRA
>>>
>>>
>>> On 2015/03/16 16:09, Wynand Jansen van Vuuren wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have 2 nodes with 2 interfaces each. ETH0 is used for an application,
>>>> CBC, that writes to the Postgres DB on the VIP-MASTER 172.28.200.159;
>>>> ETH1 is used for the Corosync communication and VIP-REP. Everything
>>>> works, but if the master, currently cl1_lb1, has a catastrophic failure
>>>> such as a power-down, the VIPs do not start on the slave. The Postgres
>>>> part works fine: cl2_lb1 takes over and acts as a slave, but the VIPs
>>>> do not come up. If I test it manually, i.e. kill the application 3
>>>> times on the master, the switchover is smooth; the same if I kill
>>>> Postgres on the master. But when there is a power failure on the
>>>> master, the VIPs stay down. If I then delete the attributes
>>>> pgsql-data-status="LATEST" (on the master) and
>>>> pgsql-data-status="STREAMING|SYNC" (on the slave) after powering off
>>>> the master and restart everything, then the VIPs come up on the slave.
>>>> Any ideas please?
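>>>>
>>>> For reference, that manual workaround as commands (a sketch, using the
>>>> node and attribute names from this thread):
>>>>
>>>> crm_attribute -l forever -N cl1_lb1 -n pgsql-data-status -D
>>>> crm_attribute -l forever -N cl2_lb1 -n pgsql-data-status -D
>>>> crm resource cleanup msPostgresql
>>>>
>>>> -D deletes the persistent attribute, so the pgsql resource agent
>>>> re-evaluates the node's data status on the next start.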
>>>> I'm using this setup
>>>> http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster
>>>>
>>>> With this configuration below
>>>> node cl1_lb1 \
>>>>     attributes pgsql-data-status="LATEST"
>>>> node cl2_lb1 \
>>>>     attributes pgsql-data-status="STREAMING|SYNC"
>>>> primitive CBC_instance ocf:heartbeat:cbc \
>>>>     op monitor interval="60s" timeout="60s" on-fail="restart" \
>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>     meta target-role="Started" migration-threshold="3" failure-timeout="60s"
>>>> primitive failover_MailTo ocf:heartbeat:MailTo \
>>>>     params email="wynandj at rorotika.com" subject="Cluster Status change - " \
>>>>     op monitor interval="10" timeout="10" depth="0"
>>>> primitive pgsql ocf:heartbeat:pgsql \
>>>>     params pgctl="/opt/app/PostgreSQL/9.3/bin/pg_ctl" \
>>>>         psql="/opt/app/PostgreSQL/9.3/bin/psql" \
>>>>         config="/opt/app/pgdata/9.3/postgresql.conf" pgdba="postgres" \
>>>>         pgdata="/opt/app/pgdata/9.3/" start_opt="-p 5432" rep_mode="sync" \
>>>>         node_list="cl1_lb1 cl2_lb1" \
>>>>         restore_command="cp /pgtablespace/archive/%f %p" \
>>>>         primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
>>>>         master_ip="172.16.0.5" restart_on_promote="false" \
>>>>         logfile="/var/log/OCF.log" \
>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>     op monitor interval="4s" timeout="60s" on-fail="restart" \
>>>>     op monitor interval="3s" role="Master" timeout="60s" on-fail="restart" \
>>>>     op promote interval="0s" timeout="60s" on-fail="restart" \
>>>>     op demote interval="0s" timeout="60s" on-fail="stop" \
>>>>     op stop interval="0s" timeout="60s" on-fail="block" \
>>>>     op notify interval="0s" timeout="60s"
>>>> primitive vip-master ocf:heartbeat:IPaddr2 \
>>>>     params ip="172.28.200.159" nic="eth0" iflabel="CBC_VIP" cidr_netmask="24" \
>>>>     op start interval="0s" timeout="60s" on-fail="restart" \
>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>     op stop interval="0s" timeout="60s" on-fail="block" \
>>>>     meta target-role="Started"
>>>> primitive vip-rep ocf:heartbeat:IPaddr2 \
>>>>     params ip="172.16.0.5" nic="eth1" iflabel="REP_VIP" cidr_netmask="24" \
>>>>     meta migration-threshold="0" target-role="Started" \
>>>>     op start interval="0s" timeout="60s" on-fail="stop" \
>>>>     op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>     op stop interval="0s" timeout="60s" on-fail="restart"
>>>> group master-group vip-master vip-rep CBC_instance failover_MailTo
>>>> ms msPostgresql pgsql \
>>>>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>>>> colocation rsc_colocation-1 inf: master-group msPostgresql:Master
>>>> order rsc_order-1 0: msPostgresql:promote master-group:start symmetrical=false
>>>> order rsc_order-2 0: msPostgresql:demote master-group:stop symmetrical=false
>>>> property $id="cib-bootstrap-options" \
>>>>     dc-version="1.1.9-2db99f1" \
>>>>     cluster-infrastructure="classic openais (with plugin)" \
>>>>     expected-quorum-votes="2" \
>>>>     no-quorum-policy="ignore" \
>>>>     stonith-enabled="false" \
>>>>     cluster-recheck-interval="1min" \
>>>>     crmd-transition-delay="0s" \
>>>>     last-lrm-refresh="1426485983"
>>>> rsc_defaults $id="rsc-options" \
>>>>     resource-stickiness="INFINITY" \
>>>>     migration-threshold="1"
>>>> #vim:set syntax=pcmk
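>>>>
>>>> (After any edit, the live configuration can be sanity-checked with
>>>> crm_verify, which ships with Pacemaker:
>>>>
>>>> crm_verify -LV
>>>>
>>>> where -L reads the live CIB and -V raises the verbosity so warnings
>>>> are printed.)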
>>>>
>>>> Any ideas please, I'm lost......
>
> --
> NTT Open Source Software Center
> Kazutomo NAKAHIRA
> TEL: 03-5860-5135 FAX: 03-5463-6490
> Mail: nakahira_kazutomo_b1 at lab.ntt.co.jp
>