[Pacemaker] pgsql RA - slave is in HS:ASYNC status and won't promote

Fri Jan 17 14:45:46 EST 2014

Hi!
Thanks for the help. Anyway, my slave node is still async; even the select
you mentioned shows async. At least I found out that if I set rep_mode
to "async", the slave node gets promoted when the master fails.
So right now it is working, but I would still like to know how to make
streaming replication synchronous. I did everything as described on the
wiki page you mentioned, but it is still async.
Any idea?
Thanks
Tomas
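
For reference, the check described in the quoted reply below can be mirrored
in a few lines. This is only a minimal sketch in Python, not the pgsql RA's
actual code: it turns rows shaped like the output of that
pg_stat_replication query into the "STATE|SYNC_STATE" strings that show up
in pgsql-data-status. The function name summarize_replication and the sample
row are hypothetical, and it assumes each standby's application_name equals
its cluster node name.

```python
# Illustrative sketch only; this is NOT the pgsql RA's implementation.
# Each row mimics one line of output from:
#   select application_name, upper(state), upper(sync_state)
#   from pg_stat_replication;


def summarize_replication(rows, standby_names):
    """Map each standby name to 'STATE|SYNC_STATE', or None if not connected.

    rows: iterable of (application_name, state, sync_state) tuples.
    standby_names: node names expected to be replicating (assumed to match
    application_name, which is an assumption of this sketch).
    """
    reported = {name: f"{state}|{sync_state}" for name, state, sync_state in rows}
    # A standby missing from the view gets None (hypothetical placeholder).
    return {name: reported.get(name) for name in standby_names}


if __name__ == "__main__":
    # Sample row matching the situation described in this thread.
    sample_rows = [("jboss-test2", "STREAMING", "ASYNC")]
    print(summarize_replication(sample_rows, ["jboss-test2"]))
```

As far as I know, PostgreSQL reports sync_state as "sync" only for a standby
whose application_name matches the primary's synchronous_standby_names
setting, so running "SHOW synchronous_standby_names;" on the primary may be
a useful first check when the row keeps coming back as ASYNC.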

2014/1/14 東一彦 <higashi.kazuhiko at lab.ntt.co.jp>

> Hi,
>
>
> > But after some tests something went wrong, and I don't know what, why,
> > or how to get it working again. Now when I start crm, the master is PRI,
> > but the slave goes into the HS:ASYNC state, and when the master fails,
> > the slave goes into the HS:alone state.
>
> It is PostgreSQL that decides whether a node is "sync" or "async".
> The pgsql RA displays the result of the following SQL:
>
>   select application_name, upper(state), upper(sync_state) from pg_stat_replication;
>
> So, first, please check PostgreSQL's log.
>
> The data may have become inconsistent.
> You can resolve the inconsistency with the following procedure:
>
>  http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster#after_fail-over
>
>
> Regards,
> Kazuhiko HIGASHI
>
>
> (2014/01/10 17:48), Tomáš Vajrauch wrote:
>
>> Hi,
>>
>> I am trying to run a PostgreSQL cluster with streaming replication using
>> the pgsql RA and Pacemaker.
>> I succeeded once: the master was PRI, the slave was HS:sync, and failover
>> worked as it should (the slave became master).
>> But after some tests something went wrong, and I don't know what, why,
>> or how to get it working again. Now when I start crm, the master is PRI,
>> but the slave goes into the HS:ASYNC state, and when the master fails,
>> the slave goes into the HS:alone state.
>>
>> Can somebody please give me a hint about what I should do or what I
>> should look for?
>>
>> Thanks a lot for any help
>> Tomas
>>
>> my configuration:
>>
>> node jboss-test \
>>          attributes pgsql-data-status="LATEST"
>> node jboss-test2 \
>>          attributes pgsql-data-status="STREAMING|ASYNC"
>> primitive pgsql ocf:heartbeat:pgsql \
>>          params pgctl="/opt/postgres/9.3/bin/pg_ctl" \
>>                 psql="/opt/postgres/9.3/bin/psql" \
>>                 pgdata="/opt/postgres/9.3/data/" \
>>                 rep_mode="sync" node_list="jboss-test jboss-test2" \
>>                 restore_command="cp /opt/postgres/9.3/data/pg_archive/%f %p" \
>>                 primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
>>                 master_ip="172.16.111.120" stop_escalate="0" \
>>          op start interval="0s" timeout="60s" on-fail="restart" \
>>          op stop interval="0s" timeout="60s" on-fail="block" \
>>          op monitor interval="11s" timeout="60s" on-fail="restart" \
>>          op monitor interval="10s" role="Master" timeout="60s" on-fail="restart" \
>>          op promote interval="0s" timeout="60s" on-fail="restart" \
>>          op demote interval="0s" timeout="60s" on-fail="block" \
>>          op notify interval="0s" timeout="60s"
>> primitive pingCheck ocf:pacemaker:ping \
>>          params name="default_ping_set" host_list="172.16.0.1" multiplier="100" \
>>          op start interval="0s" timeout="60s" on-fail="restart" \
>>          op monitor interval="2s" timeout="60s" on-fail="restart" \
>>          op stop interval="0s" timeout="60s" on-fail="ignore"
>> primitive vip-master ocf:heartbeat:IPaddr2 \
>>          params ip="172.16.111.110" nic="eth0" cidr_netmask="24" \
>>          op start interval="0s" timeout="60s" on-fail="restart" \
>>          op monitor interval="10s" timeout="60s" on-fail="restart" \
>>          op stop interval="0s" timeout="60s" on-fail="block"
>> primitive vip-rep ocf:heartbeat:IPaddr2 \
>>          params ip="172.16.111.120" nic="eth0" cidr_netmask="24" \
>>          meta migration-threshold="0" \
>>          op start interval="0s" timeout="60s" on-fail="stop" \
>>          op monitor interval="10s" timeout="60s" on-fail="restart" \
>>          op stop interval="0s" timeout="60s" on-fail="block"
>> primitive vip-slave ocf:heartbeat:IPaddr2 \
>>          params ip="172.16.111.111" nic="eth0" cidr_netmask="24" \
>>          meta resource-stickiness="1" \
>>          op start interval="0s" timeout="60s" on-fail="restart" \
>>          op monitor interval="10s" timeout="60s" on-fail="restart" \
>>          op stop interval="0s" timeout="60s" on-fail="block"
>> group master-group vip-master vip-rep \
>>          meta ordered="false"
>> ms msPostgresql pgsql \
>>          meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> clone clnPingCheck pingCheck
>> location rsc_location-1 vip-slave \
>>          rule $id="rsc_location-1-rule" 200: pgsql-status eq HS:sync \
>>          rule $id="rsc_location-1-rule-0" 190: pgsql-status eq HS:async \
>>          rule $id="rsc_location-1-rule-1" 100: pgsql-status eq PRI \
>>          rule $id="rsc_location-1-rule-2" -inf: not_defined pgsql-status \
>>          rule $id="rsc_location-1-rule-3" -inf: pgsql-status ne HS:sync and pgsql-status ne PRI and pgsql-status ne HS:async
>> location rsc_location-2 msPostgresql \
>>          rule $id="rsc_location-3-rule" -inf: not_defined default_ping_set or default_ping_set lt 100
>> colocation rsc_colocation-1 inf: msPostgresql clnPingCheck
>> colocation rsc_colocation-2 inf: master-group msPostgresql:Master
>> order rsc_order-1 0: clnPingCheck msPostgresql
>> order rsc_order-2 0: msPostgresql:promote master-group:start symmetrical=false
>> order rsc_order-3 0: msPostgresql:demote master-group:stop symmetrical=false
>> property $id="cib-bootstrap-options" \
>>          no-quorum-policy="ignore" \
>>          stonith-enabled="false" \
>>          crmd-transition-delay="0s" \
>>          dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>>          cluster-infrastructure="openais" \
>>          expected-quorum-votes="2" \
>>          last-lrm-refresh="1389301940"
>> rsc_defaults $id="rsc-options" \
>>          resource-stickiness="INFINITY" \
>>          migration-threshold="1"
>>
>> crm_mon -Afr:
>> ============
>> Last updated: Fri Jan 10 09:46:29 2014
>> Last change: Fri Jan 10 09:46:29 2014 by root via crm_attribute on jboss-test
>> Stack: openais
>> Current DC: jboss-test - partition with quorum
>> Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
>> 2 Nodes configured, 2 expected votes
>> 7 Resources configured.
>> ============
>>
>> Online: [ jboss-test jboss-test2 ]
>>
>> Full list of resources:
>>
>>   Clone Set: clnPingCheck [pingCheck]
>>       Started: [ jboss-test jboss-test2 ]
>>   Master/Slave Set: msPostgresql [pgsql]
>>       Masters: [ jboss-test ]
>>       Slaves: [ jboss-test2 ]
>> vip-slave       (ocf::heartbeat:IPaddr2):       Started jboss-test2
>>   Resource Group: master-group
>>       vip-master (ocf::heartbeat:IPaddr2):       Started jboss-test
>>       vip-rep    (ocf::heartbeat:IPaddr2):       Started jboss-test
>>
>> Node Attributes:
>> * Node jboss-test:
>>      + default_ping_set                  : 100
>>      + master-pgsql:0                    : 1000
>>      + pgsql-data-status                 : LATEST
>>      + pgsql-master-baseline             : 0000000039004DF0
>>      + pgsql-status                      : PRI
>> * Node jboss-test2:
>>      + default_ping_set                  : 100
>>      + master-pgsql:1                    : -INFINITY
>>      + pgsql-data-status                 : STREAMING|ASYNC
>>      + pgsql-status                      : HS:async
>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>
> --
> ----------------------------------------------------
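>  Kazuhiko Higashi (東 一彦)
>   NTT OSS Center, Platform Technology Unit, High Reliability Team
>   (Service Innovation Laboratory Group, Software Innovation Center, OSS Promotion Project)
>  Mail:higashi.kazuhiko at lab.ntt.co.jp
>  Tel :03-5860-5135
>  NTT Shinagawa TWINS Building 11F, 1-9-1 Konan, Minato-ku, Tokyo 108-8019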
> ----------------------------------------------------
>