[Pacemaker] Postresql streaming replication support

Keith Ouellette Keith.Ouellette at Airgas.com
Fri Jan 11 14:40:41 EST 2013


Thank you again. It looks like it is trying now, but when I use the virtual IP I get the following in the postgres log (../pg_log/):

cp: cannot stat '/opt/PostgreSQL/9.2/data/archive/000000010000000100000021': No such file or directory
cp: cannot stat '/opt/PostgreSQL/9.2/data/archive/000000010000000100000021': No such file or directory
cp: cannot stat '/opt/PostgreSQL/9.2/data/archive/00000002.history': No such file or directory
2013-01-11 10:22:06 EST FATAL:  could not connect to the primary server: could not connect to server: No route to host
                Is the server running on host "172.16.0.110" and accepting
                TCP/IP connections on port 5432?

If I put the IP of one fo the servers (former master before pacemaker), I get the following:

cp: cannot stat '/opt/PostgreSQL/9.2/data/archive/000000010000000100000021': No such file or directory
cp: cannot stat '/opt/PostgreSQL/9.2/data/archive/000000010000000100000021': No such file or directory
cp: cannot stat '/opt/PostgreSQL/9.2/data/archive/00000002.history': No such file or directory
2013-01-11 10:22:06 EST FATAL:  could not connect to the primary server: could not connect to server: No route to host
                Is the server running on host "172.16.0.110" and accepting
                TCP/IP connections on port 5432?

So I gues there are two issues, one is it appears that postgres will not come up unless the IP is in place?? Second is where do you define the user=postgres and password=<password>. Shold that be in the primary_conninfo_opt along with the keepalives? 

Thanks again for your help. it is greatly appreciated.

Keith
________________________________________
From: Takatoshi MATSUO [matsuo.tak at gmail.com]
Sent: Thursday, January 10, 2013 7:20 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Postresql streaming replication support

Hi Keith

> So the master-ip is the virtual IP?

Yes.
If Master is switched, static IP is changed.
So it needs virtual IP or multiple instance_attributes.
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_using_rules_to_control_resource_options.html

Regards,
Takatoshi MATSUO

2013/1/10 Keith Ouellette <Keith.Ouellette at airgas.com>:
> Thank you again, So the master-ip is the virtual IP?  I was thinking that was the IP address of the master database server. Thank you for this. I am going to try this and see if I can get it to function.
>
> Thanks,
> Keith
>
>
> ________________________________________
> From: Takatoshi MATSUO [matsuo.tak at gmail.com]
> Sent: Wednesday, January 09, 2013 11:50 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Postresql streaming replication support
>
> Hi Keith
>
> Do you want to start "ClusterIP" on Slave ?
> It seems that ClusterIP is based on vip-slave of my sample configuration.
>
> If you would like to start it on Master, you need to customize "vip-master".
> And pgsql's master-ip parameter needs the IP of vip-master which
> combines with vip-rep.
>
> like this
> ----------------------------------------------------
> property \
>     no-quorum-policy="ignore" \
>     stonith-enabled="false" \
>     crmd-transition-delay="0s"
>
> rsc_defaults \
>     resource-stickiness="INFINITY" \
>     migration-threshold="1"
>
> ms msPostgresql pgsql \
>     meta \
>         master-max="1" \
>         master-node-max="1" \
>         clone-max="2" \
>         clone-node-max="1" \
>         notify="true"
>
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>     params \
>         ip="172.16.0.110" \
>         cidr_netmask="24" \
>     op start   timeout="60s" interval="0s"  on-fail="stop" \
>     op monitor timeout="60s" interval="10s" on-fail="restart" \
>     op stop    timeout="60s" interval="0s"  on-fail="block"
>
> primitive pgsql ocf:heartbeat:pgsql \
>     params \
>         pgctl="/opt/PostgreSQL/9.2/bin/pg_ctl" \
>         psql="/opt/PostgreSQL/9.2/bin/psql" \
>         pgdata="/opt/PostgreSQL/9.2/data/" \
>         start_opt="-p 5432" \
>         rep_mode="sync" \
>         node_list="test-db1 test-db2" \
>         restore_command="cp /opt/PostgreSQL/9.2/data/archive/%f %p" \
>         primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5
> keepalives_count=5" \
>         master_ip="172.16.0.110" \
>         stop_escalate="0" \
>     op start   timeout="60s" interval="0s"  on-fail="restart" \
>     op monitor timeout="60s" interval="11s" on-fail="restart" \
>     op monitor timeout="60s" interval="10s"  on-fail="restart" role="Master" \
>     op promote timeout="60s" interval="0s"  on-fail="restart" \
>     op demote  timeout="60s" interval="0s"  on-fail="block" \
>     op stop    timeout="60s" interval="0s"  on-fail="block" \
>     op notify  timeout="60s" interval="0s"
>
> colocation rsc_colocation-1 inf: ClusterIP        msPostgresql:Master
>
> order rsc_order-1 0: msPostgresql:promote  ClusterIP:start   symmetrical=false
> order rsc_order-2 0: msPostgresql:demote   ClusterIP:stop    symmetrical=false
> ----------------------------------------------------
>
> Please see this to operate it.
> https://github.com/t-matsuo/resource-agents/wiki/Operation-examples-for-none-shared-wal-archives-environment
>
> --
> Takatoshi
>
> 2013/1/10 Keith Ouellette <Keith.Ouellette at airgas.com>:
>> Thank you. I wish I could have seen the presentation. In my case, the two nodes are in two different US cities connected by a bridged VPN. I do not have the ability to have separate connection for pacemaker, replication and access.
>>
>>
>>
>> I tried to scale the configuration back to support one VIP, Here is the XML output of what I tried, but the service does not even come up.
>>
>> <resources>
>>       <primitive class="ocf" id="ClusterIP" provider="heartbeat" type="IPaddr2">
>>         <meta_attributes id="ClusterIP-meta_attributes">
>>           <nvpair id="ClusterIP-meta_attributes-target-role" name="target-role" value="Started"/>
>>         </meta_attributes>
>>         <operations id="ClusterIP-operations">
>>           <op id="ClusterIP-op-monitor-10s" interval="10s" name="monitor" timeout="20s"/>
>>         </operations>
>>         <instance_attributes id="ClusterIP-instance_attributes">
>>           <nvpair id="ClusterIP-instance_attributes-ip" name="ip" value="172.16.0.110"/>
>>           <nvpair id="ClusterIP-instance_attributes-cidr_netmask" name="cidr_netmask" value="24"/>
>>         </instance_attributes>
>>       </primitive>
>>       <primitive class="ocf" id="pgsql" provider="heartbeat" type="pgsql">
>>         <meta_attributes id="pgsql-meta_attributes">
>>           <nvpair id="pgsql-meta_attributes-target-role" name="target-role" value="Started"/>
>>         </meta_attributes>
>>         <operations id="pgsql-operations">
>>           <op id="pgsql-op-start-0" interval="0" name="start" on-fail="restart" timeout="120"/>
>>           <op id="pgsql-op-monitor-7" interval="7" name="monitor" on-fail="restart" timeout="60"/>
>>           <op id="pgsql-op-monitor-Master-60" interval="60" name="monitor" on-fail="restart" role="Master" timeout="30"/>
>>           <op id="pgsql-op-promote-0" interval="0" name="promote" on-fail="restart" timeout="60"/>
>>           <op id="pgsql-op-stop-0" interval="0" name="stop" on-fail="block" timeout="60"/>
>>           <op id="pgsql-op-demote-0" interval="0" name="demote" on-fail="block" timeout="60"/>
>>           <op id="pgsql-op-notify-0" interval="0" name="notify" timeout="60"/>
>>         </operations>
>>         <instance_attributes id="pgsql-instance_attributes">
>>           <nvpair id="pgsql-instance_attributes-pgctl" name="pgctl" value="/opt/PostgreSQL/9.2/bin/pg_ctl"/>
>>           <nvpair id="pgsql-instance_attributes-psql" name="psql" value="/opt/PostgreSQL/9.2/bin/psql"/>
>>           <nvpair id="pgsql-instance_attributes-pgdata" name="pgdata" value="/opt/PostgreSQL/9.2/data/"/>
>>           <nvpair id="pgsql-instance_attributes-start_opt" name="start_opt" value="-p 5432"/>
>>           <nvpair id="pgsql-instance_attributes-rep_mode" name="rep_mode" value="sync"/>
>>           <nvpair id="pgsql-instance_attributes-node_list" name="node_list" value="test-db1 test-db2"/>
>>           <nvpair id="pgsql-instance_attributes-restore_command" name="restore_command" value="cp /opt/PostgreSQL/9.2/data/archive/%f %p"/>
>>           <nvpair id="pgsql-instance_attributes-primary_conninfo_opt" name="primary_conninfo_opt" value="keepalives_idle=60 keepalives_interval=5 keepalives_count=5"/>
>>           <nvpair id="pgsql-instance_attributes-master_ip" name="master_ip" value="172.16.0.111"/>
>>           <nvpair id="pgsql-instance_attributes-stop_escalate" name="stop_escalate" value="0"/>
>>         </instance_attributes>
>>       </primitive>
>>     </resources>
>> <constraints>
>>       <rsc_location id="rsc-location-1" rsc="ClusterIP">
>>         <rule id="rsc-location-1-rule" score="200">
>>           <expression attribute="pgsql-status" id="rsc-location-1-rule-pgsql-status" operation="eq" value="HS:sync"/>
>>         </rule>
>>         <rule id="rsc-location-1-rule-0" score="100">
>>           <expression attribute="pgsql-status" id="rsc-location-1-rule-0-pgsql-status" operation="eq" value="PRI"/>
>>         </rule>
>>         <rule id="rsc-location-1-rule-1" score="-INFINITY">
>>           <expression attribute="pgsql-status" id="rsc-location-1-rule-1-pgsql-status" operation="not_defined"/>
>>         </rule>
>>         <rule boolean-op="and" id="rsc-location-1-rule-2" score="-INFINITY">
>>           <expression attribute="pgsql-status" id="rsc-location-1-rule-2-pgsql-status" operation="ne" value="HS:sync"/>
>>           <expression attribute="pgsql-status" id="rsc-location-1-rule-2-pgsql-status-0" operation="ne" value="PRI"/>
>>         </rule>
>>       </rsc_location>
>>       <rsc_location id="rsc-location-2" rsc="pgsql">
>>         <rule id="rsc-location-2-rule" role="Master" score="200">
>>           <expression attribute="#uname" id="rsc-location-2-rule-.uname" operation="eq" value="test-db1"/>
>>         </rule>
>>         <rule id="rsc-location-2-rule-0" role="Master" score="100">
>>           <expression attribute="#uname" id="rsc-location-2-rule-0-.uname" operation="eq" value="test-db2"/>
>>         </rule>
>>         <rule id="rsc-location-2-rule-1" role="Master" score="-INFINITY">
>>           <expression attribute="fail-count-ClusterIP" id="rsc-location-2-rule-1-fail-count-ClusterIP" operation="defined"/>
>>         </rule>
>>         <rule boolean-op="or" id="rsc-location-2-rule-2" score="-INFINITY">
>>           <expression attribute="default_ping_set" id="rsc-location-2-rule-2-default_ping_set" operation="not_defined"/>
>>           <expression attribute="default_ping_set" id="rsc-location-2-rule-2-default_ping_set-0" operation="lt" value="100"/>
>>         </rule>
>>       </rsc_location>
>>       <rsc_colocation id="rsc_colo-1" rsc="pgsql" score="INFINITY" with-rsc="ClusterIP"/>
>>       <rsc_order first="ClusterIP" first-action="start" id="rsc_order_1" symmetrical="false" then="pgsql" then-action="promote"/>
>>     </constraints>
>> Any ideas?
>>
>> Thanks,
>> Keith
>>
>>
>> ________________________________________
>> From: Takatoshi MATSUO [matsuo.tak at gmail.com]
>> Sent: Wednesday, January 09, 2013 8:23 PM
>> To: The Pacemaker cluster resource manager
>> Subject: Re: [Pacemaker] Postresql streaming replication support
>>
>> Hi Keith
>>
>> I don't write detailed documentation.
>> But I spoke about it using this slide.
>>
>>    http://linux-ha.sourceforge.jp/wp/archives/3404
>>
>> Did you read it?
>>
>> I translated it into English.
>> Please click "English version is here : PG-study" to download it.
>> (Sorry for bad translation)
>>
>>
>> Regards,
>> Takatoshi MATSUO
>>
>>
>> 2013/1/10 Keith Ouellette <Keith.Ouellette at airgas.com>
>>>
>>> Jesse,
>>>
>>>
>>>
>>>
>>>
>>>    Thank you for the quick response. Actually yes, the example given at the link below shows several connections between the two (cross-overs).  We only have one connection between the two nodes with a bridged VPN.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> The example has an S-LAN, D-LAN and an IC-LAN. I only have one LAN available to me with a bridge VPN between the two routers. I tried to skinny the configuration to a single virtual IP primitive called "ClusterIP", but I think there are some other dependancies in the example that I am not seeing as it did not work, meaning the resources did not start.  Is there another place I can go that may provide more detailed documentation that may help me with my configuration?
>>>
>>>
>>>
>>>
>>>
>>> Keith
>>>
>>> ________________________________
>>> From: Jesse Hathaway [jesse.hathaway at getbraintree.com]
>>> Sent: Wednesday, January 09, 2013 2:09 PM
>>> To: The Pacemaker cluster resource manager; Jesse Hathaway
>>> Subject: Re: [Pacemaker] Postresql streaming replication support
>>>
>>> On Wed, Jan 9, 2013 at 12:56 PM, Keith Ouellette <Keith.Ouellette at airgas.com> wrote:
>>>>
>>>>   Now I am trying to get that integrated with pacemaker. In my searches, I see that pacemaker with DRBD seems to be the prevelant implementation an most documentation was written around that. However, due to our network conditions, I am looking for information on doing this with WAL. I did find a resource agent that was wirtten for what I am doing (https://github.com/t-matsuo/resource-agents/wiki/Resource-Agent-for-PostgreSQL-9.1-streaming-replication). The issue is the example if complex with the explanation written only toward that example. Is there another place that I can get documentation  on this resource agent? Is there another way that this could be accmoplished that someone has tried?
>>>
>>> Keith we are using the above resource agent in production with streaming replication. Do you have any specific questions about its usage?
>>>
>>> --
>>> Jesse Hathaway, Systems Engineer
>>> Braintree
>>> 917-418-8423
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



More information about the Pacemaker mailing list