[Pacemaker] Postgresql Replication

Eloy Coto Pereiro eloy.coto at gmail.com
Thu Sep 12 04:49:57 EDT 2013


Hi,

I have issues with this config. For example, on the master the corosync
service starts PostgreSQL with pg_ctl, but on the slave pg_ctl doesn't start
it and replication doesn't work.

This is my cluster status:


Online: [ master slave ]
OFFLINE: [ ]

Full list of resources:

ClusterIP (ocf::heartbeat:IPaddr2): Started master
KAMAILIO        (lsb:kamailio): Started master
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ master ]
     Stopped: [ pgsql:1 ]

Node Attributes:
* Node master:
    + maintenance                       : off
    + master-pgsql                      : 1000
    + pgsql-data-status                 : LATEST
    + pgsql-master-baseline             : 0000000019000080
    + pgsql-status                      : PRI
* Node slave:
    + pgsql-data-status                 : DISCONNECT
    + pgsql-status                      : HS:sync
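
As a side note, the node attributes above can be checked mechanically. The following is only a minimal sketch: it assumes the attribute block has been saved as text in exactly the layout shown above, and the names CRM_MON_ATTRS, parse_node_attributes and disconnected_nodes are made up for this illustration (they are not part of Pacemaker or the pgsql resource agent). It simply reports which nodes have pgsql-data-status set to DISCONNECT.

```python
import re

# Sample input copied from the status output above (assumed layout).
CRM_MON_ATTRS = """\
Node Attributes:
* Node master:
    + maintenance                       : off
    + master-pgsql                      : 1000
    + pgsql-data-status                 : LATEST
    + pgsql-master-baseline             : 0000000019000080
    + pgsql-status                      : PRI
* Node slave:
    + pgsql-data-status                 : DISCONNECT
    + pgsql-status                      : HS:sync
"""


def parse_node_attributes(text):
    """Return {node_name: {attribute: value}} from a 'Node Attributes' block.

    Hypothetical helper: it only understands the '* Node X:' and
    '+ name : value' line shapes seen in the output above.
    """
    nodes = {}
    current = None
    for line in text.splitlines():
        node_match = re.match(r"^\* Node (\S+):\s*$", line)
        if node_match:
            current = node_match.group(1)
            nodes[current] = {}
            continue
        attr_match = re.match(r"^\s+\+ (\S+)\s*:\s*(.*?)\s*$", line)
        if attr_match and current is not None:
            nodes[current][attr_match.group(1)] = attr_match.group(2)
    return nodes


def disconnected_nodes(nodes):
    """List the nodes whose pgsql-data-status attribute is DISCONNECT."""
    return sorted(
        name
        for name, attrs in nodes.items()
        if attrs.get("pgsql-data-status") == "DISCONNECT"
    )


if __name__ == "__main__":
    parsed = parse_node_attributes(CRM_MON_ATTRS)
    for name in disconnected_nodes(parsed):
        print("node with DISCONNECT data status:", name)
```

With the status shown above, the only node flagged is the slave, which matches the symptom: the resource is stopped there and its data status is DISCONNECT.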


My crm configure show output is this:
node master \
attributes maintenance="off" pgsql-data-status="LATEST"
node slave \
attributes pgsql-data-status="DISCONNECT"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="10.1.1.1" cidr_netmask="24" \
op monitor interval="15s" \
op start timeout="60s" interval="0s" on-fail="stop" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="block"
primitive KAMAILIO lsb:kamailio \
op monitor interval="10s" \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s" \
meta target-role="Started"
primitive pgsql ocf:heartbeat:pgsql \
params pgctl="/usr/pgsql-9.2/bin/pg_ctl" psql="/usr/pgsql-9.2/bin/psql" \
pgdata="/var/lib/pgsql/9.2/data/" rep_mode="sync" node_list="master slave" \
restore_command="cp /var/lib/pgsql/9.2/pg_archive/%f %p" \
primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
master_ip="10.1.1.1" restart_on_promote="true" \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="4s" on-fail="restart" \
op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
op promote timeout="60s" interval="0s" on-fail="restart" \
op demote timeout="60s" interval="0s" on-fail="stop" \
op stop timeout="60s" interval="0s" on-fail="block" \
op notify timeout="60s" interval="0s"
ms msPostgresql pgsql \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" \
notify="true" target-role="Started"
location cli-prefer-KAMAILIO KAMAILIO \
rule $id="cli-prefer-rule-KAMAILIO" inf: #uname eq master
location cli-prefer-pgsql msPostgresql \
rule $id="cli-prefer-rule-pgsql" inf: #uname eq master
location cli-standby-ClusterIP ClusterIP \
rule $id="cli-standby-rule-ClusterIP" -inf: #uname eq slave
colocation colocation-1 inf: ClusterIP msPostgresql KAMAILIO
order order-1 inf: ClusterIP msPostgresql KAMAILIO
property $id="cib-bootstrap-options" \
dc-version="1.1.8-7.el6-394e906" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
stonith-enabled="false"

Any idea why it doesn't start on the second node (slave)?

More info:

Master:

[root@master ~]# netstat -putan | grep 5432 | grep LISTEN
tcp        0      0 0.0.0.0:5432                0.0.0.0:*                   LISTEN      3241/postgres
tcp        0      0 :::5432                     :::*                        LISTEN      3241/postgres
[root@master ~]# ps axu | grep postgres
postgres  3241  0.0  0.0  97072  7692 ?        S    11:41   0:00 /usr/pgsql-9.2/bin/postgres -D /var/lib/pgsql/9.2/data -c config_file=/var/lib/pgsql/9.2/data//postgresql.conf
postgres  3293  0.0  0.0  97072  1556 ?        Ss   11:41   0:00 postgres: checkpointer process
postgres  3294  0.0  0.0  97072  1600 ?        Ss   11:41   0:00 postgres: writer process
postgres  3295  0.0  0.0  97072  1516 ?        Ss   11:41   0:00 postgres: wal writer process
postgres  3296  0.0  0.0  97920  2760 ?        Ss   11:41   0:00 postgres: autovacuum launcher process
postgres  3297  0.0  0.0  82712  1500 ?        Ss   11:41   0:00 postgres: archiver process   failed on 000000010000000000000001
postgres  3298  0.0  0.0  82872  1568 ?        Ss   11:41   0:00 postgres: stats collector process
root     10901  0.0  0.0 103232   852 pts/0    S+   11:44   0:00 grep postgres


On slave:

[root@slave ~]# ps axu | grep postgre
root      3332  0.0  0.0 103232   856 pts/0    S+   11:45   0:00 grep postgre
[root@slave ~]# netstat -putan | grep 5432
[root@slave ~]#
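
For reference, the "is anything listening on 5432" check above can also be sketched in Python. This is only an illustration and not part of the cluster setup: the function name is_listening is made up here, and it works by attempting a TCP connection, which is not identical to what netstat reports (for example, a firewall could block the connection even though the socket is in LISTEN state).

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration: a refused or timed-out
    connection is treated as "nothing listening".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 5432 is the PostgreSQL port used in the netstat output above;
    # the loopback address is an assumption for a check run on the node itself.
    print("PostgreSQL port open:", is_listening("127.0.0.1", 5432))
```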


If I start PostgreSQL manually with pg_ctl (data directory
/var/lib/pgsql/9.2/data/), it works fine.

Any idea?


2013/9/11 Takatoshi MATSUO <matsuo.tak at gmail.com>

> Hi Eloy
>
> Please see http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster .
> The setup in that document uses a virtual IP to receive connections,
> so recovery.conf doesn't need to be changed.
>
> Thanks,
> Takatoshi MATSUO
>
>
> 2013/9/11 Eloy Coto Pereiro <eloy.coto at gmail.com>:
> > Hi,
> >
> > In PostgreSQL, if you use WAL streaming replication
> > <http://wiki.postgresql.org/wiki/Streaming_Replication>, you need to
> > change recovery.conf on the slave server when the master server fails.
> >
> > In this case, is there any tool that, when the master is down, executes
> > a command and gets this info?
> > Is this the right tool for PostgreSQL replication?
> >
> > Cheers
> >
> >
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> >