[ClusterLabs] Pacemaker: pgsql

Ken Gaillot kgaillot at redhat.com
Fri Sep 27 13:14:09 EDT 2019


On Fri, 2019-09-27 at 19:03 +0530, Shital A wrote:
> 
> 
> On Tue, 24 Sep 2019, 22:20 Shital A, <brightuser2019 at gmail.com>
> wrote:
> > Hello,
> > 
> > We have set up an active-passive cluster using streaming replication
> > on RHEL 7.5. We are testing Pacemaker for automated failover.
> > We are seeing the issues below with this setup:
> > 
> > 1. When a failover is triggered by killing the primary (killall -9
> > postgres) while data is being added to it, the standby doesn't come
> > up in sync.
> > In Pacemaker, crm_mon -Afr shows the standby in the disconnected and
> > HS:alone state.
> > 
> > In the PostgreSQL log, we see the errors below:
> > 
> > < 2019-09-20 17:07:46.266 IST > LOG:  entering standby mode
> > < 2019-09-20 17:07:46.267 IST > LOG:  database system was not
> > properly shut down; automatic recovery in progress
> > < 2019-09-20 17:07:46.270 IST > LOG:  redo starts at 1/680A2188
> > < 2019-09-20 17:07:46.370 IST > LOG:  consistent recovery state
> > reached at 1/6879D9F8
> > < 2019-09-20 17:07:46.370 IST > LOG:  database system is ready to
> > accept read only connections
> > cp: cannot stat
> > '/var/lib/pgsql/9.6/data/archivedir/000000010000000100000068': No
> > such file or directory
> > < 2019-09-20 17:07:46.751 IST > LOG:  statement: select
> > pg_is_in_recovery()
> > < 2019-09-20 17:07:46.782 IST > LOG:  statement: show
> > synchronous_standby_names
> > < 2019-09-20 17:07:50.993 IST > LOG:  statement: select
> > pg_is_in_recovery()
> > < 2019-09-20 17:07:53.395 IST > LOG:  started streaming WAL from
> > primary at 1/68000000 on timeline 1
> > < 2019-09-20 17:07:53.436 IST > LOG:  invalid contrecord length
> > 2662 at 1/6879D9F8
> > < 2019-09-20 17:07:53.438 IST > FATAL:  terminating walreceiver
> > process due to administrator command
> > cp: cannot stat
> > '/var/lib/pgsql/9.6/data/archivedir/00000002.history': No such file
> > or directory
> > cp: cannot stat
> > '/var/lib/pgsql/9.6/data/archivedir/000000010000000100000068': No
> > such file or directory
> > 
> > When we restart PostgreSQL on the standby using pg_ctl restart, the
> > standby starts syncing.
> > 
> > 
> > 2. After the standby syncs via pg_ctl restart as mentioned above, we
> > found that 1-2 records are missing on the standby.
> > 
> > We need help understanding:
> > 1. Why does the standby start in the disconnected, HS:alone state?
> > 
> > If you have faced this issue or know the cause, please let us know.
> > 
> > Thanks.
> 
> 
> Hello,
> 
> I didn't receive any reply on this issue. I am wondering whether there
> are no opinions, or whether Pacemaker with pgsql is not recommended.
> 
> 
> Thanks! 

Hi,

There are quite a few pacemaker+pgsql users active on this list, but
they may not have time to respond at the moment. Most are using the PAF
agent rather than the pgsql agent (see 
https://github.com/ClusterLabs/PAF ).
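
As a side note on the quoted log: the file that cp fails to find in
archivedir appears to be the WAL segment containing the LSNs being
replayed. A minimal sketch, assuming the default 16 MB WAL segment size
and timeline 1 (the function name wal_segment_name is hypothetical, used
only for illustration), maps an LSN from the log to the corresponding
WAL file name:

```python
# Assumption: default PostgreSQL WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(lsn: str, timeline: int = 1,
                     seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the WAL segment file name that contains the given LSN.

    The LSN is given in the 'X/Y' hexadecimal form used in PostgreSQL logs.
    """
    hi_hex, lo_hex = lsn.split("/")
    hi = int(hi_hex, 16)   # high 32 bits of the 64-bit WAL position
    lo = int(lo_hex, 16)   # low 32 bits of the 64-bit WAL position

    # Number of segments that fit in one 4 GB "xlogid" range.
    segs_per_xlogid = 0x100000000 // seg_size
    segno = hi * segs_per_xlogid + lo // seg_size

    # File name layout: timeline, then segment number split in two parts,
    # each printed as 8 upper-case hex digits.
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


if __name__ == "__main__":
    # LSN taken from the "invalid contrecord length" line in the log.
    print(wal_segment_name("1/6879D9F8"))  # 000000010000000100000068
```

The result matches the file named in the "cp: cannot stat" messages, so
the standby is asking the archive for the same segment in which
streaming stopped.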
-- 
Ken Gaillot <kgaillot at redhat.com>


