[ClusterLabs] Antw: [EXT] Postgres Cluster PAF problems

damiano giuliani damianogiuliani87 at gmail.com
Wed Jun 30 08:36:29 EDT 2021


Hi Guys,

Thanks for the support; I really hoped you were not on holiday yet!

The replication is async. Looking at the Postgres logs, it seems some
updates failed because no master was available.
I don't expect resource problems (I'm investigating anyway); the nodes have
200 GB of RAM, 80 CPUs and a lot of free disk space.

How would you suggest I find out why the monitor timed out?
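
As a starting point, I could pull the relevant events out of the cluster log
and compare them with system metrics from the same moments. Below is a minimal
sketch of that idea, not a tested diagnostic tool. The score message text is
taken from the log line quoted further down; the "timed out" keyword match,
the function name scan_log and the overall log layout are my assumptions and
may need adjusting to the real log format.

```python
import re

# Message text as quoted from the log in this thread; the rest of the
# line layout (timestamp, host, process) is not assumed.
SCORE_RE = re.compile(
    r'Update score of "(?P<node>[^"]+)" from (?P<old>-?\d+) to (?P<new>-?\d+)'
)
# Assumption: timeout lines contain the words "timed out" in some casing.
TIMEOUT_RE = re.compile(r"timed out", re.IGNORECASE)


def scan_log(lines):
    """Return (score_changes, timeout_lines) found in an iterable of log lines."""
    score_changes = []
    timeout_lines = []
    for line in lines:
        line = line.rstrip("\n")
        match = SCORE_RE.search(line)
        if match:
            # Keep the node name and the old/new scores as integers.
            score_changes.append(
                (match["node"], int(match["old"]), int(match["new"]))
            )
        if TIMEOUT_RE.search(line):
            # Keep the whole line so its timestamp can be compared with
            # disk, memory and network metrics from the same period.
            timeout_lines.append(line)
    return score_changes, timeout_lines
```

It can be fed an open log file, for example scan_log(open(path)), where path
is wherever the cluster log lives on the node.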

Many thanks for your support.

Pepe




On Wed, 30 Jun 2021 at 14:17, Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> >>> damiano giuliani <damianogiuliani87 at gmail.com> wrote on 30.06.2021
> at 13:44
> in message
> <CAG=zYNNe=azZaLEhe3JzKaHnSEv88Nr+yEo0m06hLjL4L11PCA at mail.gmail.com>:
> > Hi Guys,
> >
> > Sorry for bothering you. Unfortunately, I was called in for an issue with
> > a cluster I built months ago, which was fully functional until last
> > Saturday.
> >
> > It looks like some applications lost their connection to the master and
> > lost some updates/inserts.
> >
> > I found the cause in the logs: the psqld-monitor operation timed out after
> > 10000ms and the master resource was demoted, the instance stopped and then
> > promoted to master again, causing a few seconds of service disruption (no
> > master during the described process).
>
> Well, I think YOU have to find out why the monitor timed out. Maybe the
> disks being used were too busy, maybe the memory was tight, ...
> WE don't know.
>
> >
> > I noticed a recurring message:
> > Update score of "ltaoperdbsXX" from 990 to 1000 because of a change in the
> > replication lag
> > Could this be some kind of network lag?
> >
> > The network should be 10 Gb/s; both the Corosync and production traffic
> > run on it. Network bonding is configured on all of the nodes.
> > PAF version resource-agents-paf-2.3.0-1.rhel7.noarch
> > Postgres psql (13.1)
> > pacemaker-1.1.23-1.el7.x86_64
> > pcs-0.9.169-3.el7.centos.x86_64
> >
> > I attached the log, which could be useful to dig further.
> > If someone could point me in the right direction, I would really
> > appreciate it.
> >
> > Thanks for the support
> > Pepe