[ClusterLabs] Fwd: Postgres pacemaker cluster failure
Jehan-Guillaume de Rorthais
jgdr at dalibo.com
Wed Jul 10 05:42:20 EDT 2019
On Tue, 9 Jul 2019 19:57:06 +0300
Andrei Borzenkov <arvidjaar at gmail.com> wrote:
> 09.07.2019 13:08, Danka Ivanović wrote:
> > Hi, I didn't manage to start the master with postgres, even though I
> > increased the start timeout. I checked the executable paths and start options.
We would need many more logs from this failure...
> > When the cluster is running with a manually started master and a slave
> > started by pacemaker, everything works OK.
Logs from this scenario might be interesting as well to check and compare.
> > Today we had a failover again.
> > I cannot find the reason in the logs, can you help me with debugging? Thanks.
logs logs logs please.
> > Jul 09 09:16:32 postgres1 lrmd: debug: child_kill_helper: Kill pid 12735's group
> > Jul 09 09:16:34 postgres1 lrmd: warning: child_timeout_callback: PGSQL_monitor_15000 process (PID 12735) timed out
> You probably want to enable debug output in the resource agent. As far as I
> can tell, this requires HA_debug=1 in the environment of the resource agent,
> but for the life of me I cannot find where it is possible to set it.
> Probably setting it directly in the resource agent for debugging is the
> simplest way.
I usually set this in "/etc/sysconfig/pacemaker". Never tried to add it
to pgsqlms, interesting.
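
A minimal sketch of what such a setting could look like, assuming the switch
meant here is Pacemaker's PCMK_debug option. The thread itself only names
HA_debug, so whether this setting reaches the resource agent as HA_debug is
an assumption to verify on your Pacemaker version:

```shell
# /etc/sysconfig/pacemaker
# (Debian/Ubuntu packages use /etc/default/pacemaker instead)
#
# Assumption: Pacemaker's own debug switch is what is meant above;
# the thread only mentions HA_debug for the resource agent.
PCMK_debug=yes
```

This file is read when Pacemaker starts, so the change only takes effect
after Pacemaker is restarted on that node.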
> P.S. crm_resource is called by the resource agent (pgsqlms). And it shows
> the result of the original resource probing, which makes it confusing. At
> least it explains where these log entries come from.
Not sure I understand what you mean :/