[ClusterLabs] Start Timeout.
Ken Gaillot
kgaillot at redhat.com
Wed Nov 14 10:35:59 EST 2018
On Wed, 2018-11-14 at 14:17 +0200, Michael Gaberkorn wrote:
> Hello.
>
>
> I installed an HA cluster with PostgreSQL 11 and a large amount of data
> (5-9 TB).
>
> After cloning the data to the slave node, I got:
>
> pcs status
> ===========
> Master/Slave Set: pgsql-ha [pgsqld]
> pgsqld (ocf::heartbeat:pgsqlms): FAILED cluster2
> (blocked)
> Masters: [ cluster1 ]
> pgsql-master-ip (ocf::heartbeat:IPaddr2): Started
> cluster1
>
> Failed Actions:
> * pgsqld_stop_0 on cluster2 'unknown error' (1): call=329,
> status=complete, exitreason='Unexpected state for instance "pgsqld"
> (returned 1)',
This isn't a timeout: the resource agent returned an error after
exec=301ms, well before the 120s stop timeout could expire. Also, it's
a stop failure, not a start failure.
I'm not familiar with the pgsql agents, so I can't help there. Hopefully
someone else can chime in.
> last-rc-change='Wed Nov 14 14:04:48 2018', queued=0ms, exec=301ms
> =========================
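To make the point about the failed action concrete, here is a minimal Python sketch that pulls the action name and execution time out of the entry quoted above and compares it with the configured stop timeout. The function name summarize and the regular expressions are my own illustration, not part of Pacemaker or pcs, and the entry is simply the quoted text joined onto one line.

```python
import re

# Failed-action entry as quoted in the post, joined onto one line.
FAILED_ACTION = (
    "* pgsqld_stop_0 on cluster2 'unknown error' (1): call=329, "
    "status=complete, exitreason='Unexpected state for instance \"pgsqld\" "
    "(returned 1)', last-rc-change='Wed Nov 14 14:04:48 2018', "
    "queued=0ms, exec=301ms"
)

# Stop timeout from the quoted <op name="stop" timeout="120s"/>, in ms.
STOP_TIMEOUT_MS = 120 * 1000


def summarize(line):
    """Extract resource, action, node, status and exec time from one entry.

    Hypothetical helper: it only understands the layout shown in the post.
    """
    # The operation key has the form <resource>_<action>_<interval>.
    key = re.search(r"\*\s+(\S+)_([a-z]+)_(\d+)\s+on\s+(\S+)", line)
    status = re.search(r"status=([^,]+),", line)
    exec_ms = re.search(r"exec=(\d+)ms", line)
    return {
        "resource": key.group(1),
        "action": key.group(2),
        "node": key.group(4),
        "status": status.group(1).strip(),
        "exec_ms": int(exec_ms.group(1)),
    }


if __name__ == "__main__":
    info = summarize(FAILED_ACTION)
    print(info["action"], info["status"], info["exec_ms"])
    # The agent returned long before the stop timeout could apply.
    print("ran past stop timeout:", info["exec_ms"] >= STOP_TIMEOUT_MS)
```

The action it reports is stop, and the 301 ms execution time is far below the 120 s stop timeout, so raising any timeout would not change this result.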
>
> It looks like Pacemaker doesn’t wait while Postgres does its start check.
>
> Which parameter controls the wait time for starting the service?
>
> I set it like this:
> =============
> <operations>
> <op id="pgsqld-demote-interval-0s" interval="0s"
> name="demote" timeout="120s"/>
> <op id="pgsqld-methods-interval-0s" interval="0s"
> name="methods" timeout="60"/>
> <op id="pgsqld-monitor-interval-15s" interval="15s"
> name="monitor" role="Master" timeout="10s"/>
> <op id="pgsqld-monitor-interval-16s" interval="16s"
> name="monitor" role="Slave" timeout="10s"/>
> <op id="pgsqld-notify-interval-0s" interval="0s"
> name="notify" timeout="60s"/>
> <op id="pgsqld-promote-interval-0s" interval="0s"
> name="promote" timeout="40s"/>
> <op id="pgsqld-reload-interval-0s" interval="0s"
> name="reload" timeout="20"/>
> <op id="pgsqld-start-interval-0s" interval="0s"
> name="start" on-fail="ignore" timeout="12000s"/>
The above is the correct place to set the start timeout.
> <op id="pgsqld-stop-interval-0s" interval="0s"
> name="stop" timeout="120s"/>
> </operations>
> ============================
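As a quick check of what the operations block above configures, here is a small Python sketch that parses the quoted XML and lists the timeout for each operation. The helper names timeout_seconds and op_timeouts are hypothetical, and I am assuming a bare number such as "60" means seconds, which as far as I know is Pacemaker's default unit.

```python
import xml.etree.ElementTree as ET

# The <operations> block from the post, with the mail line wrapping undone.
OPERATIONS_XML = """
<operations>
  <op id="pgsqld-demote-interval-0s" interval="0s" name="demote" timeout="120s"/>
  <op id="pgsqld-methods-interval-0s" interval="0s" name="methods" timeout="60"/>
  <op id="pgsqld-monitor-interval-15s" interval="15s" name="monitor" role="Master" timeout="10s"/>
  <op id="pgsqld-monitor-interval-16s" interval="16s" name="monitor" role="Slave" timeout="10s"/>
  <op id="pgsqld-notify-interval-0s" interval="0s" name="notify" timeout="60s"/>
  <op id="pgsqld-promote-interval-0s" interval="0s" name="promote" timeout="40s"/>
  <op id="pgsqld-reload-interval-0s" interval="0s" name="reload" timeout="20"/>
  <op id="pgsqld-start-interval-0s" interval="0s" name="start" on-fail="ignore" timeout="12000s"/>
  <op id="pgsqld-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
</operations>
"""


def timeout_seconds(value):
    """Convert "120s" or "60" to seconds.

    Only the two forms that appear in the post are handled; a bare number
    is assumed to be seconds.
    """
    return int(value[:-1]) if value.endswith("s") else int(value)


def op_timeouts(xml_text):
    """Map each operation (plus role, if set) to its timeout in seconds."""
    root = ET.fromstring(xml_text)
    result = {}
    for op in root.findall("op"):
        key = op.get("name")
        role = op.get("role")
        if role:
            key = f"{key}:{role}"
        result[key] = timeout_seconds(op.get("timeout"))
    return result


if __name__ == "__main__":
    for name, seconds in op_timeouts(OPERATIONS_XML).items():
        print(f"{name}: {seconds}s")
```

This shows the start timeout is already 12000 s, while the operation that actually failed, stop, has a 120 s timeout that was never reached.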
> but it had no effect :(
>
> Which parameter controls the wait time for starting the service?
>
> Thank you.
--
Ken Gaillot <kgaillot at redhat.com>