[ClusterLabs] Start Timeout.

Ken Gaillot kgaillot at redhat.com
Wed Nov 14 10:35:59 EST 2018


On Wed, 2018-11-14 at 14:17 +0200, Michael Gaberkorn wrote:
> Hello.
> 
> 
> I installed an HA cluster with PostgreSQL 11 holding a large amount of data
> (5-9 TB).
> 
> After cloning the data to the slave node, I got:
> 
> pcs status
> ===========
>  Master/Slave Set: pgsql-ha [pgsqld]
>      pgsqld     (ocf::heartbeat:pgsqlms):       FAILED cluster2 (blocked)
>      Masters: [ cluster1 ]
>  pgsql-master-ip        (ocf::heartbeat:IPaddr2):       Started cluster1
> 
> Failed Actions:
> * pgsqld_stop_0 on cluster2 'unknown error' (1): call=329,
> status=complete, exitreason='Unexpected state for instance "pgsqld"
> (returned 1)',

This isn't a timeout; the resource agent is returning an error before
the timeout has expired. Also, it's a stop failure, not a start failure.

I'm not familiar with the pgsql agents so I can't help there. Hopefully
someone else can chime in.
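
To make the distinction concrete, here is a rough Python sketch that pulls
the operation name and execution time out of the failed-action text quoted
in this message and compares the execution time with the configured stop
timeout of 120s. The function name describe_failure and the "exec time
reached the timeout" check are illustrative assumptions, not anything
Pacemaker or pcs provides.

```python
import re

# Failed-action text copied from the pcs status output quoted in this thread,
# rejoined onto one logical line.
FAILED_ACTION = (
    "pgsqld_stop_0 on cluster2 'unknown error' (1): call=329, "
    "status=complete, exitreason='Unexpected state for instance \"pgsqld\" "
    "(returned 1)', "
    "last-rc-change='Wed Nov 14 14:04:48 2018', queued=0ms, exec=301ms"
)


def describe_failure(text, timeout_ms):
    """Report which operation failed and whether it ran as long as its timeout.

    Hypothetical helper for illustration only.
    """
    # Operation keys have the form <resource>_<action>_<interval>.
    op = re.match(
        r"(?P<rsc>\S+)_(?P<action>[a-z]+)_(?P<interval>\d+) on (?P<node>\S+)",
        text,
    )
    exec_ms = int(re.search(r"exec=(\d+)ms", text).group(1))
    return {
        "resource": op["rsc"],
        "action": op["action"],
        "node": op["node"],
        "exec_ms": exec_ms,
        # Rough heuristic: a timed-out action runs for the whole timeout.
        "hit_timeout": exec_ms >= timeout_ms,
    }


# The stop operation in the quoted configuration has timeout="120s".
result = describe_failure(FAILED_ACTION, timeout_ms=120_000)
print(result)
```

With the quoted output, the action is "stop" and the agent returned after
301ms, far short of 120s, so raising the start timeout cannot change this
result.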

>     last-rc-change='Wed Nov 14 14:04:48 2018', queued=0ms, exec=301ms
> =========================
> 
> It looks like Pacemaker can’t wait while Postgres does its start check.
> 
> Which parameter controls the wait time for starting the service?
> 
> I set it like this:
> =============
>       <operations>
>             <op id="pgsqld-demote-interval-0s" interval="0s"
> name="demote" timeout="120s"/>
>             <op id="pgsqld-methods-interval-0s" interval="0s"
> name="methods" timeout="60"/>
>             <op id="pgsqld-monitor-interval-15s" interval="15s"
> name="monitor" role="Master" timeout="10s"/>
>             <op id="pgsqld-monitor-interval-16s" interval="16s"
> name="monitor" role="Slave" timeout="10s"/>
>             <op id="pgsqld-notify-interval-0s" interval="0s"
> name="notify" timeout="60s"/>
>             <op id="pgsqld-promote-interval-0s" interval="0s"
> name="promote" timeout="40s"/>
>             <op id="pgsqld-reload-interval-0s" interval="0s"
> name="reload" timeout="20"/>
>             <op id="pgsqld-start-interval-0s" interval="0s"
> name="start" on-fail="ignore" timeout="12000s"/>

The above is the correct place to set the start timeout.
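
As a quick way to double-check which timeout applies to which action, the
following Python sketch parses the <operations> block quoted in this message
(rejoined onto single lines) and lists the timeout per operation. The helper
name op_timeouts is hypothetical; it only reads the XML text and does not
talk to the cluster.

```python
import xml.etree.ElementTree as ET

# The <operations> block quoted in this thread, with wrapped lines rejoined.
OPERATIONS_XML = """
<operations>
  <op id="pgsqld-demote-interval-0s" interval="0s" name="demote" timeout="120s"/>
  <op id="pgsqld-methods-interval-0s" interval="0s" name="methods" timeout="60"/>
  <op id="pgsqld-monitor-interval-15s" interval="15s" name="monitor" role="Master" timeout="10s"/>
  <op id="pgsqld-monitor-interval-16s" interval="16s" name="monitor" role="Slave" timeout="10s"/>
  <op id="pgsqld-notify-interval-0s" interval="0s" name="notify" timeout="60s"/>
  <op id="pgsqld-promote-interval-0s" interval="0s" name="promote" timeout="40s"/>
  <op id="pgsqld-reload-interval-0s" interval="0s" name="reload" timeout="20"/>
  <op id="pgsqld-start-interval-0s" interval="0s" name="start" on-fail="ignore" timeout="12000s"/>
  <op id="pgsqld-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
</operations>
"""


def op_timeouts(xml_text):
    """Map (action name, role) to the configured timeout string.

    Hypothetical helper for illustration; role is None when not set.
    """
    root = ET.fromstring(xml_text.strip())
    return {
        (op.get("name"), op.get("role")): op.get("timeout")
        for op in root.iter("op")
    }


timeouts = op_timeouts(OPERATIONS_XML)
for (name, role), timeout in sorted(timeouts.items(), key=lambda kv: (kv[0][0], kv[0][1] or "")):
    print(f"{name:8} role={role or '-':7} timeout={timeout}")
```

This shows the start timeout is 12000s as intended, while the stop operation
that actually failed has its own, separate timeout of 120s.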

>             <op id="pgsqld-stop-interval-0s" interval="0s"
> name="stop" timeout="120s"/>
>           </operations>
> ============================
> but it had no effect :(
> 
> Which parameter controls the wait time for starting the service?
> 
> Thank you. 
-- 
Ken Gaillot <kgaillot at redhat.com>


