[ClusterLabs] Start Timeout.

Michael Gaberkorn michaelg at bd-innovations.com
Wed Nov 14 07:17:38 EST 2018


I installed an HA cluster with PostgreSQL 11 and a large amount of data (5-9 TB).

After cloning the data to the slave node, I got:

pcs status
 Master/Slave Set: pgsql-ha [pgsqld]
     pgsqld     (ocf::heartbeat:pgsqlms):       FAILED cluster2 (blocked)
     Masters: [ cluster1 ]
 pgsql-master-ip        (ocf::heartbeat:IPaddr2):       Started cluster1

Failed Actions:
* pgsqld_stop_0 on cluster2 'unknown error' (1): call=329, status=complete, exitreason='Unexpected state for instance "pgsqld" (returned 1)',
    last-rc-change='Wed Nov 14 14:04:48 2018', queued=0ms, exec=301ms

It looks like Pacemaker can't wait while Postgres does its startup check.

Which parameter controls how long Pacemaker waits for the service to start?

I set the operations like this:
            <op id="pgsqld-demote-interval-0s" interval="0s" name="demote" timeout="120s"/>
            <op id="pgsqld-methods-interval-0s" interval="0s" name="methods" timeout="60"/>
            <op id="pgsqld-monitor-interval-15s" interval="15s" name="monitor" role="Master" timeout="10s"/>
            <op id="pgsqld-monitor-interval-16s" interval="16s" name="monitor" role="Slave" timeout="10s"/>
            <op id="pgsqld-notify-interval-0s" interval="0s" name="notify" timeout="60s"/>
            <op id="pgsqld-promote-interval-0s" interval="0s" name="promote" timeout="40s"/>
            <op id="pgsqld-reload-interval-0s" interval="0s" name="reload" timeout="20"/>
            <op id="pgsqld-start-interval-0s" interval="0s" name="start" on-fail="ignore" timeout="12000s"/>
            <op id="pgsqld-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
but it had no effect :(
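
To double-check which timeouts are actually configured, a minimal Python sketch such as the following could list them from the operation definitions. The wrapping <operations> element and the helper names to_seconds and op_timeouts are illustrative assumptions, not part of any Pacemaker tooling, and the sketch assumes that a bare number such as "60" means seconds.

```python
import xml.etree.ElementTree as ET

# Operation definitions copied from the configuration above, wrapped in a
# hypothetical <operations> element so they parse as a single document.
OPS_XML = """
<operations>
  <op id="pgsqld-demote-interval-0s" interval="0s" name="demote" timeout="120s"/>
  <op id="pgsqld-methods-interval-0s" interval="0s" name="methods" timeout="60"/>
  <op id="pgsqld-monitor-interval-15s" interval="15s" name="monitor" role="Master" timeout="10s"/>
  <op id="pgsqld-monitor-interval-16s" interval="16s" name="monitor" role="Slave" timeout="10s"/>
  <op id="pgsqld-notify-interval-0s" interval="0s" name="notify" timeout="60s"/>
  <op id="pgsqld-promote-interval-0s" interval="0s" name="promote" timeout="40s"/>
  <op id="pgsqld-reload-interval-0s" interval="0s" name="reload" timeout="20"/>
  <op id="pgsqld-start-interval-0s" interval="0s" name="start" on-fail="ignore" timeout="12000s"/>
  <op id="pgsqld-stop-interval-0s" interval="0s" name="stop" timeout="120s"/>
</operations>
"""


def to_seconds(value):
    """Convert "120s" or "60" to an integer number of seconds.

    Only the two forms used in the configuration above are handled;
    anything else (for example "5min" or "300ms") raises ValueError.
    """
    text = value.strip()
    if text.endswith("s") and text[:-1].isdigit():
        text = text[:-1]
    if not text.isdigit():
        raise ValueError(f"unsupported time value: {value!r}")
    return int(text)


def op_timeouts(xml_text):
    """Return a dict mapping each operation (plus role, if set) to its timeout in seconds."""
    root = ET.fromstring(xml_text)
    result = {}
    for op in root.iter("op"):
        key = op.get("name")
        role = op.get("role")
        if role:
            # Distinguish the two monitor operations by their role.
            key = f"{key} ({role})"
        result[key] = to_seconds(op.get("timeout"))
    return result


if __name__ == "__main__":
    for name, seconds in op_timeouts(OPS_XML).items():
        print(f"{name}: {seconds}s")
```

The failed action reported by pcs status above is pgsqld_stop_0, so the listing includes the stop timeout next to the start timeout for comparison.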

So, which parameter controls the waiting time for starting the service?

Thank you. 