[ClusterLabs] PostgreSQL PAF failover issue
    Jehan-Guillaume de Rorthais 
    jgdr at dalibo.com
       
    Fri Jun 14 10:53:53 EDT 2019
    
    
  
On Fri, 14 Jun 2019 16:43:23 +0200
Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:
> Right, so I may have been too fast to give up. I set maintenance mode back
> on and promoted ph-sql-04 manually. Unfortunately I don't have the logs of
> ph-sql-03 anymore because I reinitialized it.
> You mention that demote timeout should be start timeout + stop timeout.
> Start/stop are 60s, so that would mean 120s for demote timeout? Or 30s for
> start/stop?
Considering your slow checkpoint, keep the timeouts high until you have fixed it:
  demote=120s
  start/stop=60s
  notify=60s
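
As a sketch only, assuming the cluster is managed with pcs and the PostgreSQL
resource is named "pgsqld" (both are assumptions, neither is stated in this
thread), these timeouts could be applied like this:

```shell
# "pgsqld" is a hypothetical resource name; replace it with your own.
pcs resource update pgsqld \
    op start timeout=60s \
    op stop timeout=60s \
    op demote timeout=120s \
    op notify timeout=60s
```

The demote value follows the rule quoted above: start timeout (60s) plus stop
timeout (60s) gives 120s.
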
Another good practice would be to set up a centralized log server using
e.g. rsyslog. It avoids losing messages during fencing, and you can gather all
the logs from all the nodes in one place. See the vagrant files and scripts
under PAF/test/ in the repository for a demo setup.
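
For illustration, a minimal forwarding rule on each cluster node might look
like the following. The file name and the host logs.example.com are
placeholders, not taken from this thread, and the log server itself must be
configured separately to accept TCP input:

```text
# /etc/rsyslog.d/forward.conf (hypothetical file name)
# Forward every facility and priority over TCP (@@) to the central server.
# A single @ would use UDP instead.
*.* @@logs.example.com:514
```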
    
    