[ClusterLabs] Replicated PGSQL woes

Thu Oct 13 18:11:06 EDT 2016

On Oct 13, 2016, at 1:56 PM, Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote:
> 
> On Thu, 13 Oct 2016 10:05:33 -0800
> Israel Brewster <israel at ravnalaska.net> wrote:
> 
>> On Oct 13, 2016, at 9:41 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>> 
>>> On 10/13/2016 12:04 PM, Israel Brewster wrote:  
> [...]
> 
>>>> But whatever- this is a cluster, it doesn't really matter which node
>>>> things are running on, as long as they are running. So the cluster is
>>>> working - postgresql starts, the master process is on the same node as
>>>> the IP, you can connect, etc, everything looks good. Obviously the next
>>>> thing to try is failover - should the master node fail, the slave node
>>>> should be promoted to master. So I try testing this by shutting down the
>>>> cluster on the primary server: "pcs cluster stop"
>>>> ...and nothing happens. The master shuts down (uncleanly, I might add -
>>>> it leaves behind a lock file that prevents it from starting again until
>>>> I manually remove said lock file), but the slave is never promoted to  
>>> 
>>> This definitely needs to be corrected. What creates the lock file, and
>>> how is that entity managed?  
>> 
>> The lock file entity is created/managed by the postgresql process itself. On
>> launch, postgres creates the lock file to say it is running, and deletes said
>> lock file when it shuts down. To my understanding, its role in life is to
>> prevent a restart after an unclean shutdown so the admin is reminded to make
>> sure that the data is in a consistent state before starting the server again.
> 
> What is the name of this lock file? Where is it?
> 
> PostgreSQL does not create lock file. It creates a "postmaster.pid" file, but
> it does not forbid a startup if the new process doesn't find another process
> with the pid and shm shown in the postmaster.pid.
> 
> As far as I know, the pgsql resource agent create such a lock file on promote
> and delete it on graceful stop. If the PostgreSQL instance couldn't be stopped
> correctly, the lock files stays and the RA refuse to start it the next time.

Ah, you're right. Looking auth the RA I see where it creates the file in question. The delete appears to be in the pgsql_real_stop() function (which makes sense), wrapped in an if block that checks for $1 being master and $OCF_RESKEY_CRM_meta_notify_slave_uname being a space. Throwing a little debugging code in there I see that when it hits that block on a cluster stop, $OCF_RESKEY_CRM_meta_notify_slave_uname is centtest1.ravnalaska.net <http://centtest1.ravnalaska.net/>, not a space, so the lock file is not removed:

    if  [ "$1" = "master" -a "$OCF_RESKEY_CRM_meta_notify_slave_uname" = " " ]; then
        ocf_log info "Removing $PGSQL_LOCK."
        rm -f $PGSQL_LOCK
    fi 

It doesn't look like there is anywhere else where the file would be removed.

> 
> [...]
>>>> What can I do to fix this? What troubleshooting steps can I follow? Thanks.
> 
> I can not find the result of the stop operation in your log files, maybe the
> log from CentTest2 would be more useful.

Sure. I was looking at centtest1 because I was trying to figure out why it wouldn't promote, but if centtest2 never really stopped (properly) that could explain things. Here's the log from 2 when calling pcs cluster stop:

Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts for: standby (true)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent update 26: standby=true
Oct 13 14:05:14 CentTest2 pacemaker: Waiting for shutdown of managed resources
Oct 13 14:05:14 CentTest2 crmd[9426]:   notice: Operation pgsql_96_notify_0: ok (node=centtest2.ravnalaska.net, call=21, rc=0, cib-update=0, confirmed=true)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts for: master-pgsql_96 (-INFINITY)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent update 28: master-pgsql_96=-INFINITY
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts for: pgsql_96-master-baseline (<null>)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent delete 30: node=centtest2.ravnalaska.net, attr=pgsql_96-master-baseline, id=<n/a>, set=(null), section=status
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent delete 32: node=centtest2.ravnalaska.net, attr=pgsql_96-master-baseline, id=<n/a>, set=(null), section=status
Oct 13 14:05:14 CentTest2 pgsql(pgsql_96)[5107]: INFO: Stopping PostgreSQL on demote.
Oct 13 14:05:14 CentTest2 pgsql(pgsql_96)[5107]: INFO: stop_escalate(or stop_escalate_in_slave) time is adjusted to 50 based on the configured timeout.
Oct 13 14:05:14 CentTest2 pgsql(pgsql_96)[5107]: INFO: server shutting down
Oct 13 14:05:15 CentTest2 pgsql(pgsql_96)[5107]: INFO: PostgreSQL is down
Oct 13 14:05:15 CentTest2 pgsql(pgsql_96)[5107]: INFO: Changing pgsql_96-status on centtest2.ravnalaska.net : PRI->STOP.
Oct 13 14:05:15 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts for: pgsql_96-status (STOP)
Oct 13 14:05:15 CentTest2 attrd[9424]:   notice: Sent update 34: pgsql_96-status=STOP
Oct 13 14:05:15 CentTest2 crmd[9426]:   notice: Operation pgsql_96_demote_0: ok (node=centtest2.ravnalaska.net, call=23, rc=0, cib-update=20, confirmed=true)
Oct 13 14:05:15 CentTest2 crmd[9426]:   notice: Operation pgsql_96_notify_0: ok (node=centtest2.ravnalaska.net, call=24, rc=0, cib-update=0, confirmed=true)
Oct 13 14:05:15 CentTest2 IPaddr2(virtual_ip)[5360]: INFO: IP status = ok, IP_CIP=
Oct 13 14:05:15 CentTest2 crmd[9426]:   notice: Operation virtual_ip_stop_0: ok (node=centtest2.ravnalaska.net, call=26, rc=0, cib-update=21, confirmed=true)
Oct 13 14:05:15 CentTest2 crmd[9426]:   notice: Operation pgsql_96_notify_0: ok (node=centtest2.ravnalaska.net, call=27, rc=0, cib-update=0, confirmed=true)
Oct 13 14:05:15 CentTest2 pgsql(pgsql_96)[5497]: INFO: PostgreSQL is already stopped.
Oct 13 14:05:15 CentTest2 crmd[9426]:   notice: Operation pgsql_96_stop_0: ok (node=centtest2.ravnalaska.net, call=28, rc=0, cib-update=22, confirmed=true)
Oct 13 14:05:16 CentTest2 pacemaker: Leaving fence domain
Oct 13 14:05:16 CentTest2 pacemaker: Stopping fenced 9227
Oct 13 14:05:16 CentTest2 pacemaker: Signaling Pacemaker Cluster Manager to terminate
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Shutting down Pacemaker
Oct 13 14:05:16 CentTest2 pacemaker: Waiting for cluster services to unload
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Stopping crmd: Sent -15 to process 9426
Oct 13 14:05:16 CentTest2 crmd[9426]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 crmd[9426]:   notice: Requesting shutdown, upper limit is 1200000ms
Oct 13 14:05:16 CentTest2 attrd[9424]:   notice: Update relayed from centtest1.ravnalaska.net
Oct 13 14:05:16 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts for: shutdown (1476396316)
Oct 13 14:05:16 CentTest2 attrd[9424]:   notice: Sent update 37: shutdown=1476396316
Oct 13 14:05:16 CentTest2 crmd[9426]:   notice: State transition S_NOT_DC -> S_STOPPING [ input=I_STOP cause=C_HA_MESSAGE origin=route_message ]
Oct 13 14:05:16 CentTest2 crmd[9426]:   notice: Stopped 0 recurring operations at shutdown... waiting (0 ops remaining)
Oct 13 14:05:16 CentTest2 crmd[9426]:   notice: Disconnected from the LRM
Oct 13 14:05:16 CentTest2 crmd[9426]:   notice: Disconnecting from Corosync
Oct 13 14:05:16 CentTest2 pengine[9425]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Stopping pengine: Sent -15 to process 9425
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Stopping attrd: Sent -15 to process 9424
Oct 13 14:05:16 CentTest2 attrd[9424]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 attrd[9424]:   notice: Exiting...
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Stopping lrmd: Sent -15 to process 9423
Oct 13 14:05:16 CentTest2 lrmd[9423]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Stopping stonith-ng: Sent -15 to process 9422
Oct 13 14:05:16 CentTest2 stonith-ng[9422]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Stopping cib: Sent -15 to process 9421
Oct 13 14:05:16 CentTest2 cib[9421]:  warning: new_event_notification (9421-9422-12): Broken pipe (32)
Oct 13 14:05:16 CentTest2 cib[9421]:  warning: Notification of client crmd/efe7e6d2-3a51-4e0d-a4c5-cb986a8711d8 failed
Oct 13 14:05:16 CentTest2 cib[9421]:   notice: Invoking handler for signal 15: Terminated
Oct 13 14:05:16 CentTest2 cib[9421]:   notice: Disconnecting from Corosync
Oct 13 14:05:16 CentTest2 cib[9421]:   notice: Disconnecting from Corosync
Oct 13 14:05:16 CentTest2 pacemakerd[9415]:   notice: Shutdown complete
Oct 13 14:05:17 CentTest2 kernel: dlm: closing connection to node 2
Oct 13 14:05:17 CentTest2 kernel: dlm: closing connection to node 1
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Unloading all Corosync service engines.
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync extended virtual synchrony service
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync configuration service
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync cluster config database access v1.01
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync profile loading service
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: openais checkpoint service B.01.01
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync CMAN membership service 2.90
Oct 13 14:05:18 CentTest2 corosync[9163]:   [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1
Oct 13 14:05:18 CentTest2 corosync[9163]:   [MAIN  ] Corosync Cluster Engine exiting with status 0 at main.c:2055.

> but I can find this:
> 
>  Oct 13 08:29:41 CentTest1 pengine[30095]:   notice: Scheduling Node
>  centtest2.ravnalaska.net for shutdown
>  ...
>  Oct 13 08:29:41 CentTest1 pengine[30095]:   notice: Scheduling Node
>  centtest2.ravnalaska.net for shutdown
> 
> Which means the stop operation probably raised an error, leading to a fencing
> of the node. In this circumstance, I bet PostgreSQL wasn't able to stop
> correctly and the lock file stayed in place.
> 
> Could you please show us your full cluster setup?

Sure: how? pcs status shows this, but I suspect that's not what you are asking about:

Cluster name: cluster_test
Last updated: Thu Oct 13 14:09:29 2016		Last change: Thu Oct 13 12:26:55 2016 by root via crmd on centtest1.ravnalaska.net
Stack: cman
Current DC: centtest1.ravnalaska.net (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 3 resources configured

Online: [ centtest1.ravnalaska.net centtest2.ravnalaska.net ]

Full list of resources:

 virtual_ip	(ocf::heartbeat:IPaddr2):	Started centtest2.ravnalaska.net
 Master/Slave Set: msPostgresql [pgsql_96]
     Masters: [ centtest2.ravnalaska.net ]
     Slaves: [ centtest1.ravnalaska.net ]

PCSD Status:
  centtest1.ravnalaska.net: Online
  centtest2.ravnalaska.net: Online

> 
> By the way, did you had a look to the PAF project? 
> 
>  http://dalibo.github.io/PAF/
>  http://dalibo.github.io/PAF/documentation.html
> 
> The v1.1 version for EL6 is not ready yet, but you might want to give it a
> try: https://github.com/dalibo/PAF/tree/v1.1
> 
> I would recommend EL7 and PAF 2.0, published, packaged, ready to use.

First I've heard of it. I'll give it a look.

Thanks!

> Regards,
> 
> -- 
> Jehan-Guillaume (ioguix) de Rorthais
> Dalibo

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20161013/d2965da5/attachment-0003.html>