[ClusterLabs] Master/slave failover does not work as expected

Mon Aug 12 16:09:31 EDT 2019

On Mon, Aug 12, 2019 at 4:12 PM Michael Powell <
Michael.Powell at harmonicinc.com> wrote:

> At 07:44:49, the ss agent discovers that the master instance has failed on
> node *mgraid…-0* as a result of a failed *ssadm* request in response to
> an *ss_monitor()* operation.  It issues a *crm_master -Q -D* command with
> the intent of demoting the master and promoting the slave, on the other
> node, to master.  The *ss_demote()* function finds that the application
> is no longer running and returns *OCF_NOT_RUNNING* (7).  In the older
> product, this was sufficient to promote the other instance to master, but
> in the current product, that does not happen.  Currently, the failed
> application is restarted, as expected, and is promoted to master, but this
> takes tens of seconds.

Have you tried disabling resource stickiness for this master/slave (ms) resource?
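
For reference, a minimal sketch of one way to test this, assuming the
master/slave resource is named "ms-ss" (a hypothetical id; substitute the
real one from your configuration). It sets the resource-stickiness meta
attribute to 0 with crm_resource. It runs in dry-run mode by default, so it
only prints the command. Whether stickiness is actually what keeps the slave
from being promoted here is not confirmed.

```shell
#!/bin/sh
# Sketch: set resource-stickiness to 0 on a master/slave resource.
# "ms-ss" below is a hypothetical resource id, not taken from the thread.

set_stickiness_zero() {
    # When DRY_RUN is non-empty, echo the command instead of running it.
    ${DRY_RUN:+echo} crm_resource --resource "$1" --meta \
        --set-parameter resource-stickiness --parameter-value 0
}

# Dry run by default; set DRY_RUN to the empty string to apply the change.
DRY_RUN=1
set_stickiness_zero "ms-ss"
```

If the behaviour does not change, the original value can be restored the
same way by passing the previous stickiness as the parameter value.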