[ClusterLabs Developers] migrate-to and migrate-from for moving Master/Slave roles ?
Jehan-Guillaume de Rorthais
jgdr at dalibo.com
Fri Dec 11 07:40:05 UTC 2015
(Sorry for the top post)
Thank you all for your time, answers, and advice. They are much appreciated.
I have no bandwidth to process your input for the next few days or weeks
(work, and moving to a new city), and my colleague is overwhelmed as well :(
We'll get back to the list soon with feedback on our attempt to
implement your advice.
Season's greetings all!
On Wed, 9 Dec 2015 18:04:47 -0600,
Ken Gaillot <kgaillot at redhat.com> wrote:
> On 12/08/2015 05:52 AM, Andrei Borzenkov wrote:
> > On Fri, Dec 4, 2015 at 4:11 PM, Jehan-Guillaume de Rorthais
> > <jgdr at dalibo.com> wrote:
> >
> >>
> >> But I fail to understand how I could distinguish, even from notifications,
> >> a failing scenario from a move/switchover one.
> >>
> >
> > On demote, fetch the current log position and store it in a cluster
> > attribute. On promote, fetch the previous master's position, wait until
> > the current instance has caught up, and delete the attribute. If the
> > attribute is not present on promote, the master was down, so do not
> > wait and proceed.
> >
> > If you set a transient attribute, the cluster will forget about the
> > previous master on restart. If you set a persistent attribute, it will
> > allow you to ensure no data loss has (automatically) occurred even on
> > cluster restart.
> >
> > Where do you envision problems here?
>
> This is more or less what was suggested in the original post :) and
> after discussing this some more, I tend to agree with this approach
> (using an attribute, as opposed to clone notifications, or the proposed
> migration support for the master role).
>
> The demote action would set an attribute. It would be best to use a
> private attribute (attrd_updater --private --update), so setting it
> doesn't trigger further pacemaker activity. Since the attribute is set
> by demote, it will work whether the move is initiated by the cluster or
> externally (by a sysadmin). To initiate it manually, you can set a
> negative location constraint for the master role on the current master.
>
> The promote action would check for that attribute (attrd_updater
> --private --query --all). If it exists, then it's an orderly handover,
> and it should wait for the replication checkpoint. On success, remove
> the attribute. There should be a timeout on the waiting (less than the
> timeout for the promote operation as a whole), for when there is a
> network issue during the transfer. You could decide whether a timeout
> means "grab the master role immediately" or "fail the promote".
>
> I do see the logical appeal of migrate_to/migrate_from for the master
> role, but that would be a long-term project.
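
For reference, the demote/promote handover described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not an actual resource agent. The function names (demote, promote), the callables passed in, and the return strings are all hypothetical. In a real agent the attribute accessors would presumably wrap the attrd_updater calls mentioned in the thread, whereas here they are plain Python callables so the logic can be exercised without a cluster. Log positions are assumed to be comparable values such as integers.

```python
import time

def demote(get_log_position, set_attr):
    """On demote: record the current log position in an attribute."""
    set_attr(get_log_position())

def promote(get_attr, delete_attr, get_local_position,
            wait_timeout=5.0, poll_interval=0.1, fail_on_timeout=True):
    """On promote: wait for the recorded position if an attribute exists.

    wait_timeout should be shorter than the timeout of the promote
    operation as a whole. Returns a hypothetical status string.
    """
    target = get_attr()
    if target is None:
        # No attribute: the previous master was down, do not wait.
        return "promoted-no-wait"

    deadline = time.monotonic() + wait_timeout
    while get_local_position() < target:
        if time.monotonic() >= deadline:
            if fail_on_timeout:
                # Policy 1: fail the promote, keep the attribute.
                return "failed"
            # Policy 2: grab the master role immediately.
            break
        time.sleep(poll_interval)

    # Promote succeeded: remove the attribute.
    # (Removing it in the "grab" case too is an assumption of this sketch.)
    delete_attr()
    return "promoted"
```

The fail_on_timeout flag corresponds to the choice between "fail the promote" and "grab the master role immediately"; which of the two is appropriate depends on how much data loss the deployment can tolerate.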