[ClusterLabs Developers] migrate-to and migrate-from for moving Master/Slave roles ?

Mon Nov 30 16:04:37 EST 2015

On 11/25/2015 06:52 PM, Jehan-Guillaume de Rorthais wrote:
> Hi guys,
> 
> While working on our pgsqlms agent[1], we are now studying how to control all
> the steps of a switchover process from the resource agent. 
> 
> The tricky part here is the 2nd step of a successful switchover with PostgreSQL
> (9.3+):
>   (1) shut down the master first
>   (2) make sure the designated slave received **everything** from the old master
>   (3) promote the designated slave as master
>   (4) start the old master as slave
> 
> As far as we understand Pacemaker, the migrate-to and migrate-from capabilities
> make it possible to tell whether we are moving a resource because of a failure
> or for a controlled switchover. Unfortunately, these capabilities are ignored
> for cloned and multi-state resources...
> 
> Because of this restriction, the resource agent code currently cannot tell
> whether it should check that the designated slave received everything from the
> old master (controlled switchover) or not (we lost the master). In case of a
> controlled switchover, if the designated slave did not receive everything from
> the master, we must abort the switchover.
> 
> A workaround we could imagine would be to set a special cluster attribute
> manually (using crm_attribute) to signal the agent we are going to make a
> controlled switchover.
> 
> But I bet the cleaner way would be to use migrate-to and migrate-from
> capabilities. Did we miss something about them? Is there some plan to support
> moving a Master/Slave role using migrate-to and migrate-from at some point? Any
> other proposals or ideas?
> 
> [1] see "multistate" folder in https://github.com/dalibo/pgsql-resource-agent

Per the documentation, clones can't migrate:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/_migrating_resources.html

It would be nice to support migration for globally unique clones, and
the master role of stateful clones. Feel free to submit a feature
request with what you think the interface should look like.

The attribute approach is interesting, but it would be limited to moves
initiated outside the cluster, and I suspect error handling would be
problematic (what if someone forgets to unset the attribute? what if one
part of the process fails?).
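
To make the failure modes a bit more concrete, here is a rough sketch of
the decision the agent would have to make at promote time if it relied on
such an attribute. This is only an illustration under assumptions: the
function names, the "true" flag value, and the idea of comparing the old
master's shutdown LSN with the slave's last received LSN are hypothetical
and are not taken from pgsqlms or from Pacemaker. How the agent would read
the flag (for example via crm_attribute) and obtain the two LSNs is left
out on purpose.

```python
from typing import Optional


def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to an integer.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def may_promote(switchover_flag: Optional[str],
                master_shutdown_lsn: Optional[str],
                slave_received_lsn: str) -> bool:
    """Decide whether the designated slave may be promoted.

    switchover_flag     -- value of the (hypothetical) cluster attribute,
                           or None if it is not set
    master_shutdown_lsn -- WAL position of the old master after a clean
                           shutdown, or None if it could not be obtained
    slave_received_lsn  -- last WAL position received by the slave
    """
    # Attribute not set: treat this as a failover and promote with
    # whatever the slave has, since the master is assumed to be lost.
    if switchover_flag != "true":
        return True

    # Controlled switchover, but the old master's position is unknown:
    # we cannot prove the slave is complete, so refuse to promote.
    if master_shutdown_lsn is None:
        return False

    # Controlled switchover: promote only if the slave received
    # everything up to the old master's shutdown position.
    return parse_lsn(slave_received_lsn) >= parse_lsn(master_shutdown_lsn)
```

Note that the first branch is exactly where a stale or missing attribute
bites: if the flag was never set, a planned move silently skips the check,
and if it was never unset, a real failover ends up in the strict path and
may refuse to promote.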

I'm not sure how other db RAs deal with the situation; that would be
worth looking into.