[Pacemaker] Advisory ordering and "Cannot migrate"

Wed May 30 00:26:35 EDT 2012

30.05.2012 01:37, David Vossel wrote:
> 
> 
> ----- Original Message -----
>> From: "Vladislav Bogdanov" <bubble at hoster-ok.com>
>> To: pacemaker at oss.clusterlabs.org
>> Sent: Tuesday, May 29, 2012 3:48:12 PM
>> Subject: Re: [Pacemaker] Advisory ordering and "Cannot migrate"
>>
>> 29.05.2012 18:51, David Vossel wrote:
>>> ----- Original Message -----
>>>> From: "Vladislav Bogdanov" <bubble at hoster-ok.com>
>>>> To: "The Pacemaker cluster resource manager"
>>>> <pacemaker at oss.clusterlabs.org>
>>>> Sent: Tuesday, May 29, 2012 7:27:12 AM
>>>> Subject: [Pacemaker] Advisory ordering and "Cannot migrate"
>>>>
>>>> Hi Andrew, David, all,
>>>>
>>>> It seems that advisory ordering is honored when pengine wants to
>>>> move
>>>> two advisory-ordered resources in one transition, and one of
>>>> the resources
>>>> (the "then" one) is migratable.
>>>>
>>>> I have advisory ordering configured for two resources, "mgs" and
>>>> "drbd-testfs-stacked":
>>>>
>>>> order drbd-testfs-stacked-after-mgs 0: mgs:start
>>>> drbd-testfs-stacked:start
>>>>
>>>> "mgs" is an ordinary resource; "drbd-testfs-stacked" is migratable.
>>>>
>>>> If both of those resources are located on one node, and I request
>>>> shutdown
>>>> of that node, I see:
>>>> pengine[2069]:   notice: check_stack_element: Cannot migrate
>>>> drbd-testfs-stacked due to dependency on mgs (order)
>>>>
>>>> From what I understand, symmetrical advisory ordering should
>>>> affect
>>>> resources which are about to be both started or both stopped in
>>>> one
>>>> transition. That's fine.
>>>>
>>>> But, should it be honored when one resource is to be moved with
>>>> start/stop while another is to be migrated?
>>>
>>> I would expect the constraint to be honored.  What else could we
>>> possibly do that would make sense?
>>>
>>> If you have the following symmetrical order constraint,
>>>
>>> start A then start B
>>> stop B then stop A
>>>
>>> where B can be migrated but A cannot, I would expect B to be
>>> stopped before A is allowed to stop, regardless of whether B has the
>>> ability to be
>>> migrated or not. If both A and B were to be moved to a different
>>> node,
>>> and B was migrated instead of stop/started, that would invalidate
>>> both
>>> sides of the order constraint.
>>
>> That is absolutely valid, but for _mandatory_ ordering, isn't it?
>>
>> For an _advisory_ one that would be:
>> If you're about to start A and B at the same time, then start A
>> first.
>> Otherwise skip this constraint. Do the same in the opposite direction
>> for 'stop'.
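The advisory reading in the quoted paragraph can be put as a minimal Python sketch. This is not Pacemaker code: the function name and the "resource:operation" strings are hypothetical, and the rule is only the simplified one stated above (the constraint matters only when both ordered actions are in the same transition).

```python
from typing import AbstractSet


def advisory_order_applies(first: str, then: str,
                           planned: AbstractSet[str]) -> bool:
    """Simplified model: an advisory (score 0) ordering is relevant only
    when both ordered actions are planned in the same transition."""
    return first in planned and then in planned


# Both resources are moved with stop/start: both starts are planned.
both_restart = {
    "mgs:stop", "mgs:start",
    "drbd-testfs-stacked:stop", "drbd-testfs-stacked:start",
}

# "mgs" is stopped and started, "drbd-testfs-stacked" is migrated:
# no start of the second resource is planned at all.
with_migration = {
    "mgs:stop", "mgs:start",
    "drbd-testfs-stacked:migrate_to", "drbd-testfs-stacked:migrate_from",
}

restart_case = advisory_order_applies(
    "mgs:start", "drbd-testfs-stacked:start", both_restart)
migration_case = advisory_order_applies(
    "mgs:start", "drbd-testfs-stacked:start", with_migration)
```

Under this model the constraint is relevant in the first case and skipped in the second, which is the behaviour argued for in this thread.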
> 
> Yeah I missed the advisory part of this.
> 
> I bet this suffers from the same implementation complications that
> http://bugs.clusterlabs.org/show_bug.cgi?id=5055 has. This will likely
> resolve itself once 5055 gets fixed... or we might be able to make a
> temporary targeted fix for this before then.

That would be great.
From what I see in the code, all order-related actions for migration are
added at the end of MigrateRsc() with comments why that is done:
/* migrate 'then' action also needs anything that the stop needed to
have completed too */
/* migrate 'then' action also needs anything that the start needed to
have completed too */

Maybe that function is a good place to do that?

Vladislav

> 
> The migration operations don't actually get calculated by the policy
> engine. All moves are calculated as "stop/start" internally and then at
> the end of all the calculations we attempt to detect when we can migrate
> based on the final graph.
> 
> -- Vossel
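The two-pass approach described above (plan every move as stop/start, then look for migration opportunities in the final result) can be sketched roughly as follows. This is an illustrative Python model only, not the policy engine's implementation: the function name, the tuple layout, the node names, and the "blocked" set (standing in for checks such as the "Cannot migrate ... due to dependency" one) are all assumptions.

```python
from typing import List, Set, Tuple

Action = Tuple[str, str, str]  # (resource, operation, node)


def detect_migrations(actions: List[Action],
                      migratable: Set[str],
                      blocked: Set[str]) -> List[Action]:
    """Post-pass over a plan that was computed purely as stop/start.

    A stop/start pair on different nodes is rewritten to
    migrate_to/migrate_from when the resource is migratable and
    nothing (modelled by 'blocked') forbids it.
    """
    stops = {rsc: node for rsc, op, node in actions if op == "stop"}
    starts = {rsc: node for rsc, op, node in actions if op == "start"}
    result = []
    for rsc, op, node in actions:
        can_migrate = (
            rsc in migratable
            and rsc not in blocked
            and rsc in stops and rsc in starts
            and stops[rsc] != starts[rsc]
        )
        if can_migrate and op == "stop":
            result.append((rsc, "migrate_to", node))    # runs on the source node
        elif can_migrate and op == "start":
            result.append((rsc, "migrate_from", node))  # runs on the target node
        else:
            result.append((rsc, op, node))
    return result


plan = [
    ("mgs", "stop", "node1"),
    ("drbd-testfs-stacked", "stop", "node1"),
    ("mgs", "start", "node2"),
    ("drbd-testfs-stacked", "start", "node2"),
]
migrated = detect_migrations(plan, {"drbd-testfs-stacked"}, set())
refused = detect_migrations(plan, {"drbd-testfs-stacked"},
                            {"drbd-testfs-stacked"})
```

With an empty "blocked" set the second resource is migrated; when it is listed as blocked, the plan stays a plain stop/start, which corresponds to the log message at the start of this thread.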
> 
>> On the other hand, migration is a very special operation I'd say. It
>> is
>> not a stop on one node and start on another one, but a magical move
>> of
>> resource. It was running on one node and oops, it runs on another. It
>> does neither stop nor start, does not allocate or free resources. It
>> just appears in a different place.
>> The most common example of resource migration is live VM migration.
>> And the last part of migration (the actual switch to another node) is done
>> almost atomically nowadays. VM just appears on a different node.
>>
>> Another example of a "real life" migration (not in clusters) would be
>> remote desktop, where you start application on a remote server being
>> at
>> work, then you come home, connect to that desktop and see your
>> application. You do not stop and do not start it. You just see that
>> your
>> window is "migrated" to another place.
>>
>> For pacemaker this could be illustrated with some resource agent
>> which
>> manages some external entity which exists "somewhere". And "resource
>> is
>> running on node A" just means that that entity is managed from node
>> A.
>> It does not run there, it is just managed from there. In my case that
>> entity is a pacemaker ticket. It exists everywhere in the cluster,
>> but
>> is granted or revoked by a script (resource agent) running on one
>> node.
>> I do not want that ticket to be ever touched without my explicit
>> command. I just allow the management point to migrate.
>>
>> So, from my PoV, the term "migration" does not assume any start or stop
>> operations. And advisory ordering (which applies to simultaneous
>> start
>> or stop of two different resources) should not be honored because
>> only
>> one resource is actually starting/stopping. Am I missing something here?
>>
>> Best,
>> Vladislav
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started:
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
> 