[Pacemaker] Unique clone instance is stopped too early on move

Wed Jan 21 08:04:09 EST 2015

20.01.2015 02:44, Andrew Beekhof wrote:
> 
>> On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>
>> 16.01.2015 07:44, Andrew Beekhof wrote:
>>>
>>>> On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>>
>>>> 13.01.2015 11:32, Andrei Borzenkov wrote:
>>>>> On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov
>>>>> <bubble at hoster-ok.com> wrote:
>>>>>> Hi Andrew, David, all.
>>>>>>
>>>>>> I found a somewhat strange operation ordering during transition execution.
>>>>>>
>>>>>> Could you please look at the following partial configuration (crmsh syntax)?
>>>>>>
>>>>>> ===
>>>>>> ...
>>>>>> clone cl-broker broker \
>>>>>>          meta interleave=true target-role=Started
>>>>>> clone cl-broker-vips broker-vips \
>>>>>>          meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started
>>>>>> clone cl-ctdb ctdb \
>>>>>>          meta interleave=true target-role=Started
>>>>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>>>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>>>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>>>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>>>>> ...
>>>>>> ===
>>>>>>
>>>>>> After I put one node to standby and then back to online, I see the following transition (relevant excerpt):
>>>>>>
>>>>>> ===
>>>>>>   * Pseudo action:   cl-broker-vips_stop_0
>>>>>>   * Resource action: broker-vips:1   stop on c-pa-0
>>>>>>   * Pseudo action:   cl-broker-vips_stopped_0
>>>>>>   * Pseudo action:   cl-ctdb_start_0
>>>>>>   * Resource action: ctdb            start on c-pa-1
>>>>>>   * Pseudo action:   cl-ctdb_running_0
>>>>>>   * Pseudo action:   cl-broker_start_0
>>>>>>   * Resource action: ctdb            monitor=10000 on c-pa-1
>>>>>>   * Resource action: broker          start on c-pa-1
>>>>>>   * Pseudo action:   cl-broker_running_0
>>>>>>   * Pseudo action:   cl-broker-vips_start_0
>>>>>>   * Resource action: broker          monitor=10000 on c-pa-1
>>>>>>   * Resource action: broker-vips:1   start on c-pa-1
>>>>>>   * Pseudo action:   cl-broker-vips_running_0
>>>>>>   * Resource action: broker-vips:1   monitor=30000 on c-pa-1
>>>>>> ===
>>>>>>
>>>>>> What could be the reason for stopping a unique clone instance so early during a move?
>>>>>>
>>>>>
>>>>> Do not take it as definitive answer, but cl-broker-vips cannot run
>>>>> unless both other resources are started. So if you compute closure of
>>>>> all required transitions it looks rather logical. Having
>>>>> cl-broker-vips started while broker is still stopped would violate
>>>>> constraint.
>>>>
>>>> Problem is that broker-vips:1 is stopped on one (source) node unnecessarily early.
>>>
>>> It looks to be moving from c-pa-0 to c-pa-1
>>> It might be unnecessarily early, but it is what you asked for... we have to unwind the resource stack before we can build it up.
>>
>> Yes, I understand that it is valid, but could its stop be delayed until the cluster is in a state where all dependencies are satisfied to start it on another node (as with migration)?
> 
> No, because "we have to unwind the resource stack before we can build it up."
> Doing anything else would be one of those things that is trivial for a human to identify but rather complex for a computer.

I believe there is also an issue with migration of clone instances.

I modified the pe-input to allow migration of cl-broker-vips (and also set an inf score for broker-vips-after-broker
and made cl-broker-vips interleaved).
The relevant part is:
clone cl-broker broker \
        meta interleave=true target-role=Started
clone cl-broker-vips broker-vips \
        meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started
clone cl-ctdb ctdb \
        meta interleave=true target-role=Started
colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
colocation broker-with-ctdb inf: cl-broker cl-ctdb
order broker-after-ctdb inf: cl-ctdb cl-broker
order broker-vips-after-broker inf: cl-broker cl-broker-vips

After that, the relevant part of the transition is:

 * Resource action: broker-vips:1   migrate_to on c-pa-0
 * Pseudo action:   cl-broker-vips_stop_0
 * Resource action: broker-vips:1   migrate_from on c-pa-1
 * Resource action: broker-vips:1   stop on c-pa-0
 * Pseudo action:   cl-broker-vips_stopped_0
 * Pseudo action:   all_stopped
 * Pseudo action:   cl-ctdb_start_0
 * Resource action: ctdb            start on c-pa-1
 * Pseudo action:   cl-ctdb_running_0
 * Pseudo action:   cl-broker_start_0
 * Resource action: ctdb            monitor=10000 on c-pa-1
 * Resource action: broker          start on c-pa-1
 * Pseudo action:   cl-broker_running_0
 * Pseudo action:   cl-broker-vips_start_0
 * Resource action: broker          monitor=10000 on c-pa-1
 * Pseudo action:   broker-vips:1_start_0
 * Pseudo action:   cl-broker-vips_running_0
 * Resource action: broker-vips:1   monitor=30000 on c-pa-1

But I would say that, at least from a human-logic point of view, the above breaks the ordering rule broker-vips-after-broker
(cl-broker-vips finishes migrating, and thus runs on c-pa-1, before cl-broker is started there).
Technically broker-vips:1_start_0 is at the right position, but the resource is actually "started"
in migrate_to/migrate_from.

I also went further and injected a pair of non-clone IPaddr2 resources into the same pe-input and enabled migration
for them as well (returning interleave for cl-broker-vips to false and setting the ordering score for broker-vips-after-broker
back to 0, so that all three order constraints have the same score):

clone cl-broker broker \
        meta interleave=true target-role=Started
clone cl-broker-vips broker-vips \
        meta clone-node-max=2 globally-unique=true interleave=false allow-migrate=true resource-stickiness=0 target-role=Started
clone cl-ctdb ctdb \
        meta interleave=true target-role=Started
primitive broker-vip1 IPaddr2 \
        params ip=192.168.122.70 cidr_netmask=24 nic=eth0 \
        op start interval=0 timeout=20 \
        op stop interval=0 timeout=20 \
        op monitor interval=30
primitive broker-vip2 IPaddr2 \
        params ip=192.168.122.71 cidr_netmask=24 nic=eth0 \
        op start interval=0 timeout=20 \
        op stop interval=0 timeout=20 \
        op monitor interval=30
colocation broker-with-ctdb inf: cl-broker cl-ctdb
colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
colocation broker-vip1-with-broker inf: broker-vip1 cl-broker
colocation broker-vip2-with-broker inf: broker-vip2 cl-broker
colocation broker-vip2-not-with-vip1 -100: broker-vip2 broker-vip1
order broker-after-ctdb inf: cl-ctdb cl-broker
order broker-vips-after-broker 0: cl-broker cl-broker-vips
order broker-vip1-after-broker 0: cl-broker broker-vip1
order broker-vip2-after-broker 0: cl-broker broker-vip2

For broker-vip2 I see completely different output (compare with broker-vips:1):

 * Resource action: broker-vips:1   migrate_to on c-pa-0
 * Pseudo action:   cl-broker-vips_stop_0
 * Resource action: broker-vips:1   migrate_from on c-pa-1
 * Resource action: broker-vips:1   stop on c-pa-0
 * Pseudo action:   cl-broker-vips_stopped_0
 * Pseudo action:   cl-ctdb_start_0
 * Resource action: ctdb            start on c-pa-1
 * Pseudo action:   cl-ctdb_running_0
 * Pseudo action:   cl-broker_start_0
 * Resource action: ctdb            monitor=10000 on c-pa-1
 * Resource action: broker          start on c-pa-1
 * Pseudo action:   cl-broker_running_0
 * Resource action: broker-vip2     migrate_to on c-pa-0
 * Pseudo action:   cl-broker-vips_start_0
 * Resource action: broker          monitor=10000 on c-pa-1
 * Resource action: broker-vip2     migrate_from on c-pa-1
 * Resource action: broker-vip2     stop on c-pa-0
 * Pseudo action:   broker-vips:1_start_0
 * Pseudo action:   cl-broker-vips_running_0
 * Pseudo action:   all_stopped
 * Pseudo action:   broker-vip2_start_0
 * Resource action: broker-vips:1   monitor=30000 on c-pa-1
 * Resource action: broker-vip2     monitor=30000 on c-pa-1

broker-vip2 is migrated much later than broker-vips:1, exactly at the point where I would expect it.

To me this means that some logic already exists that allows a resource move to be postponed until
everything is ready for it at the destination.
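The difference between the two resources can be shown with a small sketch over the listing above. It is an illustration only: first_departure and timing are hypothetical helpers, and the sketch assumes that listing order approximates execution order. A resource counts as leaving "early" if its first stop or migrate_to on the source node c-pa-0 is listed before broker is started on c-pa-1.

```python
import re

# Transition excerpt copied verbatim from the listing above.
TRANSITION = """\
 * Resource action: broker-vips:1   migrate_to on c-pa-0
 * Pseudo action:   cl-broker-vips_stop_0
 * Resource action: broker-vips:1   migrate_from on c-pa-1
 * Resource action: broker-vips:1   stop on c-pa-0
 * Pseudo action:   cl-broker-vips_stopped_0
 * Pseudo action:   cl-ctdb_start_0
 * Resource action: ctdb            start on c-pa-1
 * Pseudo action:   cl-ctdb_running_0
 * Pseudo action:   cl-broker_start_0
 * Resource action: ctdb            monitor=10000 on c-pa-1
 * Resource action: broker          start on c-pa-1
 * Pseudo action:   cl-broker_running_0
 * Resource action: broker-vip2     migrate_to on c-pa-0
 * Pseudo action:   cl-broker-vips_start_0
 * Resource action: broker          monitor=10000 on c-pa-1
 * Resource action: broker-vip2     migrate_from on c-pa-1
 * Resource action: broker-vip2     stop on c-pa-0
 * Pseudo action:   broker-vips:1_start_0
 * Pseudo action:   cl-broker-vips_running_0
 * Pseudo action:   all_stopped
 * Pseudo action:   broker-vip2_start_0
 * Resource action: broker-vips:1   monitor=30000 on c-pa-1
 * Resource action: broker-vip2     monitor=30000 on c-pa-1
"""

# Matches only "Resource action" lines; pseudo actions are ignored.
ACTION_RE = re.compile(r"\*\s+Resource action:\s+(\S+)\s+(\S+)\s+on\s+(\S+)")
matches = (ACTION_RE.search(line) for line in TRANSITION.splitlines())
actions = [m.groups() for m in matches if m]


def first_departure(actions, resource, source):
    """Hypothetical helper: index of the first stop/migrate_to on the source node."""
    for index, (rsc, op, node) in enumerate(actions):
        if rsc == resource and node == source and op in ("stop", "migrate_to"):
            return index
    raise ValueError(f"{resource} never leaves {source}")


def timing(actions, resource, source, dependency_start):
    """Hypothetical helper: 'early' if the resource leaves before its dependency starts."""
    departs = first_departure(actions, resource, source)
    return "early" if departs < actions.index(dependency_start) else "expected"


broker_up = ("broker", "start", "c-pa-1")
results = {
    rsc: timing(actions, rsc, "c-pa-0", broker_up)
    for rsc in ("broker-vips:1", "broker-vip2")
}
for rsc, verdict in results.items():
    print(f"{rsc}: {verdict}")
```

With the listing above, broker-vips:1 is classified as leaving early and broker-vip2 as leaving at the expected point.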

I also tried disabling migration for broker-vip2, and in that case it was stopped too early as well.

So there are four cases, and only one of them gives the expected result:
*) globally-unique clone, migration disabled - early stop
*) globally-unique clone, migration enabled  - early stop
*) ordinary resource, migration disabled     - early stop
*) ordinary resource, migration enabled      - stop at the expected point

The question is:

Is it strictly impossible to make non-migratable resources behave the same way as that migratable broker-vip2?

(I'm pretty sure I didn't mix up any details, but I want to recheck everything once more.)

Best,
Vladislav

> 
> Better to look at why broker-vips:1 needed to be moved.
> 
>>
>> Like:
>> ===
>> * Pseudo action:   cl-ctdb_start_0
>> * Resource action: ctdb            start on c-pa-1
>> * Pseudo action:   cl-ctdb_running_0
>> * Pseudo action:   cl-broker_start_0
>> * Resource action: ctdb            monitor=10000 on c-pa-1
>> * Resource action: broker          start on c-pa-1
>> * Pseudo action:   cl-broker_running_0
>> * Pseudo action:   cl-broker-vips_start_0
>> * Resource action: broker          monitor=10000 on c-pa-1
>> * Pseudo action:   cl-broker-vips_stop_0
>> * Resource action: broker-vips:1   stop on c-pa-0
>> * Pseudo action:   cl-broker-vips_stopped_0
>> * Resource action: broker-vips:1   start on c-pa-1
>> * Pseudo action:   cl-broker-vips_running_0
>> * Resource action: broker-vips:1   monitor=30000 on c-pa-1
>> ===
>> That would be a great optimization toward five nines...
>>
>> Best,
>> Vladislav
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org