[Pacemaker] colocation that doesn't

Alan Jones falancluster at gmail.com
Fri Nov 5 13:45:38 EDT 2010


The attached patch shows how I debugged this problem.
First enable pengine debug prints by changing the argument to
crm_log_init from LOG_INFO to LOG_DEBUG_6.
Then you will notice that filter_colocation_constraint() drops the
colocation rule because myprim has an "Unknown" role.
I then modified the print in rsc_colocation_new() to display the role
and confirmed the colocation object was created that way.
While I was there I noticed the code above the print that converts
"Started" to "Unknown".
I'm not sure what will break now that I've removed this code, but that is
what I'm testing with.
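To make the conversion concrete, here is a minimal sketch of the pattern
I mean; the identifiers below are illustrative only, not the actual
Pacemaker code:

#include <stdio.h>
#include <string.h>

/* Illustrative sketch only, not Pacemaker's actual identifiers: a
 * colocation role of "Started" is collapsed to "Unknown" (i.e. no
 * specific role), which later makes filter_colocation_constraint()
 * drop the rule. */
static const char *normalize_role(const char *role)
{
    if (role != NULL && strcmp(role, "Started") == 0) {
        return "Unknown";
    }
    return role;
}

int main(void)
{
    printf("%s\n", normalize_role("Started")); /* prints "Unknown" */
    printf("%s\n", normalize_role("Master"));  /* prints "Master" */
    return 0;
}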
Alan

myprim  (ocf::pacemaker:DummySlow):     Started node6.acme.com
 Master/Slave Set: mystateful-ms
     Masters: [ node5.acme.com ]
     Slaves: [ node6.acme.com ]

On Fri, Nov 5, 2010 at 1:31 AM, Pavlos Parissis
<pavlos.parissis at gmail.com> wrote:
> On 5 November 2010 04:07, Vadym Chepkov <vchepkov at gmail.com> wrote:
>>
>> On Nov 4, 2010, at 12:53 PM, Alan Jones wrote:
>>
>> > If I understand you correctly, the role of the second resource in the
>> > colocation command was defaulting to that of the first ("Master"), which
>> > is not defined, or is untested, for non-ms resources.
>> > Unfortunately, after changing that line to:
>> >
>> > colocation mystateful-ms-loc inf: mystateful-ms:Master myprim:Started
>> >
>> > ...it still doesn't work:
>> >
>> > myprim  (ocf::pacemaker:DummySlow):     Started node6.acme.com
>> > Master/Slave Set: mystateful-ms
>> >     Masters: [ node5.acme.com ]
>> >     Slaves: [ node6.acme.com ]
>> >
>> > And after:
>> > location myprim-loc myprim -inf: node5.acme.com
>> >
>> > myprim  (ocf::pacemaker:DummySlow):     Started node6.acme.com
>> > Master/Slave Set: mystateful-ms
>> >     Masters: [ node6.acme.com ]
>> >     Slaves: [ node5.acme.com ]
>> >
>> > What I would like to do is enable logging for the code that calculates
>> > the weights, etc.
>> > It is obvious to me that the weights are calculated differently for
>> > mystateful-ms based on the weights used in myprim.
>> > Can you enable more verbose logging online or do you have to recompile?
>> > My version is 1.0.9-89bd754939df5150de7cd76835f98fe90851b677 which is
>> > different from Vadym's.
>> > BTW: Is there another release planned for the stable branch?  1.0.9.1
>> > is now 4 months old.
>> > I understand that I could take the top of tree, but I would like to
>> > believe that others are running the same version. ;)
>> > Thank you!
>> > Alan
>> >
>> > On Thu, Nov 4, 2010 at 8:22 AM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
>> >> Hi,
>> >>
>> >> On Thu, Nov 04, 2010 at 06:51:59AM -0400, Vadym Chepkov wrote:
>> >>> On Thu, Nov 4, 2010 at 5:37 AM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
>> >>>
>> >>>> This should be:
>> >>>>
>> >>>> colocation mystateful-ms-loc inf: mystateful-ms:Master myprim:Started
>> >>>>
>> >>>
>> >>> Interesting, so in this case it is not necessary?
>> >>>
>> >>> colocation fs_on_drbd inf: WebFS WebDataClone:Master
>> >>> (taken from Cluster_from_Scratch)
>> >>>
>> >>> but other way around it is?
>> >>
>> >> Yes, the role of the second resource defaults to the role of the
>> >> first, so "mystateful-ms:Master myprim" is read as
>> >> "mystateful-ms:Master myprim:Master". Ditto for order and actions.
>> >> A bit confusing, I know.
>> >>
>> >> Thanks,
>> >>
>> >> Dejan
>> >>
>>
>>
>> I did it a bit differently this time and I observed the same anomaly.
>>
>> First I started stateful clone
>>
>> primitive s1 ocf:pacemaker:Stateful
>> ms ms1 s1 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>>
>> Then a primitive:
>>
>> primitive d1 ocf:pacemaker:Dummy
>>
>> Made sure the Master and the primitive were running on different hosts:
>> location ld1 d1 10: xen-12
>>
>> and then I added the constraint:
>> colocation c1 inf: ms1:Master d1:Started
>>
>>  Master/Slave Set: ms1
>>     Masters: [ xen-11 ]
>>     Slaves: [ xen-12 ]
>>  d1     (ocf::pacemaker:Dummy): Started xen-12
>>
>>
>> It seems a colocation constraint is not enough to promote a clone. Looks like a bug.
>>
>> # ptest -sL|grep s1
>> clone_color: ms1 allocation score on xen-11: 0
>> clone_color: ms1 allocation score on xen-12: 0
>> clone_color: s1:0 allocation score on xen-11: 11
>> clone_color: s1:0 allocation score on xen-12: 0
>> clone_color: s1:1 allocation score on xen-11: 0
>> clone_color: s1:1 allocation score on xen-12: 6
>> native_color: s1:0 allocation score on xen-11: 11
>> native_color: s1:0 allocation score on xen-12: 0
>> native_color: s1:1 allocation score on xen-11: -1000000
>> native_color: s1:1 allocation score on xen-12: 6
>> s1:0 promotion score on xen-11: 20
>> s1:1 promotion score on xen-12: 20
>>
>> Vadym
>>
>>
>
> I have seen the same symptom when I used crm resource move to fail over
> a group that has colocation constraints with an ms resource.
> In my case, I had to recreate the configuration and then the move worked!
> See here http://developerbugs.linux-foundation.org/show_bug.cgi?id=2500
>
> Pavlos
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Pacemaker-1-0-74392a28b7f3.patch
Type: text/x-patch
Size: 1790 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20101105/99b16c6c/attachment-0003.bin>

