[Pacemaker] A question and demand to a resource placement strategy function

Wed Jul 20 06:13:07 EDT 2011

Hi, Andrew

I confirmed that the problem has been fixed.

Many thanks!!
Yuusuke

(2011/07/19 10:42), Andrew Beekhof wrote:
> This should also now be fixed in:
>     http://hg.clusterlabs.org/pacemaker/devel/rev/960a7e3da680
>
> On Tue, Jul 5, 2011 at 9:43 PM, Yuusuke IIDA <iidayuus at intellilink.co.jp> wrote:
>> Hi, Andrew
>>
>> I know that "pengine" contains the following logic.
>>
>> # cat pengine/utils.c
>> [snip]
>>     /* now try to balance resources across the cluster */
>>     if(node1->details->num_resources
>>        < node2->details->num_resources) {
>>         do_crm_log_unlikely(level, "%s (%d) < %s (%d) : resources",
>>                             node1->details->uname, node1->details->num_resources,
>>                             node2->details->uname, node2->details->num_resources);
>>         return -1;
>>
>>     } else if(node1->details->num_resources
>>               > node2->details->num_resources) {
>>         do_crm_log_unlikely(level, "%s (%d) > %s (%d) : resources",
>>                             node1->details->uname, node1->details->num_resources,
>>                             node2->details->uname, node2->details->num_resources);
>>         return 1;
>>     }
>>
>> This logic gives priority to the node that is running the fewest
>> resources.
>> It takes effect regardless of the "placement-strategy" setting.
>> That is my understanding.
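
The balancing rule described above can be sketched as a small standalone
comparator. This is only an illustration, not the pengine code: the struct
and the sketch_* function names are hypothetical, and the ordering (score
first, then resource count, then node name) is an assumption made so the
example is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a cluster node (not Pacemaker's node_t). */
struct sketch_node {
    const char *uname;   /* node name */
    int weight;          /* placement score for the resource being allocated */
    int num_resources;   /* resources already allocated to this node */
};

/* Returns <0 if a is preferred, >0 if b is preferred, 0 if they are equal.
 * Assumed order: higher score first, then fewer resources, then name. */
static int sketch_compare_nodes(const struct sketch_node *a,
                                const struct sketch_node *b)
{
    if (a->weight != b->weight) {
        return (a->weight > b->weight) ? -1 : 1;
    }
    /* balance resources across the cluster: fewer resources wins */
    if (a->num_resources != b->num_resources) {
        return (a->num_resources < b->num_resources) ? -1 : 1;
    }
    return strcmp(a->uname, b->uname);
}

/* Pick the preferred node and record the allocation on it, so that the
 * next call sees the updated resource count. */
static struct sketch_node *sketch_allocate(struct sketch_node *nodes, size_t n)
{
    assert(nodes != NULL && n > 0);
    struct sketch_node *best = &nodes[0];
    for (size_t i = 1; i < n; i++) {
        if (sketch_compare_nodes(&nodes[i], best) < 0) {
            best = &nodes[i];
        }
    }
    best->num_resources++;
    return best;
}
```

With two standby nodes of equal score, three successive calls to
sketch_allocate() pick sby1, sby2, sby1, because the count is updated after
every allocation. The sketch does not reproduce the failing case.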
>>
>> This logic works as expected in the following case.
>> Order in which the resources failed: rsc1 -> rsc2 -> rsc3
>>
>> Online: [ act1 act2 act3 sby2 sby1 ]
>>
>> Full list of resources:
>>
>> rsc1    (ocf::pacemaker:Dummy): Started sby1
>> rsc2    (ocf::pacemaker:Dummy): Started sby2
>> rsc3    (ocf::pacemaker:Dummy): Started sby1
>>
>> Failed actions:
>>     rsc1_monitor_5000 (node=act1, call=6, rc=7, status=complete): not running
>>     rsc2_monitor_5000 (node=act2, call=6, rc=7, status=complete): not running
>>     rsc3_monitor_5000 (node=act3, call=6, rc=7, status=complete): not running
>>
>> However, in the following case it does not work as expected (this is the
>> problem).
>> Order in which the resources failed: rsc3 -> rsc2 -> rsc1
>>
>> Online: [ act1 act2 act3 sby2 sby1 ]
>>
>> Full list of resources:
>>
>> rsc1    (ocf::pacemaker:Dummy): Started sby1
>> rsc2    (ocf::pacemaker:Dummy): Started sby1
>> rsc3    (ocf::pacemaker:Dummy): Started sby1
>>
>> Failed actions:
>>     rsc1_monitor_5000 (node=act1, call=6, rc=7, status=complete): not running
>>     rsc2_monitor_5000 (node=act2, call=6, rc=7, status=complete): not running
>>     rsc3_monitor_5000 (node=act3, call=6, rc=7, status=complete): not running
>>
>> This problem is fixed by the change that Yan made, but that change is not
>> applied under the "default" setting.
>> I would like the change to apply to the "default" setting as well.
>> I would also like the same change to be applied to Pacemaker-1.0.
>>
>> However, I would like to think this over once more, because even with
>> those changes included there are still the group resource problem that I
>> am currently reporting and a colocation problem that I have not yet
>> reported.
>>
>> I attach the crm_report from the run that shows the problem.
>>
>> Best Regards,
>> Yuusuke IIDA
>>
>> (2011/07/05 13:34), Andrew Beekhof wrote:
>>>
>>> On Thu, Jun 2, 2011 at 4:59 PM, Gao,Yan <ygao at novell.com> wrote:
>>>>
>>>> On 06/01/11 18:51, Yuusuke IIDA wrote:
>>>>>
>>>>> Hi, Yan
>>>>>
>>>>> I am very sorry for the late reply.
>>>>>
>>>>> (2011/05/13 15:06), Gao,Yan wrote:
>>>>>>
>>>>>> I understand that you think the improvement for the non-default
>>>>>> placement strategy makes sense to the "default" too. Though the
>>>>>> "default" is somewhat intended not to be affected by any "placement
>>>>>> strategy" so that the behaviors of existing pengine test cases and
>>>>>> users' deployments remain unchanged.
>>>>>
>>>>> I think that the function that spreads resources according to the
>>>>> number of started resources has a problem under the "default" setting.
>>>>>
>>>>> The Pacemaker-1.0 series shows the same behavior.
>>>>> If this change resolves the problem, I thought it could also be
>>>>> applied to Pacemaker-1.0.
>>>>>
>>>>> Shouldn't this problem be fixed?
>>>>
>>>> This would affect dozens of existing regression tests, although most of
>>>> the changes are just the scores of clone instances, which are due to
>>>> different resource allocating orders. Given 1.0 is in such a maintenance
>>>> state, I'm not sure we should do that for 1.0.
>>>>
>>>> Andrew, what do you think about it? Perhaps we should fix the
>>>> resource-number-balancing for "default" strategy in 1.1 at least?
>>>
>>> I think for 1.1 we can do something, I'd just like to understand the
>>> implications of the patch.
>>> It would help if there was a testcase that illustrated the negative
>>> behaviour.
>>>
>>> Is it necessary that both parts of the old if-block are always run?
>>>
>>>>
>>>>>
>>>>>>
>>>>>> For "utilization" strategy, load-balancing is still done based on the
>>>>>> number of resources allocated to a node. That might be a choice.
>>>>>>
>>>>> When I use the "utilization" setting in Pacemaker-1.1 without setting
>>>>> any capacity, I get the expected behavior!
>>>>>
>>>>> Best Regards,
>>>>> Yuusuke IIDA
>>>>>
>>>>
>>>> Regards,
>>>>   Yan
>>>> --
>>>> Gao,Yan<ygao at novell.com>
>>>> Software Engineer
>>>> China Server Team, SUSE.
>>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs:
>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>
>>
>> --
>> ----------------------------------------
>> METRO SYSTEMS CO., LTD
>>
>> Yuusuke Iida
>> Mail: iidayuus at intellilink.co.jp
>> ----------------------------------------
>

-- 
----------------------------------------
METRO SYSTEMS CO., LTD

Yuusuke Iida
Mail: iidayuus at intellilink.co.jp
----------------------------------------