[Pacemaker] A question and demand to a resource placement strategy function

Tue Jul 5 07:43:28 EDT 2011

Hi, Andrew

I know that "pengine" contains the following processing.

# cat pengine/utils.c
[snip]
    /* now try to balance resources across the cluster */
    if(node1->details->num_resources
       < node2->details->num_resources) {
        do_crm_log_unlikely(level, "%s (%d) < %s (%d) : resources",
                            node1->details->uname, node1->details->num_resources,
                            node2->details->uname, node2->details->num_resources);
        return -1;

    } else if(node1->details->num_resources
              > node2->details->num_resources) {
        do_crm_log_unlikely(level, "%s (%d) > %s (%d) : resources",
                            node1->details->uname, node1->details->num_resources,
                            node2->details->uname, node2->details->num_resources);
        return 1;
    }

This processing gives priority to the node that is running fewer resources.
It is applied regardless of the "placement-strategy" setting.
That is my understanding.
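To make the comparison above concrete, here is a minimal, self-contained sketch of the same idea: a comparator that ranks the node with fewer resources first. The struct toy_node and the functions compare_by_resource_count and sort_candidates are hypothetical names used only for illustration; they are not Pacemaker's types or API, and the real function considers other criteria before it reaches this step.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a cluster node (not Pacemaker's node_t). */
struct toy_node {
    const char *uname;
    int num_resources;
};

/* qsort comparator: the node running fewer resources sorts first. */
static int compare_by_resource_count(const void *a, const void *b)
{
    const struct toy_node *node1 = a;
    const struct toy_node *node2 = b;

    if (node1->num_resources < node2->num_resources) {
        return -1;
    } else if (node1->num_resources > node2->num_resources) {
        return 1;
    }
    return 0; /* equal load: no preference from this criterion */
}

/* Sort the candidate nodes; afterwards nodes[0] is the least loaded one. */
static void sort_candidates(struct toy_node *nodes, size_t n)
{
    assert(nodes != NULL);
    qsort(nodes, n, sizeof(nodes[0]), compare_by_resource_count);
}
```

Calling sort_candidates() on an array of candidates leaves the least loaded node in the first element. The result only reflects the current load if num_resources is up to date at the moment the comparison runs.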

This processing works as expected in the following case.
Order in which the resources failed : rsc1 -> rsc2 -> rsc3

Online: [ act1 act2 act3 sby2 sby1 ]

Full list of resources:

rsc1    (ocf::pacemaker:Dummy): Started sby1
rsc2    (ocf::pacemaker:Dummy): Started sby2
rsc3    (ocf::pacemaker:Dummy): Started sby1

Failed actions:
     rsc1_monitor_5000 (node=act1, call=6, rc=7, status=complete): not running
     rsc2_monitor_5000 (node=act2, call=6, rc=7, status=complete): not running
     rsc3_monitor_5000 (node=act3, call=6, rc=7, status=complete): not running

However, in the following case it does not work as expected (this is the problem).
Order in which the resources failed : rsc3 -> rsc2 -> rsc1

Online: [ act1 act2 act3 sby2 sby1 ]

Full list of resources:

rsc1    (ocf::pacemaker:Dummy): Started sby1
rsc2    (ocf::pacemaker:Dummy): Started sby1
rsc3    (ocf::pacemaker:Dummy): Started sby1

Failed actions:
     rsc1_monitor_5000 (node=act1, call=6, rc=7, status=complete): not running
     rsc2_monitor_5000 (node=act2, call=6, rc=7, status=complete): not running
     rsc3_monitor_5000 (node=act3, call=6, rc=7, status=complete): not running

This problem is fixed by the correction that Yan made, but that correction is
not applied to the "default" setting.
I would like this correction to be applied to the "default" setting as well.
I would also like the same correction to be applied to Pacemaker-1.0.
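As an illustration of the balancing I expect, here is a toy sketch in which the resource count of a node is updated after every placement, so that the next choice sees the current load. It is not Pacemaker's allocation code. The names toy_node, pick_least_loaded and place_all are hypothetical, and the sketch assumes that only the standby nodes are eligible and that a tie goes to the node listed first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cluster node (not Pacemaker's node_t). */
struct toy_node {
    const char *uname;
    int num_resources;
};

/* Return the index of the node with the fewest resources.
 * Assumption: on a tie, the node listed first wins. */
static size_t pick_least_loaded(const struct toy_node *nodes, size_t n_nodes)
{
    size_t best = 0;

    assert(n_nodes > 0);
    for (size_t i = 1; i < n_nodes; i++) {
        if (nodes[i].num_resources < nodes[best].num_resources) {
            best = i;
        }
    }
    return best;
}

/* Place n_rsc resources one at a time. The count is incremented right after
 * each placement, so every later decision is made against the updated load.
 * placement[r] receives the index of the node chosen for resource r. */
static void place_all(struct toy_node *nodes, size_t n_nodes,
                      size_t *placement, size_t n_rsc)
{
    for (size_t r = 0; r < n_rsc; r++) {
        size_t chosen = pick_least_loaded(nodes, n_nodes);

        nodes[chosen].num_resources++;
        placement[r] = chosen;
    }
}
```

With two empty nodes listed as sby1, sby2 and three resources, place_all() chooses sby1, sby2, sby1, which is the distribution shown in the first status output above.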

However, I would like to think this over once more, because even with those
corrections included there are still the group resource problem that I am
currently reporting and a colocation problem that I have not yet reported.

I have attached the crm_report for the case that shows the problem.

Best Regards,
Yuusuke IIDA

(2011/07/05 13:34), Andrew Beekhof wrote:
> On Thu, Jun 2, 2011 at 4:59 PM, Gao,Yan<ygao at novell.com>  wrote:
>> On 06/01/11 18:51, Yuusuke IIDA wrote:
>>> Hi, Yan
>>>
>>> I am really sorry for the slow reply.
>>>
>>> (2011/05/13 15:06), Gao,Yan wrote:
>>>> I understand that you think the improvement for the non-default
>>>> placement strategy makes sense to the "default" too. Though the
>>>> "default" is somewhat intended not to be affected by any "placement
>>>> strategy" so that the behaviors of existing pengine test cases and
>>>> users' deployments remain unchanged.
>>> I think that the function which distributes resources according to the
>>> number of started resources has a problem under the "default" setting.
>>>
>>> The Pacemaker-1.0 series shows the same behavior.
>>> If this correction settles the problem, I thought it could also be
>>> applied to Pacemaker-1.0.
>>>
>>> Shouldn't this problem be fixed?
>> This would affect dozens of existing regression tests, although most of
>> the changes are just the scores of clone instances, which are due to
>> different resource allocating orders. Given 1.0 is in such a maintenance
>> state, I'm not sure we should do that for 1.0.
>>
>> Andrew, what do you think about it? Perhaps we should fix the
>> resource-number-balancing for "default" strategy in 1.1 at least?
>
> I think for 1.1 we can do something, I'd just like to understand the
> implications of the patch.
> It would help if there was a testcase that illustrated the negative behaviour.
>
> Is it necessary that both parts of the old if-block are always run?
>
>>
>>>
>>>>
>>>> For "utilization" strategy, load-balancing is still done based on the
>>>> number of resources allocated to a node. That might be a choice.
>>>>
>>> When I do not set any capacity under the "utilization" setting in
>>> Pacemaker-1.1, I get the expected behavior!
>>>
>>> Best Regards,
>>> Yuusuke IIDA
>>>
>>
>> Regards,
>>   Yan
>> --
>> Gao,Yan<ygao at novell.com>
>> Software Engineer
>> China Server Team, SUSE.
>>
>

-- 
----------------------------------------
METRO SYSTEMS CO., LTD

Yuusuke Iida
Mail: iidayuus at intellilink.co.jp
----------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pcmk-Tue-05-Jul-2011.tar.bz2
Type: application/octet-stream
Size: 372149 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20110705/655f7b94/attachment-0003.obj>