[ClusterLabs] Antw: Re: Q: Resource balancing operation

Thu Apr 21 02:56:20 EDT 2016

>>> Ken Gaillot <kgaillot at redhat.com> wrote on 20.04.2016 at 16:44 in message
<571795E5.4090004 at redhat.com>:
> On 04/20/2016 01:17 AM, Ulrich Windl wrote:
>> Hi!
>> 
>> I'm wondering: If you boot a node on a cluster, most resources will go to
>> another node (if possible). Due to the stickiness configured, those
>> resources will stay there.
>> So I'm wondering whether or how I could cause a rebalance of resources on
>> the cluster. I must admit that I don't understand the details of how
>> stickiness relates to other parameters. In my understanding, stickiness
>> should be tied dynamically to a percentage of utilization, so that a
>> resource running on a node that is "almost full" would dynamically lower
>> its stickiness to allow resource migration.
>> 
>> So if you are going to implement a manual resource rebalance operation,
>> could you dynamically lower the stickiness for each resource (by some
>> amount or some factor), wait to see whether anything happens, and then
>> repeat the process until resources look balanced? "Looking balanced"
>> should be no worse than if all resources were started when all cluster
>> nodes are up.
>> 
>> Spontaneous pros and cons for "resource rebalancing"?
>> 
>> Regards,
>> Ulrich
> 
> Pacemaker gives you a few levers to pull. Stickiness and utilization
> attributes (with a placement strategy) are the main ones.
> 
> Normally, pacemaker *will* continually rebalance according to what nodes
> are available. Stickiness tells the cluster not to do that.
> 
> Whether you should use stickiness (and how much) depends mainly on how
> significant the interruption is when a service is moved. For

We agree on this: What I was asking for was a "manually triggered automatic rebalance" that would temporarily override the stickiness parameters set for the resources. Manually means people will blame me, not the cluster (at least for starting the operation ;-)) if something bad happens.

> a large database supporting a high-traffic website, stopping and
> starting can take a long time and cost a lot of business -- so maybe you
> want an infinite stickiness in that case, and only rebalance manually
> during a scheduled window. For a small VM that can live-migrate quickly
> and doesn't affect any of your customer-facing services, maybe you don't
> mind setting a small or zero stickiness.
> 
> You can also use rules to make the process intelligent. For example, for
> a server that provides office services, you could set a rule that sets
> infinite stickiness during business hours, and small or zero stickiness
> otherwise. That way, you'd get no disruptions when people are actually
> using the service during the day, and at night, it would automatically
> rebalance.

Could you give a concrete example for this?
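
To make clear what kind of example is meant, here is a minimal Python sketch of the behavior described above. It is only a model of the idea, not Pacemaker configuration: the function name, the 9-to-17 weekday window and the after-hours value of 0 are assumptions chosen for illustration. The only Pacemaker fact used is that a score of 1,000,000 is treated as INFINITY. In a real cluster this would presumably be written as a time-based rule on the resource-stickiness meta-attribute.

```python
from datetime import datetime

# Pacemaker treats a score of 1,000,000 as INFINITY.
INFINITY = 1_000_000


def stickiness_at(now, start_hour=9, end_hour=17, after_hours=0):
    """Model of a time-based stickiness rule (hypothetical helper).

    Returns INFINITY on Monday to Friday between start_hour (inclusive)
    and end_hour (exclusive), and after_hours at any other time.
    """
    is_weekday = now.isoweekday() <= 5          # 1 = Monday ... 7 = Sunday
    in_business_hours = start_hour <= now.hour < end_hour
    if is_weekday and in_business_hours:
        return INFINITY                          # never move during the day
    return after_hours                           # free to rebalance at night


if __name__ == "__main__":
    print(stickiness_at(datetime(2016, 4, 20, 10, 0)))  # a Wednesday morning
    print(stickiness_at(datetime(2016, 4, 20, 22, 0)))  # the same day at night
```

With such a rule the resource would be pinned during the day and could be moved by the cluster once the window closes.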

> 
> Normally, pacemaker's idea of "balancing" is to simply distribute the
> number of resources on each node as equally as possible. Utilization
> attributes and placement strategies let you add more intelligence. For
> example, you can define the number of cores per node or the amount of
> RAM per node, along with how much each resource is expected to use, and
> let pacemaker balance by that instead of just counting the number of
> resources.
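
The difference between counting resources and balancing by utilization can be illustrated with a small Python sketch. This is a toy greedy placement, not Pacemaker's actual algorithm; the node names, resource names and memory figures (in GB) are made up for the example.

```python
def place_by_utilization(resources, capacity):
    """Toy greedy placement by a single utilization attribute.

    resources: dict of resource name -> amount it is expected to use
    capacity:  dict of node name -> amount the node provides
    Each resource (largest first) goes to the node with the most free capacity.
    """
    free = dict(capacity)
    placement = {node: [] for node in capacity}
    for name, need in sorted(resources.items(), key=lambda kv: -kv[1]):
        node = max(free, key=free.get)           # node with most room left
        if free[node] < need:
            raise ValueError(f"no node has room for {name}")
        free[node] -= need
        placement[node].append(name)
    return placement


if __name__ == "__main__":
    # Hypothetical figures: RAM in GB.
    resources = {"db": 8, "web1": 2, "web2": 2, "web3": 2, "mail": 2}
    capacity = {"node1": 8, "node2": 8}
    print(place_by_utilization(resources, capacity))
```

Balancing by count alone would split these five resources 3/2, which would overcommit whichever node ends up with the database. Balancing by memory puts the database alone on one node and the four small resources on the other.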

Knew that; I was specifically talking about the imbalance that occurs after one node has been down for service: If capacity allows, the remaining nodes will run all the services of the downed node, and those services will stay there even after the node is up again.

Usually I want to avoid moving resources, e.g. when one resource goes down (or up), causing an imbalance, which in turn causes other resources to be moved. Especially if you know the resource will be back up (down) soon.

I guess it's not possible to delay the rebalancing effect.
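
The "manually triggered automatic rebalance" I have in mind can be sketched as a toy Python model. Everything here is an assumption made for illustration: the `pull` score with which an emptier node attracts a resource, the single stickiness value shared by all resources, and balancing by resource count only. A real implementation would have to change the stickiness in the cluster and wait for Pacemaker to settle instead of moving resources itself.

```python
def rebalance(placement, stickiness, step=10, pull=10):
    """Lower stickiness step by step until node counts differ by at most one.

    placement:  dict of node name -> list of resource names
    stickiness: starting stickiness shared by all resources
    step:       amount by which stickiness is lowered per round
    pull:       hypothetical score per unit of count difference that
                draws a resource towards the emptier node
    Returns (new placement, final stickiness, number of lowering rounds).
    """
    if pull <= 0 or step <= 0:
        raise ValueError("pull and step must be positive")
    placement = {node: list(rs) for node, rs in placement.items()}
    rounds = 0
    while True:
        busiest = max(placement, key=lambda n: len(placement[n]))
        idlest = min(placement, key=lambda n: len(placement[n]))
        gap = len(placement[busiest]) - len(placement[idlest])
        if gap <= 1:
            return placement, stickiness, rounds   # looks balanced
        if pull * gap > stickiness:
            # Attraction now outweighs stickiness: one resource moves.
            placement[idlest].append(placement[busiest].pop())
        else:
            # Nothing moved: lower stickiness and try again.
            stickiness = max(0, stickiness - step)
            rounds += 1


if __name__ == "__main__":
    start = {"node1": ["a", "b", "c", "d"], "node2": []}
    print(rebalance(start, stickiness=100))
```

After such a run the original stickiness would have to be restored, so that the resources stay where the rebalance put them.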

Regards,
Ulrich
