[ClusterLabs] Load balancing, of a sort

Andrei Borzenkov arvidjaar at gmail.com
Wed Jan 25 08:16:37 EST 2023


On Wed, Jan 25, 2023 at 3:49 PM Antony Stone
<Antony.Stone at ha.open.source.it> wrote:
>
> Hi.
>
> I have a corosync / pacemaker 3-node cluster with a resource group which can
> run on any node in the cluster.
>
> Every night a cron job on the node which is running the resources performs
> "crm_standby -v on" followed a short while later by "crm_standby -v off" in
> order to force the resources to migrate to another node member.
>
> We do this partly to verify that all nodes are capable of running the
> resources, and partly because some of those resources generate significant log
> files, and if one machine just keeps running them day after day, we run out of
> disk space (which effectively means we just need to add more capacity to the
> machines, which can be done, but at a cost).
>
> So long as a machine gets a day when it's not running the resources, a
> combination of migrating the log files to a central server, plus standard
> logfile rotation, takes care of managing the disk space.
>
> What I notice, though, is that two of the machines tend to swap the resources
> between them, and the third machine hardly ever becomes the active node.
>

Pacemaker simply checks each eligible node in turn to see whether it can
run the resource, and I believe the order of the node list does not change
(at least as long as there is no join/leave event). So effectively the
resource just oscillates between the first two nodes in the list.

> Is there some way of influencing the node selection mechanism when resources
> need to move away from the currently active node, so that, for example, the
> least recently used node could be favoured over the rest?
>

I do not think pacemaker even knows which node is "the least recently
used"; it does not keep this history. You can add a rule that defines a
location constraint based on some node attribute(s) and set this
attribute in the same script where you call crm_standby. E.g. you
could set a timestamp on the node where the resource is currently
active before doing crm_standby, and have the script select the node
with the oldest timestamp (I do not think pacemaker supports such
computation in its rules).
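
A rough sketch of the timestamp idea, in case it helps. This is only an
illustration under several assumptions, not a tested recipe: the attribute
name "last-active", the function names, and the use of "crm_resource --move"
in place of the standby toggle are all my own choices, and the script
assumes it is given the node names and the group name. The selection is
done in the script because, as said above, I do not think the rules can
compute "oldest" by themselves.

```python
#!/usr/bin/env python3
"""Sketch: move a resource group to the node that has been idle longest."""
import subprocess
import time

ATTR = "last-active"  # hypothetical node attribute name, not a Pacemaker built-in


def pick_least_recently_used(timestamps, exclude):
    """Return the node with the oldest timestamp, skipping `exclude`.

    `timestamps` maps node name -> epoch seconds (0 means "never active").
    Ties are broken by node name so the result is deterministic.
    """
    candidates = {n: t for n, t in timestamps.items() if n != exclude}
    if not candidates:
        raise ValueError("no candidate node left to move to")
    return min(candidates, key=lambda n: (candidates[n], n))


def read_timestamp(node):
    """Read ATTR for `node`; an unset or unreadable attribute counts as 0."""
    result = subprocess.run(
        ["crm_attribute", "--node", node, "--name", ATTR, "--query", "--quiet"],
        capture_output=True, text=True)
    value = result.stdout.strip()
    return int(value) if result.returncode == 0 and value.isdigit() else 0


def rotate(active, nodes, group):
    """Stamp the active node, then move `group` to the longest-idle node."""
    # Record "now" on the node that is about to give up the resources.
    subprocess.run(
        ["crm_attribute", "--node", active, "--name", ATTR,
         "--update", str(int(time.time()))], check=True)
    stamps = {node: read_timestamp(node) for node in nodes}
    target = pick_least_recently_used(stamps, exclude=active)
    # Assumption: an explicit move replaces the crm_standby on/off pair.
    subprocess.run(
        ["crm_resource", "--resource", group, "--move", "--node", target],
        check=True)
    return target
```

Note that "crm_resource --move" leaves a location constraint behind, so the
script would presumably need a later "crm_resource --resource <group> --clear"
to remove it, much as the current job turns standby off again.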

