[ClusterLabs] Q: placement-strategy=balanced

Ken Gaillot kgaillot at redhat.com
Mon Jan 18 13:20:28 EST 2021


On Fri, 2021-01-15 at 09:36 +0100, Ulrich Windl wrote:
> Hi!
> 
> The cluster I'm configuring (SLES15 SP2) fenced a node last night.
> Still unsure what exactly caused the fencing, but looking at the logs
> I found this "action plan" that led to fencing:
> 
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Move       prm_cron_snap_test-jeos1              ( h18 -> h19 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Move       prm_cron_snap_test-jeos2              ( h19 -> h16 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Move       prm_cron_snap_test-jeos3              ( h16 -> h18 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Move       prm_cron_snap_test-jeos4              ( h18 -> h19 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos1                    ( h18 -> h19 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos2                    ( h19 -> h16 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos3                    ( h16 -> h18 )
> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos4                    ( h18 -> h19 )
> 
> Those "cron_snap" resources depend on the corresponding xen resources
> (colocation).
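> For reference, each constraint is of this form (crm shell syntax; the
> constraint ID is illustrative):
> 
>     colocation col_cron_snap_test-jeos1 inf: \
>         prm_cron_snap_test-jeos1 prm_xen_test-jeos1
> 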
> Having four resources to distribute equally across three nodes seems
> to trigger that problem.
> 
> After fencing the action plan was:
> 
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Move       prm_cron_snap_test-jeos2              ( h16 -> h19 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Move       prm_cron_snap_test-jeos4              ( h19 -> h16 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Start      prm_cron_snap_test-jeos1              (             h18 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Start      prm_cron_snap_test-jeos3              (             h19 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Recover    prm_xen_test-jeos1                    ( h19 -> h18 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos2                    ( h16 -> h19 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos3                    ( h18 -> h19 )
> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  * Migrate    prm_xen_test-jeos4                    ( h19 -> h16 )
> 
> ...some more recovery actions like that...
> 
> Currently h18 has two VMs, while the other two nodes have one VM
> each.
> 
> Before I added those "cron_snap" resources, I did not see such
> "rebalancing".
> 
> The rebalancing was triggered by this ruleset present in every xen
> resource:
> 
>         meta 1: resource-stickiness=0 \
>         meta 2: rule 0: date spec hours=7-19 weekdays=1-5 resource-stickiness=1000
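> 
> In context, each Xen primitive looks roughly like this (crm shell
> syntax; params and paths are illustrative, the utilization values
> match the scores shown below):
> 
>     primitive prm_xen_test-jeos1 ocf:heartbeat:Xen \
>         params xmfile="/etc/xen/vm/test-jeos1" \
>         utilization utl_ram=2048 utl_cpu=20 \
>         meta 1: resource-stickiness=0 \
>         meta 2: rule 0: date spec hours=7-19 weekdays=1-5 resource-stickiness=1000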
> 
> At the moment the related scores (crm_simulate -LUs) look like this
> (filtered and re-ordered):
> 
> Original: h16 capacity: utl_ram=231712 utl_cpu=440
> Original: h18 capacity: utl_ram=231712 utl_cpu=440
> Original: h19 capacity: utl_ram=231712 utl_cpu=440
> 
> Remaining: h16 capacity: utl_ram=229664 utl_cpu=420
> Remaining: h18 capacity: utl_ram=227616 utl_cpu=400
> Remaining: h19 capacity: utl_ram=229664 utl_cpu=420
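> 
> The node capacities come from utilization attributes defined on each
> node, combined with placement-strategy=balanced (crm shell syntax,
> shown for one node):
> 
>     node h16 utilization utl_ram=231712 utl_cpu=440
>     property placement-strategy=balanced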
> 
> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h16: 0
> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h18: 1000
> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h19: -INFINITY
> native_assign_node: prm_xen_test-jeos1 utilization on h18: utl_ram=2048 utl_cpu=20
> 
> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h16: 0
> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h18: 1000
> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h19: 0
> native_assign_node: prm_xen_test-jeos2 utilization on h18: utl_ram=2048 utl_cpu=20
> 
> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h16: 0
> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h18: 0
> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h19: 1000
> native_assign_node: prm_xen_test-jeos3 utilization on h19: utl_ram=2048 utl_cpu=20
> 
> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h16: 1000
> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h18: 0
> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h19: 0
> native_assign_node: prm_xen_test-jeos4 utilization on h16: utl_ram=2048 utl_cpu=20
> 
> Does that ring-shifting of resources look like a bug in Pacemaker?
> 
> Regards,
> Ulrich

From the above it's not apparent why fencing was needed.

It makes sense that things would move once the time-based rule kicked
in. Some event likely happened during the day that made the move
preferable, which might be difficult to find in the logs. Taking a few
random scheduler inputs from the day and simulating them with the time
set to the evening might help narrow down when it happened. Without
knowing what happened, it's hard to say if the move makes sense.
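
For example, assuming the saved inputs are in the default location
(the file name here is just a placeholder):

    crm_simulate --xml-file /var/lib/pacemaker/pengine/pe-input-123.bz2 \
        --set-datetime 2021-01-14T20:00:00 \
        --show-scores --show-utilization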
-- 
Ken Gaillot <kgaillot at redhat.com>


