[ClusterLabs] Re: [EXT] Re: Q: placement-strategy=balanced

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Jan 19 02:11:01 EST 2021


>>> Ken Gaillot <kgaillot at redhat.com> wrote on 18.01.2021 at 19:20 in
message
<06d171c5d33bcb20af71d534a94ce26a56bdd530.camel at redhat.com>:
> On Fri, 2021-01-15 at 09:36 +0100, Ulrich Windl wrote:
>> Hi!
>> 
>> The cluster I'm configuring (SLES15 SP2) fenced a node last night.
>> Still unsure what exactly caused the fencing, but looking at the logs
>> I found this "action plan" that led to fencing:
>> 
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Move       prm_cron_snap_test-jeos1              ( h18 -> h19 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Move       prm_cron_snap_test-jeos2              ( h19 -> h16 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Move       prm_cron_snap_test-jeos3              ( h16 -> h18 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Move       prm_cron_snap_test-jeos4              ( h18 -> h19 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos1                    ( h18 -> h19 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos2                    ( h19 -> h16 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos3                    ( h16 -> h18 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos4                    ( h18 -> h19 )
>> 
>> Those "cron_snap" resources depend on the corresponding xen resources
>> (colocation).
>> Having four resources that have to be distributed evenly across three
>> nodes seems to trigger the problem.
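>> 
>> Each "cron_snap" is tied to its VM roughly like this (a sketch; the
>> actual constraint IDs in my config differ):
>> 
>>         colocation col_cron_snap_test-jeos1 inf: \
>>                 prm_cron_snap_test-jeos1 prm_xen_test-jeos1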
>> 
>> After fencing the action plan was:
>> 
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Move       prm_cron_snap_test-jeos2              ( h16 -> h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Move       prm_cron_snap_test-jeos4              ( h19 -> h16 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Start      prm_cron_snap_test-jeos1              (             h18 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Start      prm_cron_snap_test-jeos3              (             h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Recover    prm_xen_test-jeos1                    ( h19 -> h18 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos2                    ( h16 -> h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos3                    ( h18 -> h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]:  notice:  *
>> Migrate    prm_xen_test-jeos4                    ( h19 -> h16 )
>> 
>> ...some more recovery actions like that...
>> 
>> Currently h18 has two VMs, while the other two nodes have one VM
>> each.
>> 
>> Before adding those "cron_snap" resources, I did not observe such
>> "rebalancing".
>> 
>> The rebalancing was triggered by this ruleset present in every xen
>> resource:
>> 
>>         meta 1: resource-stickiness=0 \
>>         meta 2: rule 0: date spec hours=7-19 weekdays=1-5 \
>>                 resource-stickiness=1000
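>> 
>> i.e. stickiness is 1000 during business hours (Mon-Fri, 07:00-19:59)
>> and falls back to 0 outside them, so the cluster is free to rebalance
>> in the evening. In CIB XML the second block corresponds roughly to
>> this (IDs shortened here):
>> 
>>         <meta_attributes id="...-meta-2">
>>           <rule id="...-rule" score="0">
>>             <date_expression id="...-expr" operation="date_spec">
>>               <date_spec id="...-spec" hours="7-19" weekdays="1-5"/>
>>             </date_expression>
>>           </rule>
>>           <nvpair id="...-nv" name="resource-stickiness" value="1000"/>
>>         </meta_attributes>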
>> 
>> At the moment the related scores (crm_simulate -LUs) look like this
>> (filtered and re-ordered):
>> 
>> Original: h16 capacity: utl_ram=231712 utl_cpu=440
>> Original: h18 capacity: utl_ram=231712 utl_cpu=440
>> Original: h19 capacity: utl_ram=231712 utl_cpu=440
>> 
>> Remaining: h16 capacity: utl_ram=229664 utl_cpu=420
>> Remaining: h18 capacity: utl_ram=227616 utl_cpu=400
>> Remaining: h19 capacity: utl_ram=229664 utl_cpu=420
>> 
>> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h16: 0
>> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h18:
>> 1000
>> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h19:
>> -INFINITY
>> native_assign_node: prm_xen_test-jeos1 utilization on h18:
>> utl_ram=2048 utl_cpu=20
>> 
>> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h16: 0
>> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h18:
>> 1000
>> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h19: 0
>> native_assign_node: prm_xen_test-jeos2 utilization on h18:
>> utl_ram=2048 utl_cpu=20
>> 
>> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h16: 0
>> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h18: 0
>> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h19:
>> 1000
>> native_assign_node: prm_xen_test-jeos3 utilization on h19:
>> utl_ram=2048 utl_cpu=20
>> 
>> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h16:
>> 1000
>> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h18: 0
>> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h19: 0
>> native_assign_node: prm_xen_test-jeos4 utilization on h16:
>> utl_ram=2048 utl_cpu=20
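>> 
>> For reference, the utilization setup behind those numbers looks
>> roughly like this (a sketch; only one VM shown, and the RA name is
>> an assumption):
>> 
>>         node h16 utilization utl_ram=231712 utl_cpu=440
>>         # params omitted
>>         primitive prm_xen_test-jeos1 ocf:heartbeat:Xen \
>>                 utilization utl_ram=2048 utl_cpu=20
>>         property placement-strategy=balanced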
>> 
>> Does that ring-shifting of resources look like a bug in Pacemaker?
>> 
>> Regards,
>> Ulrich
> 
> From the above it's not apparent why fencing was needed.

Hi Ken!

Forget about the fencing; it was a "standard error": add the required
resources in the natural order, but forget the ordering constraints that
prevent the cluster from starting and stopping everything at the same
time ;-) Basically the missing dependency was that OCFS2 runs on an
MD-RAID. Stopping the RAID while OCFS2 is still mounted is not a good
idea...
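
The missing piece was essentially an order (and colocation) constraint
between the RAID and the file system, something like this (a sketch;
resource names made up, and in reality these are clones):

        order ord_raid_before_fs Mandatory: prm_raid_md0 prm_fs_ocfs2
        colocation col_fs_with_raid inf: prm_fs_ocfs2 prm_raid_md0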

> 
> It makes sense that things would move once the time-based rule kicked
> in. Some event likely happened during the day that made the move
> preferable, which might be difficult to find in the logs. Taking a few
> random scheduler inputs from the day and simulating them with the time
> set to the evening might help narrow down when it happened. Without
> knowing what happened, it's hard to say if the move makes sense.

I agree, but it still looked very much like a "ring shift", which made
me wonder.
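
Replaying one of the saved scheduler inputs with the clock forced into
the evening should show when the move becomes preferable, roughly like
this (the input file name is just a placeholder):

        crm_simulate -S -x pe-input-42.bz2 --set-datetime="2021-01-14T20:00:00"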

For a while now the cluster has not done anything like that; the
resources are stable (one VM on h16, two VMs on h18, one VM on h19).
However, in the end the cluster will have some more VMs.

Regards,
Ulrich

> -- 
> Ken Gaillot <kgaillot at redhat.com>
