[ClusterLabs] Antw: [EXT] Re: Q: placement-strategy=balanced
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Tue Jan 19 02:11:01 EST 2021
>>> Ken Gaillot <kgaillot at redhat.com> wrote on 18.01.2021 at 19:20 in
message
<06d171c5d33bcb20af71d534a94ce26a56bdd530.camel at redhat.com>:
> On Fri, 2021-01-15 at 09:36 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> The cluster I'm configuring (SLES15 SP2) fenced a node last night.
>> Still unsure what exactly caused the fencing, but looking at the logs
>> I found this "action plan" that led to fencing:
>>
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Move    prm_cron_snap_test-jeos1 ( h18 -> h19 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Move    prm_cron_snap_test-jeos2 ( h19 -> h16 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Move    prm_cron_snap_test-jeos3 ( h16 -> h18 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Move    prm_cron_snap_test-jeos4 ( h18 -> h19 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos1 ( h18 -> h19 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos2 ( h19 -> h16 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos3 ( h16 -> h18 )
>> Jan 14 20:05:12 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos4 ( h18 -> h19 )
>>
>> Those "cron_snap" resources depend on the corresponding xen resources
>> (colocation).
>> Having four resources to distribute evenly across three nodes seems to
>> trigger the problem.
>>
>> After fencing the action plan was:
>>
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Move    prm_cron_snap_test-jeos2 ( h16 -> h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Move    prm_cron_snap_test-jeos4 ( h19 -> h16 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Start   prm_cron_snap_test-jeos1 ( h18 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Start   prm_cron_snap_test-jeos3 ( h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Recover prm_xen_test-jeos1 ( h19 -> h18 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos2 ( h16 -> h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos3 ( h18 -> h19 )
>> Jan 14 20:05:26 h19 pacemaker-schedulerd[4803]: notice: * Migrate prm_xen_test-jeos4 ( h19 -> h16 )
>>
>> ...some more recovery actions like that...
>>
>> Currently h18 has two VMs, while the other two nodes have one VM
>> each.
>>
>> Before I added those "cron_snap" resources, I had not observed
>> such "rebalancing".
>>
>> The rebalancing was triggered by this ruleset present in every xen
>> resource:
>>
>> meta 1: resource-stickiness=0 \
>> meta 2: rule 0: date spec hours=7-19 weekdays=1-5 resource-stickiness=1000
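>>
>> (For context, each xen resource is defined roughly like this; the params are
>> trimmed here and the utilization values are the ones reported below:)
>>
>> primitive prm_xen_test-jeos1 ocf:heartbeat:Xen \
>>         params xmfile="..." \
>>         utilization utl_ram=2048 utl_cpu=20 \
>>         meta 1: resource-stickiness=0 \
>>         meta 2: rule 0: date spec hours=7-19 weekdays=1-5 resource-stickiness=1000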
>>
>> At the moment the related scores (crm_simulate -LUs) look like this
>> (filtered and re-ordered):
>>
>> Original: h16 capacity: utl_ram=231712 utl_cpu=440
>> Original: h18 capacity: utl_ram=231712 utl_cpu=440
>> Original: h19 capacity: utl_ram=231712 utl_cpu=440
>>
>> Remaining: h16 capacity: utl_ram=229664 utl_cpu=420
>> Remaining: h18 capacity: utl_ram=227616 utl_cpu=400
>> Remaining: h19 capacity: utl_ram=229664 utl_cpu=420
>>
>> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h16: 0
>> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h18: 1000
>> pcmk__native_allocate: prm_xen_test-jeos1 allocation score on h19: -INFINITY
>> native_assign_node: prm_xen_test-jeos1 utilization on h18: utl_ram=2048 utl_cpu=20
>>
>> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h16: 0
>> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h18: 1000
>> pcmk__native_allocate: prm_xen_test-jeos2 allocation score on h19: 0
>> native_assign_node: prm_xen_test-jeos2 utilization on h18: utl_ram=2048 utl_cpu=20
>>
>> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h16: 0
>> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h18: 0
>> pcmk__native_allocate: prm_xen_test-jeos3 allocation score on h19: 1000
>> native_assign_node: prm_xen_test-jeos3 utilization on h19: utl_ram=2048 utl_cpu=20
>>
>> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h16: 1000
>> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h18: 0
>> pcmk__native_allocate: prm_xen_test-jeos4 allocation score on h19: 0
>> native_assign_node: prm_xen_test-jeos4 utilization on h16: utl_ram=2048 utl_cpu=20
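>>
>> (For completeness: the capacities above come from utilization attributes on
>> the nodes and resources together with placement-strategy=balanced; in crm
>> shell terms roughly the following, shown for h16 only:)
>>
>> node h16 utilization utl_ram=231712 utl_cpu=440
>> property placement-strategy=balanced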
>>
>> Does that ring-shifting of resources look like a bug in pacemaker?
>>
>> Regards,
>> Ulrich
>
> From the above it's not apparent why fencing was needed.
Hi Ken!
Forget about the fencing; it was a "standard error": I added the required
resources in the natural order, but forgot to add the ordering constraints
that prevent the cluster from starting and stopping everything at the same
time ;-) Basically the missing dependency was that OCFS2 runs on top of an
MD-RAID, and stopping the RAID while OCFS2 is still mounted is not a good
idea...
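What was missing boils down to constraints like these (the resource names
here are just for illustration; mine are named differently):

        order ord_raid_before_ocfs2 Mandatory: cln_raid_md0 cln_fs_ocfs2
        colocation col_ocfs2_with_raid inf: cln_fs_ocfs2 cln_raid_md0

With symmetrical ordering (the default) that also stops the OCFS2 mount
before the RAID is stopped.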
>
> It makes sense that things would move once the time-based rule kicked
> in. Some event likely happened during the day that made the move
> preferable, which might be difficult to find in the logs. Taking a few
> random scheduler inputs from the day and simulating them with the time
> set to the evening might help narrow down when it happened. Without
> knowing what happened, it's hard to say if the move makes sense.
I agree, but still it looked very much like a "ring shift" that made me
wonder.
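To narrow it down I'll try replaying a few of the saved scheduler inputs with
the clock shifted into the evening, roughly like this (the file name and
timestamp below are just placeholders):

        crm_simulate -Ss -x /var/lib/pacemaker/pengine/pe-input-NNN.bz2 \
                --set-datetime "2021-01-14T20:05:00"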
For a while now the cluster has not behaved like that; the resources are
stable (one VM on h16, two VMs on h18, one VM on h19).
However, in the end the cluster will host some more VMs.
Regards,
Ulrich
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/