[ClusterLabs] Antw: [EXT] Re: Question about ping nodes

Tue Apr 20 02:29:11 EDT 2021

>>> Ken Gaillot <kgaillot at redhat.com> wrote on 19.04.2021 at 18:15 in
message <027898e95bf80adafd0ea32fb81fe9a3a3814f5a.camel at redhat.com>:
> On Mon, 2021-04-19 at 15:02 +0000, Walker, Chris wrote:
>> I see the same behavior as Andrei.  On a two-node test cluster
>> (symmetric-cluster=true, resource-stickiness=100) I configure a
>> simple Dummy resource and add the rule
>>  
>>       <rsc_location id="dummy_loc" rsc="dummy">
>>         <rule score="-10000" id="dummy_loc-rule-0">
>>           <expression attribute="mgmt" operation="lt" value="1"
>>                       type="number" id="dummy_loc-rule-0-expression"/>
>>         </rule>
>>       </rsc_location>
>>  
>> If I then change the value of ‘mgmt’ to zero for both nodes, the
>> resource stops.  We have code much like Andrei outlined to work
>> around this behavior.
> 
> Well, this is a new one to me. I've confirmed that this is the
> intentional behavior, and the documentation is an oversimplification.
> 
> The documentation is partly correct: a -INFINITY score means the node
> is *never* eligible to run the resource, and a finite negative score
> means the node *may* be eligible to run the resource under certain
> circumstances.
> 
> What it omits is what those circumstances are: when some other positive
> factor, for example another constraint or stickiness, outweighs the
> negative score.

Another question is why one would set a stickiness at all if the resource
should follow the ping value.
Another idea: why not write a rule that _locates_ the resource where the ping
value is high (instead of forcing the resource away from where the ping value
is low)?
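
A minimal sketch of such a positive rule, reusing the "dummy" resource from
Chris's example. It assumes the ping agent writes its result to a node
attribute named "pingd" (Chris's cluster uses "mgmt" instead), and all ids are
made up for illustration:

```xml
<!-- Sketch only: ids are hypothetical; "pingd" is assumed to be the
     node attribute maintained by the ping resource agent. -->
<rsc_location id="dummy_prefer_connected" rsc="dummy">
  <!-- score-attribute takes the attribute's value as the score, so the
       node with the higher ping value gets the higher preference. -->
  <rule id="dummy_prefer_connected-rule" score-attribute="pingd">
    <expression id="dummy_prefer_connected-expr"
                attribute="pingd" operation="defined"/>
  </rule>
</rsc_location>
```

As far as I understand it, a node without connectivity then simply scores 0
rather than a negative value, so nothing forces the resource to stop when both
nodes lose the ping node. Stickiness would have to be smaller than the
difference in ping scores for the resource to move at all.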

> 
> That does give an interesting configuration possibility: with a
> stickiness able to outweigh the location constraint, any node running a
> resource when connectivity is lost would be able to continue running
> the resource, but once the resource stopped, it couldn't be started
> again.
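
To illustrate the configuration described above, here is a sketch in which
stickiness (100, as in Chris's test cluster) is larger than the penalty from
the location rule. The score of -50 and all ids are assumptions chosen for the
example, not values from this thread:

```xml
<!-- Fragment for the rsc_defaults section: stickiness of 100. -->
<rsc_defaults>
  <meta_attributes id="rsc_defaults-options">
    <nvpair id="rsc_defaults-stickiness"
            name="resource-stickiness" value="100"/>
  </meta_attributes>
</rsc_defaults>

<!-- Fragment for the constraints section: a penalty of -50 (hypothetical)
     that stickiness can outweigh on the node currently running "dummy". -->
<rsc_location id="dummy_loc" rsc="dummy">
  <rule score="-50" id="dummy_loc-rule-0">
    <expression attribute="mgmt" operation="lt" value="1"
                type="number" id="dummy_loc-rule-0-expression"/>
  </rule>
</rsc_location>
```

On the node already running the resource the net score would be
-50 + 100 = 50, so the resource can stay; on a node where it is not running
there is no stickiness and the score stays at -50, so it cannot be started
there.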
> 
> Anyway, a workaround to achieve the desired effect would be to give
> positive location constraints for the resource on all nodes, equal in
> magnitude to the negative constraint score. Because the positive scores
> are all equal, they shouldn't affect which node is chosen, but any node
> that loses connectivity will have its score drop to 0 rather than
> negative, allowing it to run resources, while still preferring any node
> with connectivity.
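
A sketch of what this workaround might look like for Chris's example, where
the rule score is -10000. The node names "node1" and "node2" and the
constraint ids are placeholders, not taken from the thread:

```xml
<!-- Baseline preference, equal on every node and equal in magnitude to
     the negative rule score below. Node names are hypothetical. -->
<rsc_location id="dummy_base-node1" rsc="dummy" node="node1" score="10000"/>
<rsc_location id="dummy_base-node2" rsc="dummy" node="node2" score="10000"/>

<!-- Chris's original rule, unchanged: -10000 where "mgmt" is below 1. -->
<rsc_location id="dummy_loc" rsc="dummy">
  <rule score="-10000" id="dummy_loc-rule-0">
    <expression attribute="mgmt" operation="lt" value="1"
                type="number" id="dummy_loc-rule-0-expression"/>
  </rule>
</rsc_location>
```

A node with connectivity would then score 10000, and a node without it
10000 - 10000 = 0, which still allows the resource to run there if no better
node exists.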
> 
> I've filed a bug to update either the documentation or the behavior
> (I'm not sure which would be better at this point):
> 
>   https://bugs.clusterlabs.org/show_bug.cgi?id=5472 
> 
>> Thanks,
>> Chris
>>  
>> From: Users <users-bounces at clusterlabs.org>
>> Date: Monday, April 19, 2021 at 10:28 AM
>> To: Cluster Labs - All topics related to open-source clustering
>> welcomed <users at clusterlabs.org>
>> Subject: Re: [ClusterLabs] Question about ping nodes
>> 
>> On Sun, 2021-04-18 at 17:31 +0300, Andrei Borzenkov wrote:
>> > On 18.04.2021 08:41, Andrei Borzenkov wrote:
>> > > On 17.04.2021 22:41, Piotr Kandziora wrote:
>> > > > Hi,
>> > > > 
>> > > > Hope some guru will advise here ;)
>> > > > 
>> > > > I've got a two-node cluster with some resource placement
>> > > > dependent on ping node visibility (
>> > > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-moving_resources_due_to_connectivity_changes-haar
>> > > > ).
>> > > > 
>> > > > Is it possible to do nothing with these resources when neither
>> > > > node has access to the ping node?
>> > > > 
>> > > > Currently, when the ping node is unavailable (the ping node
>> > > > itself becomes unavailable), both nodes stop the resources.
>> > > > 
>> > > 
>> > > Just use any negative score higher than -INFINITY in the location
>> > > constraint.
>> > > 
>> > 
>> > No, it does not work. I was misled by the documentation (5.2.1
>> > location properties):
>> 
>> Actually your initial interpretation was correct. Using a
>> non-infinite negative score for the location constraint should work
>> as expected -- the resource can still run there if there's no better
>> place.
>> 
>> I'm not sure why you didn't see that in your test, maybe some other
>> factor prevented it from running?
>> 
>> BTW another solution to the initial problem would be to use multiple
>> IPs in the ping agent. It would be less likely for all of them to be
>> down at the same time without a network issue.
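
For reference, a sketch of a ping resource that checks several addresses. The
three addresses are documentation placeholders, the ids and the multiplier,
dampen and monitor values are only examples, and the parameter names
(host_list, multiplier, dampen) are those of ocf:pacemaker:ping as I know
them:

```xml
<!-- Sketch only: addresses, ids and timing values are placeholders. -->
<clone id="ping-clone">
  <primitive id="ping" class="ocf" provider="pacemaker" type="ping">
    <instance_attributes id="ping-params">
      <!-- Space-separated list of hosts to ping. -->
      <nvpair id="ping-host_list" name="host_list"
              value="192.0.2.1 192.0.2.2 192.0.2.3"/>
      <!-- Each reachable host adds this amount to the node attribute. -->
      <nvpair id="ping-multiplier" name="multiplier" value="1000"/>
      <!-- Wait before publishing a change, to ride out short glitches. -->
      <nvpair id="ping-dampen" name="dampen" value="5s"/>
    </instance_attributes>
    <operations>
      <op id="ping-monitor-10s" name="monitor" interval="10s" timeout="60s"/>
    </operations>
  </primitive>
</clone>
```

With three hosts the attribute only drops to 0 when all of them are
unreachable from a node.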
>> 
>> > 
>> > score:
>> > 
>> > Negative values indicate the resource(s) should avoid this node (a
>> > value of -INFINITY changes "should" to "must").
>> > 
>> > I interpreted "should" as "pacemaker will normally avoid this node
>> > but may still choose it if nothing better is possible". That is not
>> > what happens. Apparently a negative score completely prevents
>> > assigning the resource to this node, and "should" here probably
>> > means "it is still possible that the final score may become
>> > positive".
>> > 
>> > As it is not possible to refer to attributes of multiple nodes in a
>> > rule, you would need something that combines the current pingd
>> > status of the individual nodes and makes it available. The logical
>> > place is the ocf:pacemaker:ping resource agent itself.
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>> 
>> ClusterLabs home: https://www.clusterlabs.org/ 
> -- 
> Ken Gaillot <kgaillot at redhat.com>