[ClusterLabs] Trying to understand dampening (ping)

Andrei Borzenkov arvidjaar at gmail.com
Sat Oct 16 03:04:10 EDT 2021


On 14.10.2021 23:51, martin doc wrote:
> 
> 
> ________________________________
> From: Andrei Borzenkov <arvidjaar at gmail.com>,  Friday, 15 October 2021 4:59 AM
> ...
>> Dampening defines delay before attributes are committed to CIB.
>> Private attributes are never ever written into CIB, so dampening
>> makes no sense here. Private attributes are managed by attrd
>> itself and you see the latest value.
> 
>> If you change transient attribute (without -p option) value you
>> will see different values reported by
> 
>> attrd_updater -n my_ping -Q
> 
>> and
> 
>> cibadmin -Q -A "//nvpair[@name='my_ping']"
> 
>> until dampening timeout expires.
> 
>> This applies even to deleting attribute.
> 
> Ok, now I understand what the dampen function does.
> 
> If I understand this correctly then this probably makes every documented example of using ocf:pacemaker:ping with a colocation statement wrong because the only way to see the effect of dampen is to use a rule that references the value of pingd directly. That or the script for ping has a major flaw with respect to dampen.
> 
> That is when I do this:
> 
> pcs resource create myPing ocf:pacemaker:ping host_list=192.168.1.1 failure_score=1
> pcs resource create database ocf:heartbeat:pgsql
> pcs group add pgrp myPing database
> 
> PCS will move everything to a new node if there is even 1 ping failure because monitor in ping doesn't look at the dampened value, only the value of the immediate returned value.
> 

failure_score is the number of hosts that must answer ping during a
single monitor invocation. If you have a single host, the only
meaningful value is 1.

If you want to smooth out a single ping failure, use the "attempts"
parameter. It defaults to 3, which means every monitor operation does 3
pings and fails only if all of them fail. So it already does what you
want without any special configuration.
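
As a rough model of that behaviour (a sketch, not the agent's actual
code): probe results are passed in as arguments so it runs without a
network, and host_reachable is a hypothetical name.

```shell
#!/bin/sh
# Model of "attempts": a host counts as reachable if any one probe
# within the allowed number of attempts gets a reply.
# Arguments: <attempts> <result>...  where 0 = reply, 1 = no reply.
# The result list is a stand-in for real pings (assumption).
host_reachable() {
    attempts=$1; shift
    n=0
    for result in "$@"; do
        # Stop once the allowed number of attempts is used up.
        [ "$n" -ge "$attempts" ] && break
        n=$((n + 1))
        if [ "$result" -eq 0 ]; then
            return 0    # one reply is enough
        fi
    done
    return 1            # every attempt failed
}

host_reachable 3 1 1 0 && echo "reachable"
host_reachable 3 1 1 1 || echo "unreachable"
```

On a real cluster the equivalent would be setting attempts=3 (or more)
on the ping resource, e.g. adding it to the pcs resource create line
quoted above.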

> The same is true with colocation statements - if a constraint is made with a ping resource without using a rule that references pingd then  the dampen behaviour is ignored completely.
> 

You completely misunderstand what dampen is used for. It is used to wait
for multiple nodes to record the results of their monitor actions so
that when the policy engine is invoked it (hopefully) has the final
picture. It has nothing to do with individual ping results on any single
node.

> Is the ping'er missing something that does this:
> 
> score=`cibadmin -Q -A "//nvpair[@name='ping']" | sed -e 's/.*value="\([^"]*\)".*/\1/'`
> 

The only effect it will have will be using results of previous monitor
invocation instead of current one.

You cannot use dampening to smooth out ping results. You will still
have only one final value recorded, so in the sequence success, success,
failure it will be failure.
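
A minimal sketch of why, assuming one host and a multiplier of 1 so
that success is recorded as 1 and failure as 0. committed_value is a
hypothetical name; this models only the "last value wins" effect, not
how attrd is implemented.

```shell
#!/bin/sh
# Model of dampening: updates that arrive before the dampen timeout
# expires replace each other; only the last one is committed.
committed_value() {
    pending=""
    for value in "$@"; do
        pending=$value      # newer update replaces the pending one
    done
    printf '%s\n' "$pending"
}

# success, success, failure within one dampen window
committed_value 1 1 0    # prints 0
```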

To do anything more sophisticated you need to actually record every
individual ping result. This is far more involved and I still do not
see a real use case.

