[Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

Tue Sep 4 18:04:09 EDT 2012

On Tue, Aug 28, 2012 at 3:01 AM, Andrew Martin <amartin at xes-inc.com> wrote:
> Jake,
>
>
> Attached is the log from the same period for node2. If I am reading this correctly, it looks like there was a 7 second difference between when node1 set its score to 1000 and when node2 set its score to 1000?

Assuming the clocks are in sync on both nodes, yes.
This is somewhat expected since your monitor interval is 10s: the monitors on the two nodes are not synchronized, so they can run up to 10s apart.

This is why we recommend dampen = 2 * the monitor interval.
From the next log, it looks like you're using 5s (the -d option) instead of 20s.

> Aug 22 10:40:38 node1 attrd_updater: [1860]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s
> Aug 22 10:40:43 node1 attrd: [4402]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
> Aug 22 10:40:44 node1 attrd: [4402]: notice: attrd_perform_update: Sent update 265: p_ping=1000
>
> Aug 22 10:40:45 node2 attrd_updater: [27245]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s
> Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
> Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_perform_update: Sent update 122: p_ping=1000
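
To make the timing concrete, here is a rough sketch in Python (not part of Pacemaker; the function name peer_caught_up is made up for illustration). It takes the two "Invoked" timestamps from the log above and checks whether node2 had already seen the ping loss when node1's dampen timer ran out. It assumes, as the "Sending flush op to all hosts" lines suggest, that the first flush makes every node write whatever value it currently holds:

```python
from datetime import datetime, timedelta

# Timestamps copied from the attrd_updater "Invoked" lines quoted above.
NODE1_SAW_LOSS = datetime(2012, 8, 22, 10, 40, 38)
NODE2_SAW_LOSS = datetime(2012, 8, 22, 10, 40, 45)
MONITOR_INTERVAL = timedelta(seconds=10)


def peer_caught_up(first_seen, peer_seen, dampen):
    """Return True if the peer had already recorded the loss when the
    first node's dampen timer expired (first_seen + dampen).

    Hypothetical helper for illustration only; it models the timing,
    not attrd itself.
    """
    return peer_seen <= first_seen + dampen


skew = NODE2_SAW_LOSS - NODE1_SAW_LOSS
for dampen in (timedelta(seconds=5), 2 * MONITOR_INTERVAL):
    ok = peer_caught_up(NODE1_SAW_LOSS, NODE2_SAW_LOSS, dampen)
    print(f"dampen={dampen.total_seconds():.0f}s "
          f"skew={skew.total_seconds():.0f}s "
          f"node2 caught up before flush: {ok}")
```

With the 5s from the log, node1's timer expires at 10:40:43, two seconds before node2 has noticed anything, so the two scores differ. With 20s, node2 has already recorded 1000 by the time the flush happens.
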
>
> I had changed the attempts value to 8 (from the default 2) to address this same issue, that is, to avoid resource migration based on brief connectivity problems with these IPs. However, if we can get dampen configured correctly, I'll set it back to the default.
>
>
> Thanks,
>
>
> Andrew
>
> ----- Original Message -----
>
> From: "Jake Smith" <jsmith at argotec.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Monday, August 27, 2012 9:39:30 AM
> Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
>
>
> ----- Original Message -----
>> From: "Andrew Martin" <amartin at xes-inc.com>
>> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
>> Sent: Thursday, August 23, 2012 7:36:26 PM
>> Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
>>
>> Hi Florian,
>>
>>
>> Thanks for the suggestion. I gave it a try, but even with a dampen
>> value greater than 2* the monitoring interval the same behavior
>> occurred (pacemaker restarted the resources on the same node). Here
>> are my current ocf:pacemaker:ping settings:
>>
>> primitive p_ping ocf:pacemaker:ping \
>>     params name="p_ping" host_list="192.168.0.128 192.168.0.129" \
>>         dampen="25s" multiplier="1000" attempts="8" debug="true" \
>>     op start interval="0" timeout="60" \
>>     op monitor interval="10s" timeout="60"
>>
>>
>> Any other ideas on what is causing this behavior? My understanding is
>> the above config tells the cluster to attempt 8 pings to each of the
>> IPs, and will assume that an IP is down if none of the 8 come back.
>> Thus, an IP would have to be down for more than 8 seconds to be
>> considered down. The dampen parameter tells the cluster to wait
>> before making any decision, so that if the IP comes back online
>> within the dampen period then no action is taken. Is this correct?
>>
>>
>
> I'm no expert on this either, but I believe the dampen value isn't long enough. I think what you say above is correct, but it is not enough for the IP to come back online: the cluster must also ping it successfully before the dampen period ends. I would suggest trying a dampen value greater than 3 * the monitor interval.
>
> I don't think it's a problem but why change the attempts from the default 2 to 8?
>
>> Thanks,
>>
>>
>> Andrew
>>
>>
>> ----- Original Message -----
>>
>> From: "Florian Crouzat" <gentoo at floriancrouzat.net>
>> To: pacemaker at oss.clusterlabs.org
>> Sent: Thursday, August 23, 2012 3:57:02 AM
>> Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces
>> resources to restart?
>>
>> On 22/08/2012 18:23, Andrew Martin wrote:
>> > Hello,
>> >
>> >
>> > I have a 3 node Pacemaker + Heartbeat cluster (two real nodes and 1
>> > quorum node that cannot run resources) running on Ubuntu 12.04
>> > Server amd64. This cluster has a DRBD resource that it mounts and
>> > then runs a KVM virtual machine from. I have configured the
>> > cluster to use ocf:pacemaker:ping with two other devices on the
>> > network (192.168.0.128, 192.168.0.129), and set constraints to
>> > move the resources to the most well-connected node (whichever node
>> > can see more of these two devices):
>> >
>> > primitive p_ping ocf:pacemaker:ping \
>> >     params name="p_ping" host_list="192.168.0.128 192.168.0.129" \
>> >         multiplier="1000" attempts="8" debug="true" \
>> >     op start interval="0" timeout="60" \
>> >     op monitor interval="10s" timeout="60"
>> > ...
>> >
>> > clone cl_ping p_ping \
>> > meta interleave="true"
>> >
>> > ...
>> > location loc_run_on_most_connected g_vm \
>> > rule $id="loc_run_on_most_connected-rule" p_ping: defined p_ping
>> >
>> >
>> > Today, 192.168.0.128's network cable was unplugged for a few
>> > seconds and then plugged back in. During this time, pacemaker
>> > recognized that it could not ping 192.168.0.128 and restarted all
>> > of the resources, but left them on the same node. My understanding
>> > was that since neither node could ping 192.168.0.128 during this
>> > period, pacemaker would do nothing with the resources (leave them
>> > running). It would only migrate or restart the resources if for
>> > example node2 could ping 192.168.0.128 but node1 could not (move
>> > the resources to where things are better-connected). Is this
>> > understanding incorrect? If so, is there a way I can change my
>> > configuration so that it will only restart/migrate resources if
>> > one node is found to be better connected?
>> >
>> > Can you tell me why these resources were restarted? I have attached
>> > the syslog as well as my full CIB configuration.
>> >
>
> As was said already, the log shows node1 changed its value for pingd to 1000, waited the 5 seconds of dampening, and then started actions to move the resources. In the midst of stopping everything, ping ran again successfully and the value increased back to 2000. This caused the policy engine to recalculate scores for all resources (before they had the chance to start on node2). I'm no scoring expert, but I know additional score is given for keeping colocated resources together with partners that are already running, and resource stickiness adds score for not moving. So in this situation, once pingd was back at 2000, the score to stay on node1 was greater than the score to move, so the resources that were stopped or stopping were restarted on node1. Increasing the dampen value should therefore help or fix this.
>
> Unfortunately you didn't include the log from node2, so we can't correlate node2's pingd values with node1's at the same times. I believe that if you compare the pingd values with the times at which movement between the nodes started, you will be able to make a better guess at how high a dampen value is needed to make sure the nodes have the same pingd value *before* the dampen time runs out, which should prevent movement.
>
> HTH
>
> Jake
>
>> > Thanks,
>> >
>> > Andrew Martin
>> >
>>
>> This is an interesting question and I'm also interested in answers.
>>
>> I made the same observations. There is also the case where the
>> monitor() operations aren't synced across all nodes, so: "node1 issues
>> a monitor() on the ping resource and finds the ping node dead; node2
>> hasn't pinged yet, so node1 moves things to node2; but node2 now issues
>> a monitor() and also finds the ping node dead."
>>
>> The only solution I found was to set the dampen parameter to at least
>> 2*monitor().interval, so that I can be *sure* that all nodes have issued
>> a monitor() and have all decreased their scores, so that when a decision
>> occurs, nothing moves.
>>
>> It's been a long time since I tested this; my cluster is very, very
>> stable. I guess I should retry to validate that it's still a working trick.
>>
>> ====
>>
>> dampen (integer, [5s]): Dampening interval
>> The time to wait (dampening) further changes occur
>>
>> Eg:
>>
>> primitive ping-nq-sw-swsec ocf:pacemaker:ping \
>>     params host_list="192.168.10.1 192.168.2.11 192.168.2.12" \
>>         dampen="35s" attempts="2" timeout="2" multiplier="100" \
>>     op monitor interval="15s"
>>
>>
>>
>>
>> --
>> Cheers,
>> Florian Crouzat
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started:
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org