[Pacemaker] cluster-delay property

Thu Oct 24 10:06:20 EDT 2013

On 24/10/13 09:01, Michael Schwartzkopff wrote:
> On Thursday, 24 October 2013, 14:39:39, Karl Rößmann wrote:
>> Sorry, let me try to explain.
>> 
>> Hi
>> 
>> In your book you describe a parameter 'deadtime' which defines
>> the timeout to declare a node as dead. I want to extend this
>> value to 120s to avoid such a scenario.
>> 
>> But in the SuSE documentation I cannot find 'deadtime'; instead
>> I see a value 'cluster-delay'. My question is: are these two
>> parameters equivalent?
>> 
>> More details about the scenario: I created the I/O load myself
>> by copying a large Xen image to a logical volume of the
>> cLVM (using 'dd'). I had done this several times before without
>> problems. Maybe something changed after upgrading to SLES SP3.
>> 
>> One node (it was the DC) died, and the Xen resources moved to the
>> surviving node. Fine.
>> 
>> No information in the log file.
>> 
>> On the surviving node I see:
>> Oct 23 09:30:41 ha2infra corosync[9085]:  [TOTEM ] A processor failed, forming new configuration.
> (...)
> 
> The log says that corosync did not see the node. This is not a
> pacemaker problem.
> 
> I speculate that this happened because one node was heavily
> overloaded by the dd and did not find time to process the corosync
> tokens. Or perhaps the load on the network was so high that
> corosync packets were dropped.
> 
> Anyway: This is not a pacemaker problem, it is a corosync problem.
> 
> If you want to make corosync behave a little more relaxed,
> please see "man corosync.conf" for the options. Look for the
> 'token' option and the options that follow it. I don't know which
> options are available in SLES11 HAE3. corosync is under heavy
> development ;-)
> 
> If you have a question about a specific option, please ask here on
> the list.

I agree with Michael that this is a corosync problem. I also agree
that this is a congestion problem. The variable you are looking for is
token_retransmit, if I am correct.

I would argue that the better solution is not to adjust this value,
but to fix your architecture to separate corosync/pacemaker traffic
from the disk/dd traffic. If you increase token_retransmit, you will
delay how long real failures take to be detected, thus slowing down
recovery.
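
If you do decide to relax the timeout anyway, here is a minimal sketch
of where the token settings sit in corosync.conf. The values are
placeholders chosen for illustration, not recommendations, and I have
not checked which options SLES 11 HAE SP3 ships, so verify each name
against "man corosync.conf" on your nodes. As far as I know, the man
page says token_retransmit is calculated automatically when token is
changed, which is why the sketch sets token rather than
token_retransmit.

```
# Fragment of /etc/corosync/corosync.conf (totem section only).
totem {
    version: 2

    # Milliseconds without receiving a token before token loss is declared.
    # Placeholder value; a larger value also delays detection of real failures.
    token: 5000

    # How many token retransmits are attempted before a new
    # configuration is formed. Placeholder value.
    token_retransmits_before_loss_const: 10
}
```

Whatever you set here applies equally to genuine node failures, so
recovery slows down by the same amount.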

My preferred network configuration is to have three networks: one
dedicated to corosync (and other cluster traffic), a second for
storage (DRBD/iSCSI, dd, etc.) and a third that is used by the
applications that you've made HA, like the VMs. This setup has worked
for me in production for many years and has never broken under high
load.
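
As a rough sketch of the first of those networks, assuming a
multicast setup and a made-up subnet 10.20.0.0/24 reserved for cluster
traffic, the totem interface can be bound to that network in
corosync.conf. The addresses are hypothetical examples; substitute
your own and check the option names against "man corosync.conf" for
your version.

```
# Fragment of /etc/corosync/corosync.conf (totem section only).
totem {
    version: 2

    interface {
        ringnumber: 0
        # Network address (not a host address) of the dedicated
        # cluster network. Hypothetical example subnet.
        bindnetaddr: 10.20.0.0
        # Example multicast address and port.
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
}
```

With corosync bound to its own network, a heavy dd on the storage
network no longer competes with the token traffic.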

digimer

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?