[ClusterLabs] Cluster node loss detection.

Fri Oct 16 18:34:55 UTC 2015

I retract the statement "Haven't seen split-brain yet".
We've seen it, but our application-specific answer has saved us so far.
I guess you could call it DSYITF.  Don't Shoot Yourself In The Foot.  When we have split-brain and a resource is started on both nodes, one of them usually says DSYITF and fails.

Regards.
Mark K Vallevand   Mark.Vallevand at Unisys.com <mailto:Mark.Vallevand at Unisys.com> 
Never try and teach a pig to sing: it's a waste of time, and it annoys the pig.

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.

-----Original Message-----
From: Vallevand, Mark K [mailto:Mark.Vallevand at UNISYS.com] 
Sent: Friday, October 16, 2015 01:09 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] Cluster node loss detection.

We know.  We've worked out our application-specific answer to split brain.  But, proper fencing is on our to-do list.
Currently we only deploy 2-node systems.  There is one application and its agent.  One resource is configured.  
We have this in cluster.conf
  <cman transport="udpu" two_node="1" expected_votes="1"> 
  </cman>
So, we don’t get quorum issues.
We are also experimenting with a second, redundant network for clustering use.  It works, but we aren't deploying yet.
Haven't seen split-brain yet, except in early, fumble-fingered experiments.  
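
For illustration only, a minimal Python sketch of a sanity check for the two-node setting shown above. The enclosing <cluster> element, its attributes, and the function name are assumptions made up for the example; only the <cman> element comes from the message.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal cluster.conf. Only the <cman> element is taken from
# the message; the <cluster> wrapper and its attributes are placeholders.
CLUSTER_CONF = """\
<cluster name="example" config_version="1">
  <cman transport="udpu" two_node="1" expected_votes="1">
  </cman>
</cluster>
"""


def two_node_mode(conf_xml):
    """Return True if <cman> sets two_node="1" and expected_votes="1"."""
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    if cman is None:
        return False
    return cman.get("two_node") == "1" and cman.get("expected_votes") == "1"


print(two_node_mode(CLUSTER_CONF))
```

A check like this only confirms that the quorum override is present. It says nothing about fencing, which is a separate question.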

Reading the tutorial.  Always interested in understanding more.  Thanks.

Regards.
Mark K Vallevand   Mark.Vallevand at Unisys.com <mailto:Mark.Vallevand at Unisys.com> 

-----Original Message-----
From: Digimer [mailto:lists at alteeve.ca] 
Sent: Friday, October 16, 2015 12:35 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] Cluster node loss detection.

On 16/10/15 01:14 PM, Vallevand, Mark K wrote:
> No stonith configured.  Not explicitly anyway.
> Does that factor into this somehow?

Yes, you will eventually have a split-brain.

All fencing in cman does with 'fence_pcmk' is say "hey, if you need to
fence, ask pacemaker to do it". That's useless if pacemaker can't fence.
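
To make the redirect concrete, here is a rough Python sketch that checks whether every node in a cluster.conf hands fencing off to fence_pcmk. The node names, the device name "pcmk", and the helper function are assumptions for the example; the layout follows the usual pcmk-redirect pattern and is not taken from the poster's actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-node cluster.conf using the common pcmk-redirect layout.
CLUSTER_CONF = """\
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
</cluster>
"""


def nodes_without_pcmk_redirect(conf_xml):
    """List node names whose fence methods never reference a fence_pcmk device."""
    root = ET.fromstring(conf_xml)
    # Names of all fence devices backed by the fence_pcmk agent.
    pcmk_devices = {
        dev.get("name")
        for dev in root.iter("fencedevice")
        if dev.get("agent") == "fence_pcmk"
    }
    missing = []
    for node in root.iter("clusternode"):
        used = {dev.get("name") for dev in node.iter("device")}
        if not (used & pcmk_devices):
            missing.append(node.get("name"))
    return missing


print(nodes_without_pcmk_redirect(CLUSTER_CONF))  # prints []
```

An empty list only means cman will pass fence requests to pacemaker. Whether pacemaker can actually fence depends on a stonith resource being configured and tested there, which this sketch does not look at.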

> I've tested stonith, but we aren't doing it for customers.  Maybe in the future if someone cries or pays us money.
> Our solution is deployed onto too many different machines.  A couple of bare metal.  A couple of VMs.  We don't want customers to need to figure out stonith and we can't test all possible configurations and write instructions.  So, they get one-size-fits-all.

https://alteeve.ca/w/AN!Cluster_Tutorial_2#Concept.3B_Fencing

You are doing a disservice to your customers. Without fencing, you
*will* have a bad day, it's just a question of when. I can't tell you
how many times I've heard "but it worked fine for over a year!".

Stonith is worth the hassle.

> Regards.
> Mark K Vallevand   Mark.Vallevand at Unisys.com <mailto:Mark.Vallevand at Unisys.com> 
> 
> 
> -----Original Message-----
> From: Digimer [mailto:lists at alteeve.ca] 
> Sent: Friday, October 16, 2015 11:51 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] Cluster node loss detection.
> 
> On 16/10/15 12:37 PM, Vallevand, Mark K wrote:
>> Fencing, yes.  I have pcmk-redirect for each node in cluster.conf.
> 
> Do you have stonith configured (and tested!) in Pacemaker as well?
> 
>> I run with default cman settings for corosync.  No totem clause.  That gives the 20s detection.  Not sure what the defaults really are.
>> I added <totem token="1000" token_retransmits_before_loss_const="5" /> to cluster.conf and get about a 5s detection.
>>
>> The corosync man page says:
>>        token  This timeout specifies in milliseconds until a token loss is declared after not receiving a token.  This is the time spent detecting a
>>               failure of a processor in the current configuration.  Reforming a new configuration takes about 50 milliseconds in  addition  to  this
>>               timeout.
>>
>>               The default is 1000 milliseconds.
>>
>>        token_retransmit
>>               This timeout specifies in milliseconds after how long before receiving a token the token is retransmitted.  This will be automatically
>>               calculated if token is modified.  It is not recommended to alter this value without guidance from the corosync community.
>>
>>               The default is 238 milliseconds.
>>
>>        hold   This timeout specifies in milliseconds how long the token should be held by the representative when the protocol is under low
>>               utilization.  It is not recommended to alter this value without guidance from the corosync community.
>>
>>               The default is 180 milliseconds.
>>
>>        token_retransmits_before_loss_const
>>               This  value  identifies  how  many  token  retransmits  should be attempted before forming a new configuration.  If this value is set,
>>               retransmit and hold will be automatically calculated from retransmits_before_loss and token.
>>
>>               The default is 4 retransmissions.
>>
>> But, I don't know what cman sets these to.  But, they aren't these values.  And, they aren't the values in the cman man page, which says this:
> 
> Maybe it's changed by the ubuntu packagers? I don't know, I don't use
> debian or ubuntu.
> 
>>               Cman uses different defaults for some of the corosync parameters listed in corosync.conf(5).  If you wish to use a non-default
>>               setting, they can be configured in cluster.conf as shown above.  Cman uses the following default values:
>>
>>                 <totem
>>                   vsftype="none"
>>                   token="10000"
>>                   token_retransmits_before_loss_const="20"
>>                   join="60"
>>                   consensus="4800"
>>                   rrp_mode="none"
>>                   <!-- or rrp_mode="active" if altnames are present >
>>                 />
>>                
>> So, it looks like setting the corosync parameters in cluster.conf has some effect.  Cman seems to pass them to corosync.
> 
> Yes, never configure corosync directly when using cman, only use
> cluster.conf, as you did.
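
As a rough sketch of the override behaviour discussed here, the Python below merges a <totem> element from cluster.conf over the cman defaults quoted from the man page. The <cluster> wrappers and function names are made up for the example. The detection estimate is an assumption: it treats the window as roughly token plus consensus, and it assumes consensus stays at the quoted default when only token is changed, which the thread does not confirm. Note that it does not multiply token by the retransmit count, since the quoted corosync text describes token as the loss timeout and says the retransmit interval is derived from it.

```python
import xml.etree.ElementTree as ET

# Defaults as quoted from the cman man page in this thread
# (milliseconds, except the retransmit count).
CMAN_TOTEM_DEFAULTS = {
    "token": 10000,
    "token_retransmits_before_loss_const": 20,
    "join": 60,
    "consensus": 4800,
}


def effective_totem(conf_xml):
    """Merge <totem> attributes from cluster.conf over the quoted cman defaults."""
    root = ET.fromstring(conf_xml)
    totem = root.find("totem")
    values = dict(CMAN_TOTEM_DEFAULTS)
    if totem is not None:
        for key in values:
            if totem.get(key) is not None:
                values[key] = int(totem.get(key))
    return values


def rough_detection_ms(values):
    """Assumed estimate: token timeout plus consensus timeout, in milliseconds."""
    return values["token"] + values["consensus"]


# Hypothetical configs: one with no <totem> element, one with the tuned
# values mentioned in the thread.
DEFAULT_CONF = '<cluster name="example" config_version="1"/>'
TUNED_CONF = """\
<cluster name="example" config_version="2">
  <totem token="1000" token_retransmits_before_loss_const="5" />
</cluster>
"""

print(rough_detection_ms(effective_totem(DEFAULT_CONF)))  # prints 14800
print(rough_detection_ms(effective_totem(TUNED_CONF)))    # prints 5800
```

Under these assumptions the tuned figure lands near the roughly 5s observed in the crude tests, but real detection time also depends on the network and on how the measurement is taken.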
> 
>> Onward!
>>
>>
>> Regards.
>> Mark K Vallevand   Mark.Vallevand at Unisys.com <mailto:Mark.Vallevand at Unisys.com> 
>>
>>
>> -----Original Message-----
>> From: Digimer [mailto:lists at alteeve.ca] 
>> Sent: Friday, October 16, 2015 11:18 AM
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> Subject: Re: [ClusterLabs] Cluster node loss detection.
>>
>> On 16/10/15 11:40 AM, Vallevand, Mark K wrote:
>>> Thanks.  I wasn't completely aware of corosync's role in this.  I see new things in the docs every time I read them.
>>>
>>> I looked up the corosync settings at one time and did it again:
>>> 	token loss 3000ms
>>> 	retransmits 10
>>> So 30s.  Redid my simple testing and got detection times of 22s, 26s, and 25s using very crude methods.
>>> Any warnings about setting these values to something else?
>>> We require our customers to use an isolated, private network for cluster communications.  All taken care of in our instructions and cluster configuration scripts.  Network traffic will not be a factor.  So, I'm thinking 1000ms and 5 retransmits as an experiment.
>>
>> That is very high. I think the default is something like 236ms x 4 losses.
>>
>> You do have fencing, right?
>>
>>> I was pretty sure that DLM was just being informed by clustering, but I needed to ask.
>>>
>>> Again, thanks.
>>> 	
>>>
>>> Regards.
>>> Mark K Vallevand   Mark.Vallevand at Unisys.com <mailto:Mark.Vallevand at Unisys.com> 
>>
>>
> 
> 

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

_______________________________________________
Users mailing list: Users at clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
