[ClusterLabs] Two-Node OCFS2 cluster keep rebooting each other

Digimer lists at alteeve.ca
Wed Jun 10 06:27:25 UTC 2015


On 10/06/15 01:50 AM, Jonathan Vargas wrote:
> 
> 2015-06-09 23:26 GMT-06:00 Digimer <lists at alteeve.ca
> <mailto:lists at alteeve.ca>>:
> 
>     On 10/06/15 01:19 AM, Jonathan Vargas wrote:
>     > Thanks Andrei, Digimer.
>     >
>     > I see. Since I need to steer this discussion toward a definitive solution,
>     > I am sharing a diagram with you of how we are designing this HA
>     > architecture, to clarify the problem we are trying to solve:
>     >
>     > http://i.imgur.com/BFPcZSx.png
> 
>     Last block is DRBD. If DRBD will be managed by the cluster, it must have
>     fencing.
> 
>     This is your definitive answer.
> 
>     Without it, you *will* get a split-brain. That leads to, at best, data
>     divergence or data loss.
> 
>     > The first layer, Load Balancer, and the third layer, Database, are both
>     > already set up. The Load Balancer cluster uses only a VIP resource,
>     > while the Database cluster uses DRBD+VIP resources. They are in production
>     > and work fine, tests passed :-)
>     >
>     > Now we are handling the Web Server layer, which I am discussing with
>     > experts like you. These servers all need to be active and see the
>     > same data for read & write, as quickly as possible, mainly reads.
>     >
>     > *So, if we stay with OCFS2:* Since we need to protect service
>     > availability and keep most of the nodes up, what choices do I have to
>     > avoid reboots of both Web nodes caused by a split-brain situation when
>     > one of them is disconnected from the network?
> 
>     None of this matters relative to the importance of working, tested
>     fencing for replicated storage.
> 
>     In any HA setup, the reboot of a node should not matter. If you are
>     afraid of rebooting a node, you need to reconsider your design.
> 
> 
> 
> Well, the problem is caused by a pretty common scenario: a simple
> network disconnection on node 1 causes both nodes to reboot, and even
> while node 1 is still offline, it keeps rebooting the active node 2.
> There were no disk issues, but service availability was lost.
> *That's the main complaint now :-/*

This is a symptom of a configuration issue. It is a separate topic from
whether or not to use fencing.

First, don't start the cluster when the node boots.

A node will boot for one of two reasons only;

1. Node was fenced; You don't want it back into the cluster until you
know it is safe to do so.

2. Scheduled maintenance; A human is there, so rejoining it after the
maintenance is over is a non-issue.

This solves the fence-on-boot issue. Also, corosync's wait_for_all
should be used to further protect against this.
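
Roughly, and assuming a systemd-based distro and corosync 2.x with
votequorum (service names and options may differ on your setup), that
means something like:

  # keep the cluster stack from starting automatically at boot
  systemctl disable corosync
  systemctl disable pacemaker

and in /etc/corosync/corosync.conf:

  quorum {
          provider: corosync_votequorum
          # two_node relaxes quorum for a 2-node cluster and implies
          # wait_for_all, so a freshly booted node waits until it has
          # seen its peer before doing anything (including fencing)
          two_node: 1
          wait_for_all: 1
  }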

If the problem is that both nodes fence each other before either dies,
then set a delay against one node to give it a head-start in fencing its
peer. I find delay="15" to be a good value.
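
As a sketch only (the agent, node names, addresses and credentials below
are placeholders, and parameter names vary by fence agent), with crmsh
that could look like:

  # web1/web2 and the IPMI details are placeholders for your own
  # the device that fences web1 carries the delay, so web1 wins the race
  primitive fence_web1 stonith:fence_ipmilan \
          params pcmk_host_list="web1" ipaddr="10.0.0.11" \
                 login="admin" passwd="secret" lanplus="1" delay="15"
  primitive fence_web2 stonith:fence_ipmilan \
          params pcmk_host_list="web2" ipaddr="10.0.0.12" \
                 login="admin" passwd="secret" lanplus="1"
  # never run a node's fence device on the node it is meant to kill
  location l-fence_web1 fence_web1 -inf: web1
  location l-fence_web2 fence_web2 -inf: web2

With that, if the link between the two drops and both call for fencing,
the action against web1 is held for 15 seconds, web1 shoots web2 first,
and only one node reboots.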

>     > Correct me if I'm wrong:
>     >
>     > *1. Redundant Channel:* This is pretty difficult, since we would have to
>     > add two new physical network cards to the virtual machine hosts, and
>     > that changes the network configuration a lot in the virtualization
>     > platform.
> 
>     High Availability must put concerns like hassle and cost second to
>     what makes the system more resilient. If you choose not to spend the extra
>     money or time, then you must accept the risks.
> 
> 
>     > *2. Three Node Cluster:* This is possible, but it will consume more
>     > resources. We can have it only for cluster communication though, not for
>     > web processing, which would reduce its load.
> 
>     Quorum is NOT a substitution for fencing. They solve different problems.
> 
>     Quorum is a tool for when all nodes are behaving properly. Fencing is a
>     tool for when a node is not behaving properly.
> 
> 
> 
> Yes, but adding a 3rd node will help determine which node is failing
> and which are not, so that the proper one is fenced. Right?

If you have a 3rd node and you fail the network on one, then in theory,
yes it will help. In practice, if you down the network on one node, it
won't be able to fence the other node anyway and will be the fence victim.

>     > *3. Disable Fencing:* You said this should not happen at all if we use a
>     > shared disk like OCFS. So I am discarding it.
> 
>     Correct.
> 
>     > *4. Use NFS:* Yes, this will cause a SPoF, and to solve it we would have
>     > to set up another cluster with DRBD as described here
>     > <https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha_techguides/book_sleha_techguides.html>,
>     > and add more infrastructure resources. Or could we set up NFS over OCFS2?
> 
>     ... Which would require fencing anyway, so you gain nothing but another
>     layer of things to break. First rule of HA; Keep it simple.
> 
>     Complexity is the enemy of availability.
> 
> 
> 
> Sure, fencing must be added if that were the case.

Fencing is always needed in HA clusters, full stop.
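
For the DRBD layer you already have, that means wiring DRBD's fencing
into Pacemaker as well. A sketch for DRBD 8.4, along the lines of the
DRBD user's guide (the resource name "r0" is a placeholder):

  resource r0 {
          disk {
                  # suspend I/O and call the fence-peer handler when the
                  # peer is lost, until fencing has completed
                  fencing resource-and-stonith;
          }
          handlers {
                  # these helpers ship with the DRBD packages
                  fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                  after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
          }
          # (volumes, on <host> sections, etc. omitted)
  }

The handler places a constraint so the stale node cannot promote until it
has resynced, which is what keeps a split-brain from turning into data
divergence.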


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



