[ClusterLabs] 2-Node Cluster Pointless?
Digimer
lists at alteeve.ca
Sun Apr 16 20:17:11 CEST 2017
On 16/04/17 01:53 PM, Eric Robinson wrote:
> I was reading in "Clusters from Scratch" where Beekhof states, "Some would argue that two-node clusters are always pointless, but that is an argument for another time." Is there a page or thread where this argument has been fleshed out? Most of my dozen clusters are 2 nodes. I hate to think they're pointless.
>
> --
> Eric Robinson
There is a belief that you can't build a reliable cluster without
quorum. I am of the mind that you *can* build a very reliable 2-node
cluster. In fact, every cluster our company has deployed, going back
over five years, has been 2-node, and they have had exceptional uptimes.
The confusion comes from the belief that quorum is required and stonith
is optional. The reality is the opposite. I'll come back to this in a
minute.
In a two-node cluster, you have two concerns:
1. If communication between the nodes fails, but both nodes are alive,
how do you avoid a split brain?
2. If you have a two-node cluster and enable cluster startup on boot,
how do you avoid a fence loop?
Many answer #1 by saying "you need a quorum node to break the tie". In
some cases, this works, but only when all nodes are behaving in a
predictable manner.
Many answer #2 by saying "well, with three nodes, if a node boots and
can't talk to either other node, it is inquorate and won't do anything".
This is a valid mechanism, but it is not the only one.
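To make the quorum arithmetic behind that answer concrete, here is a
minimal Python sketch. It assumes one vote per node and a plain
strict-majority rule; the function names are mine for illustration and
are not part of Corosync or Pacemaker.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def is_quorate(votes_seen: int, total_votes: int) -> bool:
    """True if a partition holding votes_seen votes has a majority."""
    return votes_seen >= quorum_threshold(total_votes)


if __name__ == "__main__":
    # A node that boots alone in a 3-node cluster sees only its own vote.
    print("lone node of 3 quorate:", is_quorate(1, 3))
    # Two of three nodes that can talk to each other hold a majority.
    print("two nodes of 3 quorate:", is_quorate(2, 3))
    # With a plain majority rule, a lone node of two never has quorum.
    print("lone node of 2 quorate:", is_quorate(1, 2))
```

The last case is why a 2-node cluster cannot lean on a simple majority
to break ties and needs a different mechanism.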
So let me answer these from a 2-node perspective:
1. You use stonith; the faster node lives and the slower node dies. From
the moment of comms failure, the cluster blocks (needed with quorum,
too) and doesn't restore operation until the (slower) peer is in a known
state: off. You can bias this by setting a fence delay against your
preferred node. So say node 1 is the node that normally hosts your
services, then you add 'delay="15"' to node 1's fence method. This tells
node 2 to wait 15 seconds before fencing node 1. Node 1 has no such
delay against node 2, so if both nodes are alive, node 1 fences node 2
immediately and node 2 is dead before its timer expires.
2. In Corosync v2+, there is a 'wait_for_all' option that tells a node
to not do anything until it is able to talk to the peer node. So in the
case of a fence after a comms break, the node that reboots will come up,
fail to reach the survivor node and do nothing more. Perfect.
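The two answers above can be sketched as a toy model in Python. This is
only an illustration under simplifying assumptions: fencing is treated
as instantaneous once its delay elapses, and 'wait_for_all' is reduced
to a single "have I seen my peer since boot?" flag. The function and
parameter names are hypothetical, not Corosync or Pacemaker APIs.

```python
from typing import Dict, Tuple


def fence_race(delay_against: Dict[str, float]) -> Tuple[str, str]:
    """Return (survivor, fenced) when comms break and both nodes are alive.

    delay_against[target] is how many seconds the peer waits before it
    fences target. The node with the shorter delay against it dies first.
    """
    if len(delay_against) != 2:
        raise ValueError("this sketch models exactly two nodes")
    (a, delay_a), (b, delay_b) = delay_against.items()
    if delay_a == delay_b:
        # With equal delays the winner depends on timing we do not model.
        raise ValueError("equal delays: outcome is not deterministic")
    fenced = a if delay_a < delay_b else b
    survivor = b if fenced == a else a
    return survivor, fenced


def may_act_alone(wait_for_all: bool, seen_peer_since_boot: bool) -> bool:
    """Whether a freshly booted node may start services or fence its peer."""
    return (not wait_for_all) or seen_peer_since_boot


if __name__ == "__main__":
    # delay="15" against node 1: node 2 waits, node 1 does not.
    survivor, fenced = fence_race({"node1": 15.0, "node2": 0.0})
    print("survivor:", survivor, "fenced:", fenced)
    # The fenced node reboots and still cannot reach the survivor.
    print("rebooted node acts:", may_act_alone(True, False))
```

With the flag turned off, 'may_act_alone(False, False)' is true, which
is the case where the rebooted node would go on to fence the survivor
and start a fence loop.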
Now let me come back to quorum vs. stonith.
Said simply: quorum is a tool for when everything is working. Fencing is
a tool for when things go wrong.
Let's assume that your cluster is working fine, then for whatever reason,
node 1 hangs hard. At the time of the freeze, it was hosting a virtual
IP and an NFS service. Node 2 declares node 1 lost after a period of
time and decides it needs to take over:
In the 3-node scenario, without stonith, node 2 reforms a cluster with
node 3 (quorum node), decides that it is quorate, starts its NFS server
and takes over the virtual IP. So far, so good... Until node 1 comes out
of its hang. At that moment, node 1 has no idea time has passed. It has
no reason to think "am I still quorate? Are my locks still valid?" It
just finishes whatever it was in the middle of doing and bam,
split-brain. At the least, you have two nodes claiming the same IP at
the same time. At worst, you have uncoordinated writes to shared storage
and you've corrupted your data.
In the 2-node scenario, with stonith, node 2 is always quorate, so after
declaring node 1 lost, it moves to fence node 1. Once node 1 is fenced,
*then* it starts NFS, takes over the virtual IP and restores services.
In this case, no split-brain is possible because node 1 has rebooted and
comes up with a fresh state (or it's on fire and never coming back anyway).
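The difference between the two scenarios can be reduced to a few lines
of Python. This is a deliberately crude sketch: it tracks only who is
claiming the virtual IP, and the names are made up for illustration.

```python
from typing import Set


def active_ip_claimants(fenced_first: bool, node1_wakes_up: bool) -> Set[str]:
    """Nodes actively using the virtual IP after node 2 takes over."""
    claimants = {"node2"}  # node 2 takes over in both scenarios
    if not fenced_first and node1_wakes_up:
        # Node 1 resumes with stale state and carries on using the IP.
        claimants.add("node1")
    return claimants


def is_split_brain(claimants: Set[str]) -> bool:
    """More than one node claiming the same resource is a split brain."""
    return len(claimants) > 1


if __name__ == "__main__":
    # Quorum only, no stonith: node 1 comes out of its hang.
    print("no fencing:", is_split_brain(active_ip_claimants(False, True)))
    # Stonith first: node 1 was powered off before node 2 took over.
    print("fenced first:", is_split_brain(active_ip_claimants(True, True)))
```

Once node 1 has been fenced, whether it "wakes up" no longer matters,
because it comes back without its old state.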
This is why quorum is optional and stonith/fencing is not.
Now, with this said, I won't say that 3+ node clusters are bad. They're
fine if they suit your use-case, but even with 3+ nodes you still must
use stonith.
My *personal* argument in favour of 2-node clusters over 3+ nodes is this:
A cluster is not beautiful when there is nothing left to add. It is
beautiful when there is nothing left to take away.
In availability clustering, nothing should ever be more important than
availability, and availability is a product of simplicity. So in my
view, a 3-node cluster adds complexity that is avoidable, and so is
sub-optimal.
I'm happy to answer any questions you have on my comments/point of view
on this.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould