[ClusterLabs] Antw: Re: 2-Node Cluster Pointless?

Tue Apr 18 07:47:10 UTC 2017

>>> Digimer <lists at alteeve.ca> wrote on 16.04.2017 at 20:17 in message
<12cde13f-8bad-a2f1-6834-960ff3afce6c at alteeve.ca>:
> On 16/04/17 01:53 PM, Eric Robinson wrote:
>> I was reading in "Clusters from Scratch" where Beekhof states, "Some would
>> argue that two-node clusters are always pointless, but that is an argument
>> for another time." Is there a page or thread where this argument has been
>> fleshed out? Most of my dozen clusters are 2 nodes. I hate to think they're
>> pointless.
>> 
>> --
>> Eric Robinson
> 
> There is a belief that you can't build a reliable cluster without
> quorum. I am of the mind that you *can* build a very reliable 2-node
> cluster. In fact, every cluster our company has deployed, going back
> over five years, has been 2-node and has had exceptional uptime.
> 
> The confusion comes from the belief that quorum is required and stonith
> is optional. The reality is the opposite. I'll come back to this in a minute.
> 
> In a two-node cluster, you have two concerns:
> 
> 1. If communication between the nodes fails, but both nodes are alive,
> how do you avoid a split brain?

By killing one of the two parties.

> 
> 2. If you have a two-node cluster and enable cluster startup on boot,
> how do you avoid a fence loop?

I think the problem in the question is using "you" instead of "it" ;-)
Pacemaker assumes all problems that cause STONITH will be solved by STONITH.
That's not always true (e.g. configuration errors). Maybe a node's failcount
should not be reset if the node was fenced.
So you'll avoid a fencing loop, but might end up in a state where no resources
are running. I'd prefer that over a fencing loop.

> 
> Many answer #1 by saying "you need a quorum node to break the tie". In
> some cases, this works, but only when all nodes are behaving in a
> predictable manner.

All software relies on the fact that it behaves in a predictable manner, BTW.
The problem is not "the predictable manner for all nodes", but the predictable
manner for the cluster.

> 
> Many answer #2 by saying "well, with three nodes, if a node boots and
> can't talk to either other node, it is inquorate and won't do anything".

"won't do anything" is also wrong: It must go offline without killing others,
preferably.

> This is a valid mechanism, but it is not the only one.
> 
> So let me answer these from a 2-node perspective;
> 
> 1. You use stonith and the faster node lives, the slower node dies. From

Isn't there a possibility that both nodes shoot each other? Is there a
guarantee that there will always be one faster node?

> the moment of comms failure, the cluster blocks (needed with quorum,
> too) and doesn't restore operation until the (slower) peer is in a known
> state: off. You can bias this by setting a fence delay against your
> preferred node. So say node 1 is the node that normally hosts your
> services, then you add 'delay="15"' to node 1's fence method. This tells
> node 2 to wait 15 seconds before fencing node 1. If both nodes are
> alive, node 2 will be fenced before the timer expires.
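
To make the race concrete, here is a toy Python model of the delay mechanism
described above. It is only a sketch under stated assumptions: both nodes start
fencing at the moment of comms failure, a shot lands after the victim's
configured delay plus the attacker's fence latency, and whichever shot lands
first wins. The function name, the node names and the latency figures are made
up for illustration; this is not how Pacemaker or any fence agent is
implemented.

```python
def fence_race(delay, latency):
    """Toy model of a two-node fence race; returns the set of nodes that die.

    delay[n]   -- seconds the peer must wait before fencing node n
                  (the 'delay' attribute on n's fence method)
    latency[n] -- seconds node n's own fence action needs to take effect
                  (hypothetical figure, for illustration only)
    """
    a, b = sorted(delay)
    # Each node's shot lands after the victim's delay plus its own latency.
    lands = {a: delay[b] + latency[a], b: delay[a] + latency[b]}
    if lands[a] == lands[b]:
        # Dead heat: in this model both shots land and both nodes die.
        return {a, b}
    winner = min(lands, key=lands.get)
    return {b} if winner == a else {a}


# node 1 is the preferred node, so it carries the 15-second delay.
with_delay = fence_race({"node1": 15, "node2": 0}, {"node1": 2, "node2": 2})
no_delay = fence_race({"node1": 0, "node2": 0}, {"node1": 2, "node2": 2})
```

In this model the delay leaves only node 2 dead, while identical settings
without a delay end in a double kill, which is the case the delay is meant to
exclude.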

Can only the DC issue fencing?

> 
> 2. In Corosync v2+, there is a 'wait_for_all' option that tells a node
> to not do anything until it is able to talk to the peer node. So in the
> case of a fence after a comms break, the node that reboots will come up,
> fail to reach the survivor node and do nothing more. Perfect.
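
The startup behaviour described above can be sketched as a small state
machine. This is a toy model in Python, assuming only what the paragraph says:
a freshly booted node stays passive until it has seen its peer once. The class
and method names are hypothetical and do not correspond to Corosync's code or
API.

```python
class TwoNodeGate:
    """Toy model of a two-node cluster member with a wait_for_all-style gate."""

    def __init__(self, wait_for_all=True):
        self.wait_for_all = wait_for_all
        self.seen_peer_since_boot = False

    def membership_update(self, peer_visible):
        """Record one membership change; return True if the node may act."""
        if peer_visible:
            self.seen_peer_since_boot = True
        # Without the gate, a lone node may act right away.
        # With it, the node must have seen the peer at least once since boot.
        return self.seen_peer_since_boot or not self.wait_for_all


rebooted = TwoNodeGate(wait_for_all=True)
after_boot = rebooted.membership_update(peer_visible=False)   # False
after_join = rebooted.membership_update(peer_visible=True)    # True
after_loss = rebooted.membership_update(peer_visible=False)   # True
```

A fenced node that reboots into a still-broken network stays at the first
step, so it never gets as far as fencing the survivor. Once the peers have met,
a later loss of the peer no longer blocks the node from acting.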

Does "do nothing more" mean continuously polling for other nodes?

> 
> Now let me come back to quorum vs. stonith;
> 
> Simply put: quorum is a tool for when everything is working. Fencing is
> a tool for when things go wrong.

I'd say: Quorum is the tool to decide who'll be alive and who's going to die,
and STONITH is the tool to make nodes die. If everything is working you need
neither quorum nor STONITH.

> 
> Let's assume that your cluster is working fine, then for whatever reason,
> node 1 hangs hard. At the time of the freeze, it was hosting a virtual
> IP and an NFS service. Node 2 declares node 1 lost after a period of
> time and decides it needs to take over;

In case node 1 is DC, doesn't the election of a new DC come first, with the new
DC then doing the STONITH?

> 
> In the 3-node scenario, without stonith, node 2 reforms a cluster with
> node 3 (quorum node), decides that it is quorate, starts its NFS server
> and takes over the virtual IP. So far, so good... Until node 1 comes out

Again if node 1 was DC, it's not that simple.

> of its hang. At that moment, node 1 has no idea time has passed. It has

You assume no fencing was done...

> no reason to think "am I still quorate? Are my locks still valid?" It
> just finishes whatever it was in the middle of doing and bam,
> split-brain. At the least, you have two nodes claiming the same IP at
> the same time. At worst, you have uncoordinated writes to shared storage
> and you've corrupted your data.

But that's no cluster; that's a mess ;-)

> 
> In the 2-node scenario, with stonith, node 2 is always quorate, so after
> declaring node 1 lost, it moves to fence node 1. Once node 1 is fenced,
> *then* it starts NFS, takes over the virtual IP and restores services.

So you compare "2 nodes + fencing" to "3 nodes without fencing"?

> In this case, no split-brain is possible because node 1 has rebooted and
> comes up with a fresh state (or it's on fire and never coming back anyway).
> 
> This is why quorum is optional and stonith/fencing is not.
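
The ordering in the 2-node scenario above (fence first, recover second) can be
written down as a short Python sketch. It assumes a fence call that reports
success or failure and treats a failed fence as "stay blocked", in line with
the earlier statement that the cluster does not restore operation until the
peer is in a known state. The function name, the callback and the service
names are placeholders, not Pacemaker interfaces.

```python
def take_over(fence_peer, services, log):
    """Recover services only after the lost peer is confirmed fenced."""
    log.append("peer declared lost")
    if not fence_peer():
        # Peer state is unknown, so starting services could split the brain.
        log.append("fence failed, staying blocked")
        return False
    log.append("peer fenced")
    for name in services:
        log.append("start " + name)
    return True


ok_log, blocked_log = [], []
recovered = take_over(lambda: True, ["nfs", "virtual-ip"], ok_log)
blocked = take_over(lambda: False, ["nfs", "virtual-ip"], blocked_log)
```

In the successful run the log shows the fence completing before either service
starts; in the failed run no service is started at all.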

You did not convince me how only one node gets to fence the other
without a quorum: Wouldn't both nodes shoot at each other? (I have cited this
many times, but once again: In HP-UX Service Guard, a lock disk was used as a
tie-breaker: Only one node succeeded in getting the lock, and the other
committed suicide (via kernel watchdog timeout).)
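
The lock-disk idea boils down to an atomic "first one wins" acquisition. A
minimal Python sketch, assuming the lock disk behaves like a non-blocking
mutex that is never released during the contest; the threads stand in for
nodes and the names are invented, so this illustrates the principle only and
not Service Guard's actual mechanism.

```python
import threading


def tie_break(node_names):
    """Let every node try once for the lock; exactly one of them wins."""
    lock_disk = threading.Lock()  # stands in for the shared lock disk
    outcome = {}

    def contend(name):
        # Non-blocking acquire: at most one contender can ever succeed,
        # because the lock is never released again.
        if lock_disk.acquire(blocking=False):
            outcome[name] = "survives"
        else:
            outcome[name] = "suicide"

    threads = [threading.Thread(target=contend, args=(n,)) for n in node_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


result = tie_break(["node1", "node2"])
```

Which node wins is not predictable here, but there is always exactly one
survivor, which is the property the tie-breaker is there to provide.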

> 
> Now, with this said, I won't say that 3+ node clusters are bad. They're
> fine if they suit your use-case, but even with 3+ nodes you still must
> use stonith.
> 
> My *personal* argument in favour of 2-node clusters over 3+ nodes is this:

Again: You compare "2 nodes with fencing" to "3 nodes without fencing". My
personal vote would be "3 nodes with fencing" if there is enough work for two
nodes.

> 
> A cluster is not beautiful when there is nothing left to add. It is
> beautiful when there is nothing left to take away.
> 
> In availability clustering, nothing should ever be more important than
> availability, and availability is a product of simplicity. So in my
> view, a 3-node cluster adds complexity that is avoidable, and so is
> sub-optimal.

IMHO: valid cluster software works starting at 1 node, then by induction
also for n+1 nodes. Complexity should grow only linearly with the number of
nodes. Of course you shouldn't add nodes just for the sake of having more
nodes, but for the actual need.

Regards,
Ulrich

> 
> I'm happy to answer any questions you have on my comments/point of view
> on this.
> 
> -- 
> Digimer
> Papers and Projects: https://alteeve.com/w/ 
> "I am, somehow, less interested in the weight and convolutions of
> Einstein’s brain than in the near certainty that people of equal talent
> have lived and died in cotton fields and sweatshops." - Stephen Jay Gould