[ClusterLabs] NFS in different subnets

Sat Apr 18 15:39:45 EDT 2020

On 2020-04-18 2:48 a.m., Strahil Nikolov wrote:
> On April 18, 2020 8:43:51 AM GMT+03:00, Digimer <lists at alteeve.ca> wrote:
>> For what it's worth; A lot of HA specialists spent a lot of time trying
>> to find the simplest _reliable_ way to do multi-site/geo-replicated HA.
>> I am certain you'll find a simpler solution, but I would also wager that
>> when it counts, it's going to let you down.
>>
>> The only way to make things simpler is to start making assumptions, and
>> if you do that, at some point you will end up with a split-brain (both
>> sites thinking the other is gone and trying to take the primary role) or
>> both sites will think the other is running, and neither will be. Add
>> shared storage to the mix, and there's a high chance you will corrupt
>> data when you need it most.
>>
>> Of course, there's always a chance you'll come up with a system no one
>> else has thought of, just be aware of what you know and what you don't.
>> HA is fun, in big part, because it's a challenge to get right.
>>
>> digimer
>>
> 
> I don't get something.
> 
> Why can't this be done?
> 
> One node is in siteA, one in siteB, and qnet is at a third location. Routing between the 2 subnets is established and symmetrical.
> Fencing via IPMI or SBD (for example, from an HA iSCSI cluster) is configured.
> 
> The NFS resource is started on one node and a special RA is used for the DNS records. If node1 dies, the cluster will fence it and node2 will bring up NFS and update the records.
> 
> Of course, updating DNS from only one side must work for both sites.
> 
> Best Regards,
> Strahil Nikolov

It comes down to differentiating between a link loss to a site versus
the destruction/loss of the site. In either case, you can't fence the
lost node, so what do you do? If you decide that you don't need to fence
it, then you face all the issues of any other normal cluster with broken
or missing fencing. It's just a question of time before you assume wrong
and end up with a split brain / data divergence / data loss.
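
To make the point above concrete, here is a minimal sketch in Python. It is
a toy model, not Pacemaker or Corosync code: the names Failure, Observation,
observe and may_take_over are made up for illustration, and it assumes the
peer's fence device (IPMI, for example) sits at the remote site and is only
reachable over the inter-site link.

```python
from dataclasses import dataclass
from enum import Enum


class Failure(Enum):
    NONE = "none"
    LINK_LOSS = "link loss"   # remote site is alive but unreachable
    SITE_LOSS = "site loss"   # remote site is destroyed or powered off


@dataclass(frozen=True)
class Observation:
    """What the surviving node can actually see (hypothetical model)."""
    peer_reachable: bool
    fence_confirmed: bool


def observe(failure: Failure) -> Observation:
    # Assumption: the fence device lives at the remote site, so a fence
    # request can only be confirmed while the inter-site link works.
    if failure is Failure.NONE:
        return Observation(peer_reachable=True, fence_confirmed=False)
    # Link loss and site loss look identical from here: the peer is
    # silent and the fence request can never be confirmed.
    return Observation(peer_reachable=False, fence_confirmed=False)


def may_take_over(obs: Observation) -> bool:
    # Recovery is only safe once the peer is known to be off.
    return (not obs.peer_reachable) and obs.fence_confirmed


if __name__ == "__main__":
    for failure in Failure:
        print(failure.value, may_take_over(observe(failure)))
```

In this model both failure cases produce the same observation, so any rule
that lets the survivor take over after a site loss also lets it take over
after a link loss, while the other node may still be running the service.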

Booth was designed the way it was to solve this problem, by having
"a cluster of clusters". If a site is lost because of
a comms break, you can trust the cluster at the site to act in a
predictable way. This is only possible because that site is a
self-contained HA cluster, so it can be confidently assumed that it will
shut down services when it loses contact with the peer and quorum sites.
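
The "predictable way" described above can be sketched as a time-limited
lease. This is a simplified illustration in Python, not Booth's actual
implementation or API: Site, renew, may_run_services and
may_grant_to_other_site are hypothetical names, and the model assumes each
site reliably stops its services once its own lease runs out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """A self-contained cluster that may hold a time-limited lease."""
    name: str
    lease_expires_at: Optional[float] = None  # None means no lease held

    def renew(self, now: float, lease_seconds: float,
              can_reach_majority: bool) -> None:
        # The lease is only extended while the site can still reach the
        # peer and/or arbitrator sites; otherwise it is left to run out.
        if can_reach_majority:
            self.lease_expires_at = now + lease_seconds

    def may_run_services(self, now: float) -> bool:
        # Assumption: the local cluster stops services at lease expiry.
        return self.lease_expires_at is not None and now < self.lease_expires_at


def may_grant_to_other_site(now: float, last_known_expiry: float) -> bool:
    # The remaining sites wait until the old lease must have expired
    # before letting another site take over.
    return now >= last_known_expiry


if __name__ == "__main__":
    a = Site("siteA")
    a.renew(now=0.0, lease_seconds=60.0, can_reach_majority=True)
    a.renew(now=30.0, lease_seconds=60.0, can_reach_majority=False)
    print(a.may_run_services(59.0), a.may_run_services(60.0))
    print(may_grant_to_other_site(60.0, a.lease_expires_at))
```

The design point is that the other sites never need to fence or even reach
the isolated site; they only need to trust that it gives up on its own
within a known time, which is what a properly fenced local cluster provides.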

The only safe way to operate without this setup over a stretch cluster
is to accept that a comms loss or site loss hangs the cluster until a
human intervenes, but then, that's not really HA now.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould