[ClusterLabs] Node Fencing and STONITH in Amazon Web Services

Wed Aug 31 11:05:57 EDT 2016

On 08/29/2016 01:23 PM, Enno Gröper wrote:
> Hi,
> 
> Am 26.08.2016 um 22:32 schrieb Jason A Ramsey:
>> No users
>> would be connecting to the severed instances, but background and system
>> tasks would proceed as normal, potentially writing new data to the
>> databases making rejoining the nodes to the cluster a little bit tricky
>> to say the least, especially if the severed side’s network comes back up
>> and both systems come to realize that they’re not consistent.
> I think you forgot a potential routing problem. Network is not simply
> on/off.
> It could be that you have a split brain scenario, where your 2 nodes
> don't see each other. But it is perfectly possible that at the same
> time both nodes are seen by / interacting with some of your users
> (depending on their location/routing). So it isn't just some background
> tasks potentially writing data to the databases. It may be real user data.
> 
> I'm interested in this topic as this is a general cloud problem: To my
> knowledge you simply don't have any out-of-band messaging channel
> between your nodes to avoid split brain or make STONITH possible.
> At least in OpenStack (this is what I know), everything running on the
> node (vm) needs to go through client networking, which could be
> malfunctioning. Even if it is, in theory, possible to issue API calls to
> shut down the other node, these calls would still need to go through the
> same messaging channel (client network).
> 
> A solution could be to use DBaaS (reliable database provided by cloud
> service provider). Don't know if any csp provides database replication
> across different sites.
> I simply don't see a way to reliably solve this using Pacemaker (without
> out-of-band messaging channel and/or reliable STONITH).
> Imho for DBaaS you need to look carefully at the SLA / specs to see if
> your DBaaS really provides what you want.
> 
> My 2 cents
> Enno

This is an interesting and unsettled area of HA.

The strict view is that clouds simply aren't suitable HA platforms --
the inability to have reliable fencing, and the potential for load from
other customers to interfere with corosync's ability to maintain
realtime coherence, make true HA impossible.

The broader view is that some people are willing to make trade-offs to
get "closer to HA", so can anything be done to minimize the problem
space? Any answer will have to address the two concerns above.

With fencing, there is a fence agent floating out there that uses the
AWS API, but I heard that it hasn't been kept up to date and doesn't
work as-is with current AWS. It should be feasible for someone to get
this in working shape again (and something similar with any cloud
provider that offers an API to forcibly shut down an instance). Of
course, this assumes that the API is reachable when fencing is needed,
but I think that's a trade-off most people would be willing to make. If
the nodes are
distributed across AWS zones (or whatever they're called), it would
probably be ideal to have at least two nodes in each zone (so each can
fence the other if it becomes unresponsive), and use booth, qdevice,
poison pill, or something similar to handle when multiple zones are up
but can't see each other.
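
To make the API-based fencing idea concrete, here is a minimal sketch of
the core of such an agent. It is not the existing fence agent mentioned
above, and it is not a complete Pacemaker fence agent (no option parsing,
no metadata, no monitor/status actions). The function name
fence_instance and its timeout/poll_interval parameters are made up for
illustration. The ec2 argument is assumed to behave like a boto3 EC2
client (boto3.client("ec2")), and only its stop_instances and
describe_instances calls are used. The important property is that it
reports success only after the API confirms the instance is stopped.

```python
import time


def fence_instance(ec2, instance_id, timeout=60.0, poll_interval=2.0):
    """Force-stop an instance; return True only once it is confirmed stopped.

    `ec2` is assumed to behave like a boto3 EC2 client. It is passed in
    rather than created here so the logic can be tried without AWS access.
    """
    # Force=True asks EC2 to stop the instance without waiting for a
    # clean shutdown of the guest OS.
    ec2.stop_instances(InstanceIds=[instance_id], Force=True)

    deadline = time.monotonic() + timeout
    while True:
        reply = ec2.describe_instances(InstanceIds=[instance_id])
        state = reply["Reservations"][0]["Instances"][0]["State"]["Name"]
        if state == "stopped":
            return True
        if time.monotonic() >= deadline:
            # Not confirmed off in time: fencing must be treated as failed.
            return False
        time.sleep(poll_interval)
```

If the API cannot be reached at all, the client call raises and no
success is reported, which is the trade-off described above: the
cluster blocks recovery rather than guessing that the peer is down.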