[ClusterLabs] Recovering after split-brain

Digimer lists at alteeve.ca
Tue Jun 21 13:41:59 EDT 2016


On 21/06/16 01:27 PM, Dimitri Maziuk wrote:
> On 06/21/2016 12:13 PM, Andrei Borzenkov wrote:
> 
>> You should not run pacemaker without some sort of fencing. This need not
>> be network-controlled power socket (and tiebreaker is not directly
>> related to fencing).
> 
> Yes it can be sysadmin-controlled power socket. It has to be a power
> socket, if you don't trust me, read Dejan's list of fencing devices.

You can now use redundant and complex fencing configurations in pacemaker.

Our company always uses this setup:

IPMI is the primary fence method (when it works, we can trust 'off'
100%, but it draws power from the host and is thus vulnerable to the
host losing power).

A pair of switched PDUs is the backup fence method (when it works, you
can be confident that the outlets are off, provided the power cables
are plugged into the right outlets. It is, however, entirely external
to the target).
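
To make the fallback order concrete, here is a rough Python sketch of
the logic, not Pacemaker's actual code. It assumes each node is fed by
both PDUs, so the backup method only counts as successful when both
outlets are cut. The function and device names (fence_node,
ipmi_off_unreachable, pdu_a_off, pdu_b_off) are hypothetical stand-ins
for real fence agents.

```python
from typing import Callable, List

# A fence action takes the target node name and reports success or failure.
FenceAction = Callable[[str], bool]


def fence_node(target: str, levels: List[List[FenceAction]]) -> bool:
    """Try each fence method in order.

    A method succeeds only if every device in it succeeds; otherwise
    the next method is tried.
    """
    for devices in levels:
        if all(device(target) for device in devices):
            return True
    return False


# Hypothetical stand-ins for real fence agents.
def ipmi_off_unreachable(target: str) -> bool:
    # The host lost power, so its BMC is dead too and cannot confirm 'off'.
    return False


def pdu_a_off(target: str) -> bool:
    return True


def pdu_b_off(target: str) -> bool:
    return True


# Method 1: IPMI. Method 2: both PDUs (assumed: one per power supply).
levels = [[ipmi_off_unreachable], [pdu_a_off, pdu_b_off]]
fenced = fence_node("node2", levels)
print(fenced)
```

With IPMI unreachable, the PDU pair still confirms the fence. With IPMI
as the only method, the same failure would leave the cluster blocked.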

> Tiebreaking is directly related to figuring out which of the two nodes
> is to be fenced, because neither of them can tell on its own.

See my comment on 'delay="15"'. You do NOT need a 3-node cluster or a
tie-breaker. We've run nothing but 2-node clusters for years all over
North America, and we've heard of people running our system globally.
With the above fence setup and a proper delay, it has never once been a
problem.
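
As a rough illustration of why the delay settles a 2-node fence race,
here is a small Python model. It is a sketch under stated assumptions,
not how Pacemaker implements it: both nodes start fencing each other at
the same moment, the delay is applied before the delayed node is fenced,
and every fence action takes the same time. The function name
split_brain_survivor and the fence_time value are made up for the
example.

```python
from typing import Dict, Optional


def split_brain_survivor(
    fence_delay: Dict[str, float], fence_time: float = 2.0
) -> Optional[str]:
    """Return the node left running after a 2-node fence race.

    fence_delay[n] is the delay, in seconds, applied before node n is
    fenced. fence_time is the assumed duration of one fence action.
    Returns None when neither node has an advantage.
    """
    node_a, node_b = fence_delay  # expects exactly two nodes
    off_a = fence_delay[node_a] + fence_time  # when node_a would go down
    off_b = fence_delay[node_b] + fence_time  # when node_b would go down
    if off_a == off_b:
        # No delay difference: both fence calls can land at once.
        return None
    # The node that would be powered off later wins the race.
    return node_a if off_a > off_b else node_b


print(split_brain_survivor({"node1": 15.0, "node2": 0.0}))  # prints node1
```

In this model the 15 second delay protects node1, so it fences node2
first and carries on. With equal delays there is no predictable winner,
which is the case the delay is there to avoid.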

>> I fail to see how heartbeat makes any difference here, sorry.
> 
> Third node and remote-controlled PDU were not a requirement for
> haresources mode. If I wanted to run it so that when it breaks I get to
> keep the pieces, I could.

You technically can in pacemaker, too, but it's dumb in any HA
environment. As soon as you make assumptions, you open up the chance of
being wrong.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



