[Pacemaker] Split-brain after

Digimer linux at alteeve.com
Thu Aug 11 13:12:25 EDT 2011


On 08/11/2011 12:58 PM, Alex Forster wrote:
> I have a two node Pacemaker/Corosync cluster with no resources configured yet.
> I'm running RHEL 6.1 with the official 1.1.5-5.el6 package.
> 
> While doing various network configuration, I happened to notice that if I issue
> a "service network restart" on one node, then approx. four seconds later issue
> "service network restart" on the second node, the two nodes become split brain,
> each thinking the other is offline.
> 
> Obviously, issuing 'service network restarts' four seconds apart will not be a
> common occurrence in production, but it concerns me that I can 'trick' the nodes
> into becoming split-brain so easily. Is there some way I can configure Corosync
> to quickly recover from this scenario?
> 
> Alex

Configuring fencing (stonith) will protect against split-brain: when the
nodes lose contact with each other, one of them forces the other offline
(rough, but better than split-brain).
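
As a rough sketch of what that could look like with the crm shell, assuming
IPMI-based fencing through the fence_ipmilan agent: the node names (node1,
node2), the resource names, the addresses, and the credentials below are all
placeholders, not values from this cluster, and the right agent depends on
what fence hardware you actually have.

```shell
# Sketch only: node names, IPMI addresses and credentials are placeholders.
# One fence device per node, each able to power off the node it names.
crm configure primitive fence-node1 stonith:fence_ipmilan \
    params pcmk_host_list="node1" ipaddr="192.0.2.11" login="admin" passwd="secret" \
    op monitor interval="60s"
crm configure primitive fence-node2 stonith:fence_ipmilan \
    params pcmk_host_list="node2" ipaddr="192.0.2.12" login="admin" passwd="secret" \
    op monitor interval="60s"

# Keep each fence device off the node it is meant to fence.
crm configure location l-fence-node1 fence-node1 -inf: node1
crm configure location l-fence-node2 fence-node2 -inf: node2

# Tell Pacemaker to actually use fencing.
crm configure property stonith-enabled="true"
```

With something like this in place, a node that loses contact with its peer
should try to fence it rather than carry on independently. Test it by
deliberately breaking the network before trusting it in production.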

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"At what point did we forget that the Space Shuttle was, essentially,
a program that strapped human beings to an explosion and tried to stab
through the sky with fire and math?"



