[Pacemaker] DRBD primary/primary + Pacemaker goes into split brain after crm node standby/online

Digimer lists at alteeve.ca
Mon Jun 9 23:28:48 EDT 2014


On 09/06/14 07:44 PM, Andrew Beekhof wrote:
>
> On 10 Jun 2014, at 4:07 am, Alexis de BRUYN <alexis.mailinglist at de-bruyn.fr> wrote:
>
>> Hi Everybody,
>>
>> I have an issue with a 2-node Debian Wheezy primary/primary DRBD
>> Pacemaker/Corosync configuration.
>>
>> After a 'crm node standby' then a 'crm node online', the DRBD volume
>> stays in a 'split brain state' (cs:StandAlone ro:Primary/Unknown).
>>
>> A soft or hard reboot of one node gets rid of the split brain and/or
>> doesn't create one.
>>
>> I have followed http://www.drbd.org/users-guide-8.3/ and keep my tests
>> as simple as possible (no activity and no filesystem on the DRBD volume).
>>
>> I don't see what I am doing wrong. Could anybody help me with this, please?
>
> There could be a pacemaker bug.
> Master/slave resources are quite complex internally and have received many improvements in the years since 1.1.7.
> So simply upgrading pacemaker could be the answer.

In addition, set up and test stonith in pacemaker, then hook DRBD's
fencing into pacemaker (set 'fencing resource-and-stonith;' and
'fence-peer /path/to/crm-fence-peer.sh'). This way, if DRBD is about to
split-brain, it will instead block and call a fence, and stay blocked
until the fence succeeds. It will only resume when the peer is in a
known state (off), thus avoiding split-brains entirely.
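
As a rough sketch only, the DRBD side of that could look something like
the fragment below for 8.3. The resource name 'r0' is a placeholder I
made up, and '/path/to/' is wherever your distro installs the DRBD
helper scripts. Note that the handler is spelled 'fence-peer' in
drbd.conf. The 'after-resync-target' line is my addition: it is the
usual companion that clears the fence constraint once the peer has
resynced.

```
resource r0 {                      # 'r0' is a hypothetical resource name
  disk {
    # Block I/O and call the fence-peer handler when the peer is lost
    fencing resource-and-stonith;
  }
  handlers {
    # Asks pacemaker to fence the peer; adjust the path for your install
    fence-peer "/path/to/crm-fence-peer.sh";
    # Assumed companion handler: removes the constraint after resync
    after-resync-target "/path/to/crm-unfence-peer.sh";
  }
  # ... rest of your existing resource definition unchanged ...
}
```

This only helps if stonith is actually configured and tested in
pacemaker first; without a working fence device the handler has nothing
to call and DRBD will just stay blocked.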

And, as Andrew said, upgrade pacemaker. :)

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



