[Pacemaker] DRBD primary/primary + Pacemaker goes into split brain after crm node standby/online

Alexis de BRUYN alexis.mailinglist at de-bruyn.fr
Wed Jun 11 10:13:18 EDT 2014


On 10.06.2014 01:44, Andrew Beekhof wrote:
> 
> On 10 Jun 2014, at 4:07 am, Alexis de BRUYN <alexis.mailinglist at de-bruyn.fr> wrote:
> 
>> Hi Everybody,
>>
>> I have an issue with a 2-node Debian Wheezy primary/primary DRBD
>> Pacemaker/Corosync configuration.
>>
>> After a 'crm node standby' then a 'crm node online', the DRBD volume
>> stays in a 'split brain state' (cs:StandAlone ro:Primary/Unknown).
>>
>> A soft or hard reboot of one node gets rid of the split brain and/or
>> doesn't create one.
>>
>> I have followed http://www.drbd.org/users-guide-8.3/ and keep my tests
>> as simple as possible (no activity and no filesystem on the DRBD volume).
>>
>> I don't see what I am doing wrong. Could anybody help me with this, please?
> 
> There could be a pacemaker bug.  
> Master/slave resources are quite complex internally and have received many improvements in the years since 1.1.7.
> So simply upgrading pacemaker could be the answer.

Hi Andrew,

I have followed your advice and updated Pacemaker/Corosync by installing
a fresh Debian Sid, but I still have the issue with the following packages:

# uname -a
Linux testvm1 3.13-1-amd64 #1 SMP Debian 3.13.10-1 (2014-04-15) x86_64 GNU/Linux

# cat /etc/issue && dpkg -l | egrep "corosync|pacemaker|drbd"
Debian GNU/Linux jessie/sid \n \l

ii  corosync                       1.4.6-1                     amd64  Standards-based cluster framework (daemon and modules)
ii  crmsh                          1.2.6+git+e77add-1.2        amd64  CRM shell for the pacemaker cluster manager
ii  drbd8-utils                    2:8.4.4-1                   amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  pacemaker                      1.1.10+git20130802-4        amd64  HA cluster resource manager
ii  pacemaker-cli-utils            1.1.10+git20130802-4        amd64  Command line interface utilities for Pacemaker

And with the "experimental" packages, I also cannot connect to the
cluster via crmsh:

# cat /etc/issue && dpkg -l | egrep "corosync|pacemaker|drbd"
Debian GNU/Linux jessie/sid \n \l

ii  corosync                       2.3.3-1                     amd64  Standards-based cluster framework (daemon and modules)
ii  crmsh                          1.2.6+git+e77add-1.2        amd64  CRM shell for the pacemaker cluster manager
ii  drbd8-utils                    2:8.4.4-1                   amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  libcorosync-common4            2.3.3-1                     amd64  Standards-based cluster framework, common library
ii  pacemaker                      1.1.11-1                    amd64  HA cluster resource manager
ii  pacemaker-cli-utils            1.1.11-1                    amd64  Command line interface utilities for Pacemaker

I will try to build the latest versions of Pacemaker/Corosync on Debian
Wheezy before reporting my issue via Bugzilla.
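
For anyone trying to reproduce this, here is a minimal sketch of how the
state quoted above (cs:StandAlone ro:Primary/Unknown) could be spotted in
DRBD status text. The function name is hypothetical, and the sample line is
reduced to the two fields quoted in this thread; it is not real output.

```shell
#!/bin/sh
# Hypothetical helper: reads DRBD status text on stdin (for example the
# contents of /proc/drbd) and reports whether any line shows the
# StandAlone connection state quoted in this thread.
is_standalone() {
    if grep -q 'cs:StandAlone'; then
        echo "standalone"
    else
        echo "not-standalone"
    fi
}

# Sample input, reduced to the fields quoted above (assumed, not real output).
printf ' 0: cs:StandAlone ro:Primary/Unknown\n' | is_standalone
```

On a node, the same function could be fed from the live status instead,
e.g. `is_standalone < /proc/drbd`.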

Thanks for your help.


-- 
Alexis de BRUYN



