[Pacemaker] Getting split brain after every reboot of a cluster node

Anne Nicolas ennael1 at gmail.com
Wed Mar 5 03:28:47 EST 2014


I'm having trouble setting up a very simple two-node cluster. After every
reboot of a node I get a split brain, which I then have to resolve by hand.
I'm looking for a solution to this.

Both nodes have 4 network interfaces. We use 3 of them: one for the
cluster IP, one for a bridge for a VM, and the last one for the cluster's
private network.

I'm using
drbd : 8.3.9
drbd-utils: 8.3.9

DRBD configuration:
$ cat global_common.conf
global {
        usage-count no;
}
common { syncer { rate 500M; } }

$ cat server.res
resource server {
        protocol C;
        net {
                cram-hmac-alg sha1;
                shared-secret "eafcupps";
        }
        on dzacupsvr {
                device     /dev/drbd0;
                disk       /dev/vg0/server;
                flexible-meta-disk  internal;
        }
        on dzacupsvr2 {
                device     /dev/drbd0;
                disk       /dev/vg0/server;
                flexible-meta-disk  internal;
        }
}

Pacemaker configuration
node $id="16847020" dzacupsvr
node $id="33624236" dzacupsvr2
primitive apache ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op start interval="0" timeout="40s" \
        op stop interval="0" timeout="60s"
primitive clusterip ocf:heartbeat:IPaddr2 \
        params ip="" cidr_netmask="24" nic="eth0"
primitive drbdserv ocf:linbit:drbd \
        params drbd_resource="server" \
        op monitor interval="60s"
primitive fsserv ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/server" directory="/EdgeServer"
primitive libvirt-guests lsb:libvirt-guests
primitive libvirtd lsb:libvirtd
primitive mysql ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" datadir="/EdgeServer/mysql" \
        op start interval="0" timeout="40s" \
        op stop interval="0" timeout="60s" \
        meta target-role="Started"
primitive named lsb:named
primitive samba lsb:smb
group services fsserv clusterip libvirtd samba apache mysql
ms drbdservClone drbdserv \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation fs_on_drbd inf: fsserv drbdservClone:Master
order fsserv-after-drbdserv inf: drbdservClone:promote fsserv:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-2.mga1-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="corosync" \
        stonith-enabled="false" \

and here are the logs

After looking for more information, I added fencing handlers to the DRBD
configuration:

handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}

but still without any success.
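
For context: in DRBD 8.3 the fence-peer handler is only called when a
fencing policy other than the default (dont-care) is set in the disk
section of the resource. A minimal sketch of how the two pieces fit
together, assuming the resource-only policy (the disk section below is an
assumption and does not appear in the configuration quoted above):

```
resource server {
        disk {
                # Assumption: resource-level fencing policy. With the
                # default (dont-care) the fence-peer handler is never run.
                fencing resource-only;
        }
        handlers {
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        # net and "on" sections as in server.res above
}
```

With this in place, crm-fence-peer.sh adds a location constraint to the CIB
that keeps the Master role off the outdated peer, and crm-unfence-peer.sh
removes that constraint once the resync has finished.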

Any help appreciated



