[Pacemaker] How to avoid or automatically resolve Split-Brain issue of DRBD

Xiaomin Zhang zhangxiaomin at gmail.com
Sun Sep 1 09:03:43 EDT 2013


Hi, Digimer:
Thanks for the detailed explanation.
I followed the guide from the Clusterlabs docs and configured the following
IPMI-based stonith resources for my DRBD-related service:
primitive suse2-stonith stonith:external/ipmi \
        params hostname="suse2" ipaddr="XXX" userid="admin" passwd="xxx" interface="lan"
primitive suse4-stonith stonith:external/ipmi \
        params hostname="suse4" ipaddr="YYY" userid="admin" passwd="yyy" interface="lan"
location st-suse2 suse2-stonith -inf: suse2
location st-suse4 suse4-stonith -inf: suse4

After enabling the IPMI device and channel authentication, I used the
following command to cut the network link on the DRBD primary machine:
iptables -A INPUT -j DROP
After about one second, I could see that Pacemaker had the secondary
DRBD machine send an IPMI reset command to power-cycle the primary machine.
However, the bad part is that the resource stays "Stopped" and never
fails over to the secondary machine.
"crm status" shows that the primary machine is "OFFLINE", and none of the
resources are started on the secondary machine, where they are supposed
to run.
Is this because the failed primary node was fenced, and that blocked
Pacemaker from scheduling the resources on the secondary machine?
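For reference, here are the cluster properties I am looking at (using the
crm shell; no-quorum-policy is just my guess at what could keep resources
stopped once a node goes OFFLINE):

        crm configure property stonith-enabled=true
        # with only two active nodes there is no quorum once one is fenced,
        # so resources will not start unless this is relaxed:
        crm configure property no-quorum-policy=ignore
        crm_mon -1   # re-check resource state afterwards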
Your hints are really appreciated.
Thanks.


On Thu, Aug 29, 2013 at 1:55 AM, Digimer <lists at alteeve.ca> wrote:

> On 28/08/13 13:13, Xiaomin Zhang wrote:
>
>> Hi, Gurus:
>> I have a simple master-slave setup for mirrored DRBD storage. The
>> storage is written to by a Java daemon application server that produces
>> transaction data.
>> node Lhs072gkz \
>>          attributes standby="on"
>> node Lpplj9jb4
>> node Lvoim0kaw
>> primitive drbd1 ocf:linbit:drbd \
>>          params drbd_resource="r0" \
>>          op monitor interval="15s"
>> ms ms_drbd1 drbd1 \
>>          meta master-max="1" master-node-max="1" clone-max="2"
>> clone-node-max="1" notify="true" target-role="Started"
>> location drbd-fence-by-handler-ms_drbd1 ms_drbd1 \
>>          rule $id="drbd-fence-by-handler-rule-ms_drbd1" $role="Master" -inf: #uname ne Lpplj9jb4
>>
>> It seems a split-brain is very likely to happen when I reboot the slave
>> machine, even though the Java application is writing nothing to the DRBD
>> storage.
>> Is this expected behavior?
>>
>> I also found some topics about automatically recovering from split-brain
>> in DRBD (). They just say that with some configuration in DRBD,
>> everything should work. Is this a good practice?
>> Thanks.
>>
>
> No, split-brains are not at all expected behaviour, but they happen when
> things are not set up properly.
>
> The best thing to do is to avoid a split-brain in the first place, which
> is easy to do if you set up (working) stonith/fencing.
>
> If you configure stonith in pacemaker using IPMI (the most common method)
> and test it to make sure nodes reboot on failure, you can then "hook" drbd
> into pacemaker's fencing. You do this by setting the fence policy to
> "resource-and-stonith" and then tell DRBD to use the "crm-fence-peer.sh"
> fence handler.
>
> This tells DRBD that, if the peer fails (or vanishes), to block IO and
> call a fence. The fence handler is then invoked which calls pacemaker and
> says "please fence node X". When pacemaker succeeds, it will tell the
> handler which in turn tells DRBD that it's now safe to resume IO. One of
> the nodes will be dead so you will avoid the split-brain in the first place.
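>
> In the DRBD configuration that would look something like this (resource
> name r0 taken from your config; the handler paths are the usual install
> locations, please verify them on your distro):
>
>         resource r0 {
>           disk {
>             fencing resource-and-stonith;
>           }
>           handlers {
>             fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>             after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>           }
>         }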
>
> If your servers have IPMI, iLO, iDRAC, RSA, etc, you can use the
> 'fence_ipmilan' fence agent in your pacemaker configuration. If you need
> help with this, just say.
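>
> A minimal sketch with the crm shell, using placeholder addresses and
> credentials you would replace with your own:
>
>         primitive fence-node1 stonith:fence_ipmilan \
>                 params pcmk_host_list="node1" ipaddr="192.0.2.1" \
>                 login="admin" passwd="secret" lanplus="true" \
>                 op monitor interval="60s"
>         location loc-fence-node1 fence-node1 -inf: node1
>
> The location constraint keeps a node from running its own fence device,
> as in your existing st-suse2/st-suse4 constraints.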
>
> Cheers
>
> digimer
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>

