[Pacemaker] Is IPMI reliable to avoid DRBD SplitBrain?

Digimer lists at alteeve.ca
Mon Sep 2 09:42:12 EDT 2013


On 02/09/13 08:55, Xiaomin Zhang wrote:
> Hi, guys:
> I followed the standard way to enable IPMI-based STONITH for a
> service that relies on DRBD primary/secondary replication.
> Besides the Pacemaker configuration below (of course, STONITH is
> enabled in Pacemaker):
>
> primitive suse2-stonith stonith:external/ipmi \
>          params hostname="suse2" ipaddr="XXX" userid="admin" passwd="xxx" interface="lan"
> primitive suse4-stonith stonith:external/ipmi \
>          params hostname="suse4" ipaddr="YYY" userid="admin" passwd="yyy" interface="lan"
> location st-suse2 suse2-stonith -inf: suse2
> location st-suse4 suse4-stonith -inf: suse4
>
> I also set 'resource-and-stonith' in the DRBD global configuration.
> This configuration has worked many times with the failure tests below:
> 1.  iptables -A INPUT -j DROP
> 2.  echo c > /proc/sysrq-trigger
> 3.  /etc/init.d/network stop
> 4.  reboot
> The failed node is power cycled by its counterpart via an IPMI command.
> However, I still get a DRBD split-brain from time to time. Does that
> mean IPMI is still not reliable enough for data integrity?
>
> I am also confused that, many times, crm-unfence-peer.sh is not
> called after crm-fence-peer.sh. Does this imply that I have
> something misconfigured?
> Your advice is really appreciated.
> Thanks in advance.

I don't think that using the firewall to block traffic is a good way to 
test. That said, if the failure triggers a reboot, then it's working.

Did you set up the fence-peer handler in DRBD to use 'crm-fence-peer.sh'?
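
For reference, the handler setup I am asking about usually looks
something like the fragment below. This is only a sketch, assuming DRBD
8.4 syntax and the default script location under /usr/lib/drbd/. The
resource name 'r0' is a placeholder, and on other DRBD versions the
'fencing' option may belong in a different section, so check it against
your own 'drbdadm dump'.

```
# Sketch only: 'r0' is a hypothetical resource name.
resource r0 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Adds a location constraint in Pacemaker that keeps the outdated
    # peer from being promoted.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes that constraint again once the peer has finished resyncing.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

Note that crm-unfence-peer.sh is tied to 'after-resync-target', so it
only runs on the node that was the resync target, after the resync
completes. If no resync happens, it would not be expected to run.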

Please share your 'crm configure show' and 'drbdadm dump'.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



