<div dir="ltr"><div><div><div><div><div><div>Hi, Digimer:<br></div>Below is the output of drbdadm dump:<br></div># /etc/drbd.conf<br>common {<br> protocol C;<br> net {<br> after-sb-0pri discard-zero-changes;<br>
after-sb-1pri consensus;<br> after-sb-2pri disconnect;<br> cram-hmac-alg sha512;<br> shared-secret acde;<br> }<br> disk {<br> on-io-error detach;<br> fencing resource-and-stonith;<br>
}<br> syncer {<br> rate 33M;<br> }<br> startup {<br> wfc-timeout 120;<br> }<br> handlers {<br> fence-peer /usr/lib/drbd/crm-fence-peer.sh;<br> after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;<br>
}<br>}<br><br># resource r0 on suse4: not ignored, not stacked<br>resource r0 {<br> on suse2 {<br> device /dev/drbd0 minor 0;<br> disk /dev/sdc1;<br> address ipv4 XXX:7789;<br>
meta-disk internal;<br> }<br> on suse4 {<br> device /dev/drbd0 minor 0;<br> disk /dev/sdc1;<br> address ipv4 YYY:7789;<br> meta-disk internal;<br>
}<br>}<br></div>And here is the output of crm configure show:<br>primitive drbd1 ocf:linbit:drbd \<br> params drbd_resource="r0" \<br> op monitor interval="15s"<br>primitive fs1 ocf:heartbeat:Filesystem \<br>
op monitor interval="15s" \<br> params device="/dev/drbd0" directory="/opt/drbd" fstype="ext3" \<br> meta target-role="Started"<br>primitive suse2-stonith stonith:external/ipmi \<br>
params hostname="suse2" ipaddr="XXX" userid="admin" passwd="xxx" interface="lan"<br>primitive suse4-stonith stonith:external/ipmi \<br> params hostname="suse4" ipaddr="YYY" userid="admin" passwd="yyy" interface="lan"<br>
ms ms_drbd1 drbd1 \<br> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"<br>location drbd-fence-by-handler-ms_drbd1 ms_drbd1 \<br>
rule $id="drbd-fence-by-handler-rule-ms_drbd1" $role="Master" -inf: #uname ne suse4<br>location st-suse2 suse2-stonith -inf: suse2<br>location st-suse4 suse4-stonith -inf: suse4<br>colocation fs_on_drbd inf: fs1 ms_drbd1:Master<br>
dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \<br> cluster-infrastructure="openais" \<br> expected-quorum-votes="3" \<br> stonith-enabled="true" \<br>
last-lrm-refresh="1378051434"<br>rsc_defaults $id="rsc-options" \<br> resource-stickiness="100"<br></div>I think the drbd-fence-by-handler-rule-ms_drbd1 rule was generated by crm-fence-peer.sh, and it still exists because crm-unfence-peer.sh has not been called since the last failover.<br>
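A minimal sketch of how such a leftover constraint could be spotted, assuming the text of 'crm configure show' has been saved to a string. The function name find_fence_constraints is hypothetical, and the "drbd-fence-by-handler-" prefix is simply the one that appears in my configuration above:<br>

```python
import re

# Matches the first line of a location constraint whose id starts with the
# prefix seen in the configuration above ("drbd-fence-by-handler-").
# The prefix is taken from this thread, not from any official specification.
FENCE_CONSTRAINT = re.compile(
    r"^location\s+(drbd-fence-by-handler-\S+)\s+(\S+)", re.MULTILINE
)


def find_fence_constraints(crm_config_text):
    """Return (constraint_id, resource) pairs for fence constraints found in the text."""
    return FENCE_CONSTRAINT.findall(crm_config_text)


if __name__ == "__main__":
    # Excerpt shaped like the 'crm configure show' output in this message.
    sample = '''\
ms ms_drbd1 drbd1 \\
    meta master-max="1" notify="true"
location drbd-fence-by-handler-ms_drbd1 ms_drbd1 \\
    rule $id="drbd-fence-by-handler-rule-ms_drbd1" $role="Master" -inf: #uname ne suse4
location st-suse2 suse2-stonith -inf: suse2
'''
    for constraint_id, resource in find_fence_constraints(sample):
        print(constraint_id, "on", resource)
```

Run against the excerpt, this prints "drbd-fence-by-handler-ms_drbd1 on ms_drbd1". The STONITH location constraints are not reported because their ids do not carry the prefix.<br>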
</div>What's wrong with my configuration?<br></div>Thanks.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Sep 2, 2013 at 9:42 PM, Digimer <span dir="ltr"><<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 02/09/13 08:55, Xiaomin Zhang wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi, guys:<br>
I followed the standard way to enable IPMI-based STONITH for a<br>
service which relies on DRBD primary-secondary replication.<br>
Besides the Pacemaker configuration below (of course, STONITH is enabled in<br>
Pacemaker):<br>
<br>
primitive suse2-stonith stonith:external/ipmi \<br>
params hostname="suse2" ipaddr="XXX" userid="admin" passwd="xxx" interface="lan"<br>
primitive suse4-stonith stonith:external/ipmi \<br>
params hostname="suse4" ipaddr="YYY" userid="admin" passwd="yyy" interface="lan"<br>
location st-suse2 suse2-stonith -inf: suse2<br>
location st-suse4 suse4-stonith -inf: suse4<br>
<br>
I also set fencing to 'resource-and-stonith' in the DRBD common configuration.<br>
This configuration has worked many times under the failure tests below:<br>
1. iptables -A INPUT -j DROP<br>
2. echo c > /proc/sysrq-trigger<br>
3. /etc/init.d/network stop<br>
4. reboot<br>
The failed node is power cycled by its counterpart via an IPMI command.<br>
However, I still get a DRBD split-brain from time to time. Does that mean<br>
IPMI is still not reliable enough for data integrity?<br>
<br>
I am also confused that, many times, crm-unfence-peer.sh is<br>
not called after crm-fence-peer.sh. Does this imply that I have<br>
something misconfigured?<br>
Your advice is really appreciated.<br>
Thanks in advance.<br>
</blockquote>
<br></div></div>
I don't think that using the firewall to block traffic is a good way to test. That said, if the failure triggers a reboot, then it's working.<br>
<br>
Did you set up the fence-handler in DRBD to use 'crm-fence-peer.sh'?<br>
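<br>
As a rough sketch of what to look for, the snippet below pulls the fencing policy and the two handler entries out of 'drbdadm dump' text. The function names fencing_settings and _setting are hypothetical, and the parsing assumes simple one-line "keyword value;" entries, so treat it as an illustration rather than a real DRBD config parser:<br>

```python
import re


def _setting(text, keyword):
    """Return the value of the first 'keyword value;' entry, or None if absent."""
    # Optional double quotes around the value are tolerated and stripped.
    pattern = r'^\s*' + re.escape(keyword) + r'\s+"?([^";]+?)"?\s*;'
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1) if match else None


def fencing_settings(drbd_dump_text):
    """Collect the fencing policy and the fence/unfence handlers from dump text."""
    return {
        "fencing": _setting(drbd_dump_text, "fencing"),
        "fence-peer": _setting(drbd_dump_text, "fence-peer"),
        "after-resync-target": _setting(drbd_dump_text, "after-resync-target"),
    }


if __name__ == "__main__":
    # Hypothetical excerpt of 'drbdadm dump' output, used only for illustration.
    sample = """
common {
    disk {
        on-io-error detach;
        fencing resource-and-stonith;
    }
    handlers {
        fence-peer /usr/lib/drbd/crm-fence-peer.sh;
        after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;
    }
}
"""
    for name, value in fencing_settings(sample).items():
        print(name, "=", value)
```

A value of None for any of the three entries would mean the corresponding line was not found in the dump text.<br>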
<br>
Please share your 'crm configure show' and 'drbdadm dump'.<span class="HOEnZb"><font color="#888888"><br>
<br>
-- <br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca/w/" target="_blank">https://alteeve.ca/w/</a><br>
What if the cure for cancer is trapped in the mind of a person without access to education?<br>
</font></span></blockquote></div><br></div>