<div dir="ltr"><div><div><div><div><div><div><div>Hi,<br><br></div>I know that running without a fencing configuration causes problems.<br></div>But is the current behaviour really caused by the lack of fencing?<br></div>The syslog isn't revealing much about it.<br></div>I would love to configure fencing, but right now I need some way to get past the current problem. If you say fencing is the only solution, then I might have to set it up remotely.<br><br></div>OS -> Ubuntu 12.04 (64-bit)<br></div>DRBD -> 8.3.11<br><br></div>Thanks for the quick reply.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Oct 28, 2014 at 11:19 AM, Digimer <span dir="ltr">&lt;<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 28/10/14 01:39 AM, kamal kishi wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi all,<br>
<br>
I'm facing a strange issue that I can't resolve: I'm not sure<br>
what is going wrong or where, and the logs are not telling me much.<br>
<br>
Issue -<br>
I have configured a 2-node cluster and have attached the configuration<br>
file (New CRM conf of BIC.txt).<br>
<br>
If Server2, which is primary, is shut down (forcefully, by turning off the<br>
switch), Server1 restarts within a few seconds and starts the resources.<br>
Even though Server1 restarts and starts the resources, the recovery takes<br>
too long to satisfy the clients, and I feel the current behaviour<br>
is wrong.<br>
<br>
I have attached the syslog to this mail (syslog).<br>
<br>
Please go through it and let me know how to resolve this, as<br>
the setup is at the client's site.<br>
<br>
--<br>
Regards,<br>
Kamal Kishore B V<br>
</blockquote>
<br></div></div>
You really need fencing, first and foremost. This will cause the survivor to put the lost node into a known state and then safely begin taking over lost services. Do your nodes have IPMI (or iRMC, iLO, DRAC, etc)? If so, setting up stonith is easy.<br>
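<br>
For illustration only, an IPMI stonith setup in the crm shell might look roughly like the sketch below. It assumes the 'external/ipmi' plugin from cluster-glue is installed; the node names, IP addresses and credentials are placeholders, not values from this thread:<br>
<pre>
# Hypothetical values throughout: replace node names, BMC addresses and credentials.
primitive fence_server1 stonith:external/ipmi \
    params hostname="server1" ipaddr="192.0.2.11" userid="admin" passwd="secret" interface="lan" \
    op monitor interval="60s"
primitive fence_server2 stonith:external/ipmi \
    params hostname="server2" ipaddr="192.0.2.12" userid="admin" passwd="secret" interface="lan" \
    op monitor interval="60s"
# Keep each fence device off the node it is meant to fence.
location l_fence_server1 fence_server1 -inf: server1
location l_fence_server2 fence_server2 -inf: server2
property stonith-enabled="true"
</pre>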
<br>
Once it is set up, configure DRBD to use the fence handler 'crm-fence-peer.sh' and change the fencing policy to 'resource-and-stonith'. Without this, you will get split-brains and fail-over will be unpredictable.<br>
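<br>
As a rough sketch of the DRBD side (assuming DRBD 8.3 syntax, a resource named 'r0', and the handler scripts installed under /usr/lib/drbd/; the resource name and paths are assumptions, so adjust them to your setup):<br>
<pre>
resource r0 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Places a constraint in pacemaker so the outdated peer cannot be promoted.
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes that constraint once the peer has resynced.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
</pre>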
<br>
Once stonith is configured and tested in pacemaker and you've hooked DRBD's fencing into pacemaker, see if your problem remains. If it does, on both nodes, run: 'tail -f -n 0 /var/log/messages', kill a node and wait for things to settle down. Share the log output here.<br>
<br>
Please also tell us your OS, pacemaker, drbd and corosync versions.<span class="HOEnZb"><font color="#888888"><br>
<br>
-- <br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca/w/" target="_blank">https://alteeve.ca/w/</a><br>
What if the cure for cancer is trapped in the mind of a person without access to education?<br>
<br>
_______________________________________________<br>
Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org" target="_blank">Pacemaker@oss.clusterlabs.org</a><br>
<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
<br>
Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>
</font></span></blockquote></div><br><br clear="all"><br>-- <br>Regards,<br>Kamal Kishore B V<br>
</div>