<html><body><div style="color:#000; background-color:#fff; font-family:verdana, helvetica, sans-serif;font-size:12pt"><div><font size="2"><span>Hi,</span></font></div><div><font size="2"><br><span></span></font></div><div><font size="2"><span>I've tried moving the corosync startup from S20 to S98, but the issue is still there.</span></font></div><div><font size="2"><br><span></span></font></div><div><span><font size="2">Maybe I'll have to remove it from init and write an Upstart job for corosync.</font></span></div><div><br></div><div style="font-family: verdana,helvetica,sans-serif; font-size: 12pt;"><div style="font-family: times new roman,new york,times,serif; font-size: 12pt;"><font face="Arial" size="2"><hr size="1"><b><span style="font-weight: bold;">From:</span></b> Andreas Kurz &lt;andreas@hastexo.com&gt;<br><b><span style="font-weight: bold;">To:</span></b> pacemaker@oss.clusterlabs.org<br><b><span style="font-weight: bold;">Sent:</span></b> Tuesday,
25 October 2011 6:50 PM<br><b><span style="font-weight: bold;">Subject:</span></b> Re: [Pacemaker] Cluster goes to (unmanaged) Failed state when both nodes are rebooted together<br></font><br>hello,<br><br>On 10/25/2011 09:17 AM, ihjaz Mohamed wrote:<br>> If I start corosync together on both the servers, it comes up fine.<br>> So I am just wondering how this is different from corosync being started<br>> by the server during boot up.<br><br>maybe corosync is started too early on system boot, when network<br>connectivity is not fully established.<br><br>Regards,<br>Andreas<br><br>-- <br>Need help with Pacemaker?<br><a href="http://www.hastexo.com/now" target="_blank">http://www.hastexo.com/now</a><br><br>> <br>> <br>> ------------------------------------------------------------------------<br>> *From:* Andreas Kurz <<a ymailto="mailto:andreas@hastexo.com" href="mailto:andreas@hastexo.com">andreas@hastexo.com</a>><br>> *To:*
<a ymailto="mailto:pacemaker@oss.clusterlabs.org" href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a><br>> *Sent:* Monday, 24 October 2011 9:30 PM<br>> *Subject:* Re: [Pacemaker] Cluster goes to (unmanaged) Failed state when<br>> both nodes are rebooted together<br>> <br>> hello,<br>> <br>> On 10/24/2011 05:21 PM, ihjaz Mohamed wrote:<br>>> It's part of the requirement given to me to support this solution on<br>>> servers without stonith devices. So I cannot enable stonith.<br>> <br>> Too bad, then you have to live with some limitations of this setup. You<br>> could add some random wait to/before corosync start ... or simply: don't<br>> reboot them at the same time ;-)<br>> <br>> But it would also be interesting to know why FloatingIP_stop_0 returns an error<br>> on both nodes ... logs should tell you what happened.<br>> <br>> .... and remove nic="eth0:0", you must not
define any alias here but<br>> only the nic itself.<br>> <br>> Regards,<br>> Andreas<br>> <br>> -- <br>> Need help with Pacemaker?<br>> <a href="http://www.hastexo.com/now" target="_blank">http://www.hastexo.com/now</a><br>> <br>> <br>>><br>>> ------------------------------------------------------------------------<br>>> *From:* Alan Robertson &lt;<a ymailto="mailto:alanr@unix.sh" href="mailto:alanr@unix.sh">alanr@unix.sh</a>&gt;<br>>> *To:* ihjaz Mohamed &lt;<a ymailto="mailto:ihjazmohamed@yahoo.co.in" href="mailto:ihjazmohamed@yahoo.co.in">ihjazmohamed@yahoo.co.in</a>&gt;; The Pacemaker cluster<br>>> resource manager <<a
ymailto="mailto:pacemaker@oss.clusterlabs.org" href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a>&gt;<br>>> *Sent:* Monday, 24 October 2011 8:22 PM<br>>> *Subject:* Re: [Pacemaker] Cluster goes to (unmanaged) Failed state when<br>>> both nodes are rebooted together<br>>><br>>> Setting no-quorum-policy to ignore and disabling stonith is not a good<br>>> idea. You're sort of inviting the cluster to do screwed up things.<br>>><br>>><br>>> On 10/24/2011 08:23 AM, ihjaz Mohamed wrote:<br>>>> Hi All,<br>>>><br>>>> I have Pacemaker running with corosync. Following is my CRM configuration.<br>>>><br>>>> node soalaba56<br>>>> node soalaba63<br>>>> primitive FloatingIP
ocf:heartbeat:IPaddr2 \<br>>>> params ip="&lt;floating_ip&gt;" nic="eth0:0"<br>>>> primitive acestatus lsb:acestatus \<br>>>> primitive pingd ocf:pacemaker:ping \<br>>>> params host_list="&lt;gateway_ip&gt;" multiplier="100" \<br>>>> op monitor interval="15s" timeout="5s"<br>>>> group HAService FloatingIP acestatus \<br>>>> meta target-role="Started"<br>>>> clone pingdclone pingd \<br>>>> meta globally-unique="false"<br>>>> location ip1_location FloatingIP \<br>>>> rule $id="ip1_location-rule" pingd: defined pingd<br>>>> property $id="cib-bootstrap-options" \<br>>>> dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f"
\<br>>>> cluster-infrastructure="openais" \<br>>>> expected-quorum-votes="2" \<br>>>> stonith-enabled="false" \<br>>>> no-quorum-policy="ignore" \<br>>>> last-lrm-refresh="1305736421"<br>>>> ----------------------------------------------------------------------<br>>>><br>>>> When I reboot both the nodes together, cluster goes into an<br>>>> (unmanaged) Failed state as shown below.<br>>>><br>>>><br>>>> ============<br>>>> Last updated: Mon Oct 24 08:10:42 2011<br>>>> Stack: openais<br>>>> Current DC: soalaba63 - partition with quorum<br>>>> Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f<br>>>> 2 Nodes configured, 2 expected votes<br>>>> 2 Resources
configured.<br>>>> ============<br>>>><br>>>> Online: [ soalaba56 soalaba63 ]<br>>>><br>>>> Resource Group: HAService<br>>>> FloatingIP (ocf::heartbeat:IPaddr2) Started (unmanaged)<br>>>> FAILED[ soalaba63 soalaba56 ]<br>>>> acestatus (lsb:acestatus): Stopped<br>>>> Clone Set: pingdclone [pingd]<br>>>> Started: [ soalaba56 soalaba63 ]<br>>>><br>>>> Failed actions:<br>>>> FloatingIP_stop_0 (node=soalaba63, call=7, rc=1, status=complete):<br>>>> unknown error<br>>>> FloatingIP_stop_0 (node=soalaba56, call=7, rc=1, status=complete):<br>>>> unknown error<br>>>><br>>
------------------------------------------------------------------------------<br>>>><br>>>> This happens only when the reboot is done simultaneously on both the<br>>>> nodes. If reboot is done with some interval in between this is not<br>>>> seen. Looking into the logs I see that when the nodes come up<br>>>> resources are started on both the nodes and then it tries to stop the<br>>>> started resources and fails there.<br>>>><br>>>> I've attached the logs.<br>>>><br>>>><br>>>><br>>>> _______________________________________________<br>>>> Pacemaker mailing list: <a ymailto="mailto:Pacemaker@oss.clusterlabs.org" href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>> <mailto:<a ymailto="mailto:Pacemaker@oss.clusterlabs.org"
href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a>><br>> <mailto:<a ymailto="mailto:Pacemaker@oss.clusterlabs.org" href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>> <mailto:<a ymailto="mailto:Pacemaker@oss.clusterlabs.org" href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a>>><br>>>> <a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>>>><br>>>> Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>>>> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>>>> Bugs:<br>> <a href="http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker"
target="_blank">http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker</a><br>>><br>>><br>>> --<br>>> Alan Robertson &lt;<a ymailto="mailto:alanr@unix.sh" href="mailto:alanr@unix.sh">alanr@unix.sh</a>&gt;<br>>><br>>> "Openness is the foundation and preservative of friendship... Let me<br>>> claim from you at all times your undisguised opinions." - William<br>>> Wilberforce<br><br><br></div></div></div></body></html>