No subject


Fri Oct 28 02:41:40 EDT 2011


The motivation for wanting to use CMAN for this instead, is to ensure
all elements of the cluster stack are making decisions based on the
same membership and quorum data. [17]

[17] A failure to do this can lead to what is called internal
split-brain - a situation where different parts of the stack disagree
about whether some nodes are alive or dead - which quickly leads to
unnecessary down-time and/or data corruption.
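
To make the footnote concrete, here is a minimal sketch (not taken from Pacemaker, CMAN, or OCFS2) of what "different parts of the stack disagree" means. The layer names, node names, and the membership_disagreements helper are all hypothetical and chosen only for illustration; real stacks expose membership through their own tools.

```python
def membership_disagreements(views):
    """Return the nodes that the layers do not all agree are alive.

    `views` maps a (hypothetical) layer name to the set of nodes that
    layer currently believes are alive. The result maps each disputed
    node to the sorted list of layers that still consider it alive.
    """
    all_nodes = set().union(*views.values())
    disputed = {}
    for node in sorted(all_nodes):
        believers = sorted(
            layer for layer, alive in views.items() if node in alive
        )
        # A node is disputed when some layers see it and others do not.
        if len(believers) != len(views):
            disputed[node] = believers
    return disputed


# Hypothetical views: the cluster manager has declared node2 dead,
# but the filesystem's lock manager still counts it as a member.
views = {
    "cluster_manager": {"node1"},
    "lock_manager": {"node1", "node2"},
}
print(membership_disagreements(views))
```

When every layer takes its membership from one source, the views are identical by construction and a check like this has nothing to report.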


We were talking about GFS2 and Pacemaker, but the same applies to OCFS2.
If you're just using OCFS2, there is no need for CMAN.  But if you want
OCFS2 /and/ a cluster manager, you want them all using the same
membership and quorum data.

>
> By the way, will Pacemaker or Corosync log something to the syslog if it
> decides to fence a member?  Will it attempt to fence one that has flat
> disappeared, or only one that it has become unable to stop services on?
> I ask because I have a node that recently started spitting out
> "rcu_sched_state detected stall on cpu..." whenever I'm not around.  The
> surviving node recognizes that it has lost contact with this defunct
> node, but by that point the DLM and/or OCFS2 is totally hosed and the
> surviving node requires a hard-restart.  I guess my hope is that, were
> fencing actually working on my cluster, the fence would happen before
> the surviving node's DLM/OCFS2 drivers melted down (assuming the real
> issue at hand isn't wiping out DLM/OCFS everywhere before the bad-node
> is determined offline by the good-node).
>
>>> Are there any detriments to using
>>> cman?
>> You saw this one?
>>   http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_adding_cman_support.html
>>
>> There is also additional information in:
>>   http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/Cluster_Administration/index.html
> The first one, yes.  The second one, no, and thank you - I'll be
> perusing it for more knowledge.
>>
>>> If I want to add additional nodes to the cluster, will I need to
>>> bring down the whole cluster or restart services on existing nodes to
>>> apply the new cluster.conf?
>> I don't believe so.  I think you need the 'cman_tool join' command.
>
> Awesome - I was hoping it would be that easy!  Thanks for the help!!
>
> --
>
> Sincerely,
>  Matthew O'Connor
>
> -----------------------------------------------------------------
> Sr. Software Engineer
> PGP/GPG Key: 0x55F981C4
> Fingerprint: E5DC A0F8 5A40 E4DA 2CE6 B5A2 014C 2CBF 55F9 81C4
>
> Engineering and Computer Simulations, Inc.
> 11825 High Tech Ave Suite 250
> Orlando, FL 32817
>
> Tel:   407-823-9991 x315
> Fax:   407-823-8299
> Email: matt at ecsorl.com
> Web:   www.ecsorl.com
> -----------------------------------------------------------------


