[Pacemaker] corosync vs. pacemaker 1.1

Andrew Beekhof andrew at beekhof.net
Tue Feb 28 04:22:36 EST 2012


On Tue, Feb 28, 2012 at 5:22 PM, Kiss Bence <bence at noc.elte.hu> wrote:
> Hi Andrew,
>
>  Did you have time to look at the bug report? Is there anything missing
> from it?

Just looking now. My queue gets pretty long sometimes.

> Another question, which might help me understand the problem better.
>
> If a node fails, what is Pacemaker's expected recovery behaviour with
> stonith enabled, and without it?

Without it, we blindly start the service on the remaining node and hope
the failure was real.
Otherwise it's running in two places and your data is toast.

With it, we shoot the node and start the resource on one of the
remaining nodes.
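
For reference, a minimal sketch of what enabling fencing can look like from
the crm shell. The fence agent and its parameters below are illustrative
assumptions (they depend entirely on your hardware), not something taken
from this thread:

    # enable fencing cluster-wide
    crm configure property stonith-enabled=true

    # illustrative fence device for one node; fence_ipmilan and its
    # parameters are assumptions -- use whatever matches your hardware
    crm configure primitive fence-node1 stonith:fence_ipmilan \
        params pcmk_host_list="node1" ipaddr="192.0.2.10" \
               login="admin" passwd="secret" \
        op monitor interval="60s"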

> What is expected from the
> sysadmin in this procedure?

Nothing.

> Bence
>
>
>
> On 02/13/2012 02:09 AM, Andrew Beekhof wrote:
>>
>>> On Sat, Feb 11, 2012 at 2:46 AM, Kiss Bence <bence at noc.elte.hu> wrote:
>>>
>>> Hi,
>>>
>>>
>>> On 01/30/2012 04:00 AM, Andrew Beekhof wrote:
>>>>
>>>>
>>>>> On Thu, Jan 26, 2012 at 2:08 AM, Kiss Bence <bence at noc.elte.hu> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am newbie to the clustering and I am trying to build a two node
>>>>> active/passive cluster based upon the documentation:
>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>
>>>>> My systems are Fedora 14, up to date. After forming the cluster as
>>>>> described, I started to test it. (resources: drbd -> lvm -> fs -> group
>>>>> of services)
>>>>> Resources were moved around, nodes were rebooted and killed (first I
>>>>> tried it in a virtual environment, then also on real machines).
>>>>>
>>>>> After some events the two nodes ended up in a kind of split-brain
>>>>> state. crm_mon on each node showed the other node as offline, although
>>>>> the drbd subsystem showed everything in sync and working. The network
>>>>> was not the issue (ping, tcp and udp communications were fine), and
>>>>> nothing had changed from the network's point of view.
>>>>>
>>>>> At first rejoining worked quite well, but after a few more events it
>>>>> took longer, and after still more events it stopped happening at all.
>>>>> A network dump showed the multicast packets still coming and going. In
>>>>> corosync (crm_node -l) neither node saw the other. After trying to
>>>>> configure the cib, the logs were full of messages like "<the other
>>>>> node>: not in our membership".
>>>>
>>>>
>>>>
>>>> That looks like a pacemaker bug.
>>>> Can you use crm_report to grab logs from about 30 minutes before the
>>>> first time you see this message until an hour after, please?
>>>>
>>>> Attach that to a bug at bugs.clusterlabs.org and I'll take a look.
>>>
>>>
>>>
>>> I have created a bug report: id 5031.
>>
>>
>> Perfect, I'll look there.
>>
>>>
>>> The "split-brain" is lasting every time about 5 minutes. Meanwhile the
>>> two
>>> nodes think that the other node is dead. However the drbd is working
>>> fine,
>>> and properly disallowing the second rebooted node to go Primary. The
>>> crm_node -l shows only the local node.
>>>
>>> Meanwhile one of my question is answered. The multicast issue was a local
>>> network issue. The local netadmin fixed it. Now it works.
>>>
>>> This issue seems to me similar to what James Flatten had reported at 8-th
>>> Feb. ([Pacemaker] Question about cluster start-up in a 2 node cluster
>>> with a
>>> node offline.)
>>>
>>> I have stonith-enabled="false" and no-quorum-policy="ignore" set.
>>>
>>> Thanks in advance,
>>> Bence
>>>
>>>
>>>
>>>>
>>>>>
>>>>> I tried to erase the config (crm configure erase, cibadmin -E -f) but
>>>>> it worked only locally. I noticed that the pacemaker process didn't
>>>>> start up normally on the node that booted after the other. I also tried
>>>>> to remove files from /var/lib/pengine/ and /var/lib/heartbeat/crm/ but
>>>>> that only removed the resources; it didn't help in forming a cluster
>>>>> without resources. The pacemaker process exited some 20 minutes after
>>>>> it started. Starting it manually gave the same result.
>>>>>
>>>>> After digging through Google for answers I found nothing helpful.
>>>>> Following various tips, I changed the version in the
>>>>> /etc/corosync/service.d/pcmk file to 1.1 (the version of pacemaker in
>>>>> this distro). I realized that the cluster processes had been started by
>>>>> corosync itself, not by pacemaker, which could now be omitted. Cluster
>>>>> formation has been stable since this change, even after many, many
>>>>> events.
>>>>>
>>>>> Now I have reread the document mentioned above, and I wonder why it
>>>>> includes the "Important notice" on page 37. What is theoretically wrong
>>>>> with my scenario?
>>>>
>>>>
>>>>
>>>> Having corosync start the daemons worked well for some but not others,
>>>> thus it was unreliable.
>>>> The notice points out a major difference between the two operating
>>>> modes so that people will not be caught by surprise when pacemaker
>>>> does not start.
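
For reference, a sketch of the setting involved in the
/etc/corosync/service.d/pcmk file (assuming the stock pacemaker 1.1 plugin
packaging; the comments describe the usual meaning of the ver values):

    service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        # ver: 0 - corosync spawns the pacemaker daemons itself
        # ver: 1 - pacemaker is started separately, by its own init script
        ver: 1
    }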
>>>>
>>>>> Why does it work? Why didn't the config suggested by the document work?
>>>>>
>>>>> Tests were done first on Fedora 14 virtual machines (per node: 1 CPU
>>>>> core, 512MB RAM, 10G disk, 1G drbd on a logical volume, and a physical
>>>>> volume on drbd forming a volume group named "cluster").
>>>>>
>>>>> Then on real machines: more CPU cores (4), more RAM (4G), more disk
>>>>> (mirrored 750G), a 180G drbd, and a guaranteed 100M routed link between
>>>>> the nodes, 5 hops away.
>>>>>
>>>>> By the way, how should one configure corosync to work over a
>>>>> multicast-routed network? I had to create an openvpn tap link between
>>>>> the real nodes to get it working; the original config with public IPs
>>>>> didn't work. Is corosync equipped to cope with multicast PIM messages,
>>>>> or was it a firewall issue?
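
One way to sidestep multicast routing entirely is corosync's unicast (udpu)
transport, available from corosync 1.3.0 onward. A sketch only, with
placeholder addresses that would need replacing with your own:

    totem {
        version: 2
        # udpu sends unicast UDP to the listed members, so no multicast
        # routing (PIM) is needed on the path between the nodes
        transport: udpu
        interface {
            ringnumber: 0
            bindnetaddr: 192.0.2.0      # placeholder network
            mcastport: 5405
            member {
                memberaddr: 192.0.2.11  # placeholder node addresses
            }
            member {
                memberaddr: 192.0.2.12
            }
        }
    }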
>>>>>
>>>>> Thanks in advance,
>>>>> Bence
>>>>>
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org



