<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

OK, thanks for the note, Steven.  I've filed the bug; it is #525589.<br>

<br>

Steven Dake wrote:

<blockquote cite="mid:1253832474.28212.82.camel@localhost.localdomain"

 type="cite">

  <pre wrap="">Remi,

Likely a defect.  We will have to look into it.  Please file a bug as

per instructions on the corosync wiki at <a class="moz-txt-link-abbreviated" href="http://www.corosync.org">www.corosync.org</a>.

On Thu, 2009-09-24 at 16:47 -0600, Remi Broemeling wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">I've spent all day working on this; even going so far as to completely

build my own set of packages from the Debian-available ones (which

appear to be different than the Ubuntu-available ones).  It didn't

have any effect on the issue at all: the cluster still freaks out and

becomes a split-brain after a single SIGQUIT.

The Debian packages that also demonstrate this behavior were the

following versions:

    cluster-glue_1.0+hg20090915-1~bpo50+1_i386.deb

    corosync_1.0.0-5~bpo50+1_i386.deb

    libcorosync4_1.0.0-5~bpo50+1_i386.deb

    libopenais3_1.0.0-4~bpo50+1_i386.deb

    openais_1.0.0-4~bpo50+1_i386.deb

    pacemaker-openais_1.0.5+hg20090915-1~bpo50+1_i386.deb

These packages were re-built (under Ubuntu Hardy Heron LTS) from the

*.diff.gz, *.dsc, and *.orig.tar.gz files available at

<a class="moz-txt-link-freetext" href="http://people.debian.org/~madkiss/ha-corosync">http://people.debian.org/~madkiss/ha-corosync</a>, and as I said the

symptoms remain exactly the same, both under the configuration that I

list below and the sample configuration that came with these packages.

I also attempted the same with a single IP address resource associated

with the cluster, just to be sure it wasn't an edge case for a cluster

with no resources, but again that had no effect.

Basically I'm still exactly at the point that I was at yesterday

morning at about 0900.

Remi Broemeling wrote: 

    </pre>

    <blockquote type="cite">

      <pre wrap="">I posted this to the OpenAIS Mailing List

(<a class="moz-txt-link-abbreviated" href="mailto:openais@lists.linux-foundation.org">openais@lists.linux-foundation.org</a>) yesterday, but haven't received

a response and upon further reflection I think that maybe I chose

the wrong list to post it to.  That list seems to be far less about

user support and far more about developer communication.  Therefore

I'm retrying here, as the archives show it to be somewhat more

user-focused.

The problem is that I'm having an issue with corosync refusing to

shut down in response to a QUIT signal.  Given the below cluster

(output of crm_mon):

============

Last updated: Wed Sep 23 15:56:24 2009

Stack: openais

Current DC: boot1 - partition with quorum

Version: 1.0.5-3840e6b5a305ccb803d29b468556739e75532d56

2 Nodes configured, 2 expected votes

0 Resources configured.

============

Online: [ boot1 boot2 ]

If I go onto the host 'boot2', and issue the command "killall -QUIT

corosync", the anticipated result would be that boot2 would go

offline (out of the cluster), and all of the cluster processes

(corosync/stonithd/cib/lrmd/attrd/pengine/crmd) would shut down.

However, this is not occurring, and I don't really have any idea

why.  After logging into boot2, and issuing the command "killall

-QUIT corosync", the result is a split-brain: