<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

I've spent all day working on this, even going so far as to build my

own set of packages from the Debian sources (which appear to differ

from the Ubuntu ones).  It had no effect on the issue at all: the

cluster still freaks out and becomes a split-brain after a single

SIGQUIT.<br>

<br>

The Debian packages that also demonstrate this behavior are the

following versions:<br>

    cluster-glue_1.0+hg20090915-1~bpo50+1_i386.deb<br>

    corosync_1.0.0-5~bpo50+1_i386.deb<br>

    libcorosync4_1.0.0-5~bpo50+1_i386.deb<br>

    libopenais3_1.0.0-4~bpo50+1_i386.deb<br>

    openais_1.0.0-4~bpo50+1_i386.deb<br>

    pacemaker-openais_1.0.5+hg20090915-1~bpo50+1_i386.deb<br>

<br>

These packages were re-built (under Ubuntu Hardy Heron LTS) from the

*.diff.gz, *.dsc, and *.orig.tar.gz files available at <a

 href="http://people.debian.org/%7Emadkiss/ha-corosync"

 class="external free"

 title="http://people.debian.org/~madkiss/ha-corosync" rel="nofollow">http://people.debian.org/~madkiss/ha-corosync</a>,

and as I said the symptoms remain exactly the same, both under the

configuration that I list below and the sample configuration that came

with these packages.  I also tried the same with a single IP address

resource associated with the cluster, just to be sure it wasn't an

edge case for a cluster with no resources, but again that had no

effect.<br>

<br>

Basically, I'm still exactly where I was yesterday morning at about

0900.<br>

<br>

Remi Broemeling wrote:

<blockquote cite="mid:4ABB88B0.8010204@nexopia.com" type="cite">I

posted this to the OpenAIS Mailing List

(<a moz-do-not-send="true" class="moz-txt-link-abbreviated"

 href="mailto:openais@lists.linux-foundation.org">openais@lists.linux-foundation.org</a>)

yesterday, but haven't received a

response, and on further reflection I think I chose the wrong list to

post it to.  That list seems to be far less about user support and far

more about developer communication.  I am therefore retrying here, as

the archives show this list to be somewhat more user-focused.<br>

  <br>

The problem is that corosync refuses to shut down in response to a

QUIT signal.  Given the cluster below (output of crm_mon):<br>

  <tt><br>

============<br>

Last updated: Wed Sep 23 15:56:24 2009<br>

Stack: openais<br>

Current DC: boot1 - partition with quorum<br>

Version: 1.0.5-3840e6b5a305ccb803d29b468556739e75532d56<br>

2 Nodes configured, 2 expected votes<br>

0 Resources configured.<br>

============<br>

  <br>

Online: [ boot1 boot2 ]</tt><br>

  <br>

If I go onto the host 'boot2', and issue the command "killall -QUIT

corosync", the anticipated result would be that boot2 would go offline

(out of the cluster), and all of the cluster processes

(corosync/stonithd/cib/lrmd/attrd/pengine/crmd) would shut down. 

However, this is not occurring, and I don't really have any idea why. 

After logging into boot2, and issuing the command "killall -QUIT

corosync", the result is a split-brain:<br>

  <br>