[Pacemaker] Nodes not rejoining cluster

Florian Haas florian at hastexo.com
Fri Mar 30 12:33:45 EDT 2012


On Fri, Mar 30, 2012 at 6:09 PM, Gregg Stock <gregg at damagecontrolusa.com> wrote:
> That looks good. They were all the same and had the correct ip addresses.

So you've got both healthy rings, and all 5 nodes have 5 members in
the membership list?
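
A minimal sketch of that cross-check, assuming you have already collected
each node's member list by hand from the corosync tools. The function name
and the node names are hypothetical, and nothing here talks to the cluster:

```python
from typing import Dict, List, Set


def membership_problems(views: Dict[str, Set[str]]) -> List[str]:
    """Compare per-node membership views; an empty result means they agree.

    `views` maps each node name to the set of members that node reports.
    Assumption: every node that reported a view should appear in every
    other node's view.
    """
    expected = set(views)
    problems = []
    for node, members in sorted(views.items()):
        missing = expected - members
        extra = members - expected
        if missing:
            problems.append(f"{node}: missing {sorted(missing)}")
        if extra:
            problems.append(f"{node}: unexpected {sorted(extra)}")
    return problems


if __name__ == "__main__":
    # Hypothetical 5-node cluster in which node3 has lost sight of node5.
    names = {"node1", "node2", "node3", "node4", "node5"}
    views = {name: set(names) for name in names}
    views["node3"] = names - {"node5"}
    for line in membership_problems(views) or ["all nodes agree"]:
        print(line)
```

If every node reports the same five members, the list comes back empty and
the membership layer is unlikely to be the culprit.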

Then this would make it a Pacemaker problem. IIUC, the code that causes
Pacemaker to discard an update from a node that is "not in our
membership" has been removed in 1.1.7 [1], so an upgrade may not be a
bad idea. You'll probably have to wait a few more days until packages
become available, though.

Still, out of curiosity, and since you're saying this is a test
cluster: what happens if you shut down corosync and Pacemaker on *all*
the nodes, and bring it back up?
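
A rough sketch of the ordering for such a full restart, assuming Pacemaker
and corosync run as separate init services named "pacemaker" and "corosync"
(if Pacemaker is started by the corosync plugin on your setup, only the
corosync steps apply). The function name is hypothetical, and it only
prints a plan; it does not execute anything:

```python
from typing import List


def full_restart_plan(nodes: List[str]) -> List[str]:
    """List the steps of a full cluster stop/start, one phase at a time.

    Assumption: each phase is completed on *all* nodes before the next
    phase begins, so that no node is left running while others restart.
    """
    phases = [
        ("pacemaker", "stop"),
        ("corosync", "stop"),
        ("corosync", "start"),
        ("pacemaker", "start"),
    ]
    plan = []
    for service, action in phases:
        for node in nodes:
            plan.append(f"{node}: service {service} {action}")
    return plan


if __name__ == "__main__":
    for step in full_restart_plan(["node1", "node2", "node3", "node4", "node5"]):
        print(step)
```

The point of the ordering is that the whole cluster is down at once before
anything comes back, which is different from restarting nodes one by one.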

We've had a few people report these "not in our membership" issues on
the list before, and they seem to appear in a very sporadic and
transient fashion, so the root cause (which may well be totally
trivial) hasn't really been identified -- as far as I can tell, at
least. Hence my question of whether the issue persists after a full
cluster shutdown.

Florian

[1] https://github.com/ClusterLabs/pacemaker/commit/03f6105592281901cc10550b8ad19af4beb5f72f
-- note Andrew will rightfully flame me to a crisp if I've
misinterpreted that commit, so caveat lector. :)




