[ClusterLabs] Antw: [EXT] no-quorum-policy=stop never executed, pacemaker stuck in election/integration, corosync running in "new membership" cycles with itself

Wed Jun 2 04:47:01 EDT 2021

lge> > I would have expected corosync to come back with a "stable
lge> > non-quorate membership" of just itself within a very short
lge> > period of time, and pacemaker winning the
lge> > "election"/"integration" with just itself, and then trying
lge> > to call "stop" on everything it knows about.
ken> 
ken> That's what I'd expect, too. I'm guessing the corosync cycling is
ken> what's causing the pacemaker cycling, so I'd focus on corosync first.

Any Corosync folks around with some input?
What may cause corosync on an isolated (with iptables DROP rules)
node to keep creating "new membership" with only itself?

Is it a problem with the test setup maybe?
Does an isolated corosync node need to be able
to send the token to itself?
Do the "iptables DROP" rules on the outgoing interfaces prevent that?

On Tue, Jun 01, 2021 at 10:31:21AM -0500, kgaillot at redhat.com wrote:
> On Tue, 2021-06-01 at 13:18 +0200, Ulrich Windl wrote:
> > Hi!
> > 
> > I can't answer, but I doubt the usefulness of
> > "no-quorum-policy=stop": If nodes lose quorum, they try to
> > stop all resources, but "remain" in the cluster (will respond
> > to network queries, if any arrive).  If one of those "stop"s
> > fails, the other part of the cluster never knows.  So what can
> > be done? Should the "other(left)" part of the cluster start
> > resources, assuming the "other(right)" part of the cluster had
> > stopped resources successfully?
> 
> no-quorum-policy only affects what the non-quorate partition will do.
> The quorate partition will still fence the non-quorate part if it is
> able, regardless of no-quorum-policy, and won't recover resources until
> fencing succeeds.

The context in this case is: "fencing by storage".
DRBD 9 has a "drbd quorum" feature, where you can ask it
to throw IO errors (or freeze) if DRBD quorum is lost,
so data integrity on network partition is protected,
even without fencing on the pacemaker level.

It is rather a "convenience" that the non-quorate
pacemaker on the isolated node should stop everything
that still "survived". In particular, the umount is necessary
for DRBD on that node to become secondary again,
which in turn is necessary to be able to re-integrate later
when connectivity is restored.

Yes, fencing on the node level is still necessary for other
scenarios.  But with certain scenarios, avoiding a node level
fence while still being able to also avoid "trouble" once
connectivity is restored would be nice.

And it would work nicely here, if the corosync membership
of the isolated node were stable enough for pacemaker
to finalize "integration" with itself and then (try to) stop
everything, so we have a truly "idle" node when connectivity is
restored.

"trouble":
spurious restart of services ("resource too active ..."),
problems with re-connecting DRBD ("two primaries not allowed")

> > > pcmk 2.0.5, corosync 3.1.0, knet, rhel8
> > > I know fencing "solves" this just fine.
> > > 
> > > what I'd like to understand though is: what exactly is
> > > corosync or pacemaker waiting for here, why does it not
> > > manage to get to the stage where it would even attempt to
> > > "stop" stuff?
> > > 
> > > two "rings" aka knet interfaces.
> > > node isolation test with iptables,
> > > INPUT/OUTPUT -j DROP on one interface,
> > > shortly after on the second as well.
> > > node loses quorum (obviously).
> > > 
> > > pacemaker is expected to no-quorum-policy=stop,
> > > but is "stuck" in Election -> Integration,
> > > while corosync "cycles" between "new membership" (with only
> > > itself, obviously) and "token has not been received in ...",
> > > "sync members ...", "new membership has formed ..."
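
To put numbers on the "cycling" described above, a small helper like
the following could count how often each of the quoted message types
shows up in the corosync log. This is only a sketch: the substrings are
taken from the paraphrased messages in this thread, the exact wording in
a given corosync version may differ (assumption), and the function and
pattern names are made up for illustration.

```python
import re
from collections import Counter

# Substrings paraphrased from the messages quoted in this thread.
# Assumption: the real log wording may differ, adjust the patterns as needed.
PATTERNS = {
    "token_timeout": re.compile(r"token has not been received", re.I),
    "sync_members": re.compile(r"sync members", re.I),
    "new_membership": re.compile(r"new membership", re.I),
}


def count_membership_events(lines):
    """Count how many log lines match each of the patterns above."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feeding it the log lines from the isolation window (for example, the
lines of a saved log file opened in Python) gives one counter per
message type. A "new_membership" count that keeps growing while the
node is isolated would match the cycling described above; a count that
stays at one would point to a stable single-node membership.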