[ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote

clusterlabs at t.poki.me
Thu Aug 13 11:34:30 EDT 2020

Sorry, I drifted too far into intermixing two rather unrelated
things: what happens when a failure occurs and what happens when
quorum is lost.  I meant to dedicate the comment solely to the
latter, but managed to cross that line.  Corrections below:

On 8/13/20 12:32 PM, clusterlabs at t.poki.me wrote:
> wanted to point out one thing that occurred to me when thinking about
> the paragraph below.
> On 8/12/20 8:57 PM, Ken Gaillot wrote:
>> Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option
>> to specify what happens to resources when quorum is lost (the default
>> being to stop them). With 2.0.5, "demote" will be a possible value here
>> as well, and will mean "demote all promotable resources and stop all
>> other resources".
>> The intended use case is an application that cannot cause any harm
>> after being demoted, and may be useful in a demoted role even if there
>> is no quorum. A database that operates read-only when demoted and
>> doesn't depend on any non-promotable resources might be an example.
> A perhaps unexpected corollary of this cluster-wide setting
> (only honoured once the cluster is upgraded in its entirety?), if
> I understand it correctly, is that a previously promoted resource
> will get stopped anyway once it depends on a simple resource that
> doesn't specify "on-fail" on its own (putting global/resource
> defaults aside).

Scratch the trailing "that ..." part; there is no individual
per-resource (nor per-resource-operation) override.
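For the record, the two knobs as announced would be set along
these lines; the resource name is made up and the exact pcs
syntax may differ between versions:

```shell
# Cluster-wide: demote promotable resources (and stop all others)
# when quorum is lost -- no per-resource override exists for this.
pcs property set no-quorum-policy=demote

# Per-operation: demote (rather than recover or stop) the promoted
# instance when its monitor fails; "my-db" is a hypothetical
# promotable resource.
pcs resource update my-db op monitor interval=15s on-fail=demote
```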

It would, though, be interesting to consider allowing an escape
hatch like "no-quorum-policy=as-active-state-failure" that would
turn loss of quorum into a started/promoted/demoted state
maintenance failure, which could then trigger the individual
on-fail response.  Who knows.

> It is this implicit resource "composability" (a.k.a. resource
> trees, with some tweaks applicable right at the "composed
> service" level), an idea from the long-declined RGManager (RIP),
> that remains quite an appealing and natural way (despite having
> less expressive power in general) to think of composed services
> (i.e. the behaviour of a final brings-me-value unit rather than
> the sum of the moving parts behind it).
> Through the prism of this tree model, if it could be proved that
> no other simple resources share a dependency on any of the
> prerequisites of this promoted resource to be demoted because of
> an "on fail" event,

s/on fail/quorum lost/

> it would be intuitive to expect that they will be kept running
> and hence will prevent this promoted resource from consequently
> being stopped, since the mere weaker form, its demotion, is the
> first and foremost choice requested by the user.  Similarly with
> clones and other promotable prerequisites, except it might be
> wise to demote them as well if that would not conflict with the
> demotion of the dependent promoted resource that just suffered
> the failure.
> The rationale for this is that a prerequisite resource _solely_
> consumed by resources that don't need quorum hardly needs quorum
> on its own; otherwise the configuration is conflicting/fishy in
> some way.
> (It would likewise be interesting, and
> configuration-proofing-friendly, to investigate such "composing
> rules of soundness".)
> I see there are some practical limits to these semi-recursive
> and potentially explosive graph problems, but there's also the
> question of what the intuitively expected behaviour is, and any
> disconnect could at least be addressed in the documentation.
> [Alternatively, some kind of "back-propagation" flag could be
> devised (intuitively: "this terminal resource has the power to
> steer the behaviour of its prerequisites, since the
> configuration of this terminal resource is what's wanted by the
> outer surroundings of this box, after all") such that it would
> override any behaviour of the prerequisite resources on a subset
> of conflicting options, like on-fail, given that there is no
> conflict with any other resource

Again, scratch "like on-fail"; in this case it would perhaps mean
altering an individual, yet not directly externally configurable,
approach of Pacemaker towards the resource in a particular
circumstance.

(It's part of the conceivability problem amongst different
audiences, since not every internal flag/tracking and conditional
handling maps to a user-configurable item -- the main pillar of
explaining the mechanics to users -- in a particular context.)

> dependent on the same resource.]
> Thanks & cheers

