[ClusterLabs] Fuzzy/misleading references to "restart" of a resource
jpokorny at redhat.com
Thu Dec 5 08:55:14 EST 2019
On 05/12/19 10:41 +0300, Andrei Borzenkov wrote:
> On Thu, Dec 5, 2019 at 1:04 AM Jan Pokorný <jpokorny at redhat.com> wrote:
>> On 04/12/19 21:19 +0100, Jan Pokorný wrote:
>>> OTOH, this enforced split of state transitions is perhaps what makes
>>> the transaction (comprising perhaps countless other interdependent
>>> resources) serializable and thus feasible at all (think: you cannot
>>> nest any further handling -- so as to satisfy given constraints -- in
>>> between stop and start when that's an atom, otherwise), and that's
>>> exactly how, say, systemd approaches that, likely for that very reason:
>> Yet, systemd started to allow for certain stop-start ("restart")
>> optimizations at "stop" phase, I've just learnt:
>> But it doesn't merge/atomicize the two discrete steps, still.
> systemd development consists of series of ad hoc single use case
> extensions, each done completely isolated, without considering impact
> on other parts which is usually "fixed" by adding yet another ad hoc
> extension. I do not think that is the best example to follow.
Didn't mean to run into this debate; I noticed this was perhaps
mainly to satisfy their in-project services, but nonetheless, the
pragmatic value for a wide audience here is that any "why being
stopped?" discrimination is now possible, lending itself to a
"restart optimization enabler" label should that be handy.
Re style of evolutionary additions that are perhaps too tunnel-visioned,
you'll find examples everywhere, incl. ClusterLabs/cluster projects :-)
A common problem appears to be a lack of sufficiently
formalized/documented (as if it wasn't proprietary knowledge but
rather a fully baked programming interface) intermediate
representations (next to some further confinements related to
transitioning from one set of states to another), easy to externalize
for immediate feedback ("state dump") and to assess input-to-output
transformation correctness (ad-hoc or unit testing), to assist
thinking both in low-level isolated realms and in the higher-level
architectural perspective (how the primitive "components" fit
together). Another way of thinking about this is
a directly observable "full state buffer", that would naturally tend
to prevent code-degrading on-the-fly and ad-hoc merging of what are
individual phases. Admittedly without deeper knowledge, I consider
this something that, for instance, the LLVM project got intriguingly
and intrinsically right, and that's perhaps where to take a better
>> OCF could possibly be amended to allow for a similar semantic
>> indication of "stop to be reversed shortly on this very node if
>> things go well" if there was a tangible use case, say using
>> "stop-with-start-pending" action instead of "stop"
>> (and the amendment possibly building on an idea of addon profiles
>> https://github.com/ClusterLabs/OCF-spec/issues/17 if there was
>> an actual infrastructure for that and not just a daydreaming).
> I do not see how it is possible to shorthand resource restart. Cluster
> resource manager manages not isolated resources, but groups of
> interdependent resources. In general it is impossible to restart
> single resource without coordinate restart of multiple resources. And
> this should happen in defined order (you cannot "restart" mount point
> without stopping any user of it first).
> Moreover, restart is expected to clean up resources and actually
> result in pristine state. This is implicit assumption.
I tend to agree, but I am far from being a creative author of
resource agents or a service life-cycle focused person. That was more
to cater for hypothetical optimizations that were once considered, see
the referred scenario I linked up-thread:
(I dare not evaluate the value it would bring or not).
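
To illustrate the ordering point quoted above (a mount point cannot be
"restarted" without first stopping its users), here is a minimal Python
sketch. It is not how Pacemaker's scheduler actually works; the
restart_plan helper and the resource names are hypothetical, and the
only assumption is that dependencies are given as a plain mapping from
a resource to the resources it requires. It shows a "restart" expanding
into discrete stop and start steps across all dependent resources.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def restart_plan(depends_on, target):
    """Expand "restart target" into discrete, ordered stop/start actions.

    depends_on maps each resource to the resources it requires
    (hypothetical input format, chosen for this sketch only).
    """
    # Collect the target plus everything that transitively depends on it.
    affected = {target}
    changed = True
    while changed:
        changed = False
        for res, deps in depends_on.items():
            if res not in affected and affected & set(deps):
                affected.add(res)
                changed = True

    # Restrict the graph to affected resources; dependencies start first.
    graph = {
        res: [d for d in depends_on.get(res, ()) if d in affected]
        for res in affected
    }
    start_order = list(TopologicalSorter(graph).static_order())

    # Stops run in reverse start order, and only then do the starts run.
    stops = [("stop", res) for res in reversed(start_order)]
    starts = [("start", res) for res in start_order]
    return stops + starts


# Hypothetical resources: webapp needs database, database needs fs_mount.
deps = {"fs_mount": [], "database": ["fs_mount"], "webapp": ["database"]}
plan = restart_plan(deps, "fs_mount")
for action, resource in plan:
    print(action, resource)
```

Because the stop and start of fs_mount remain two separate steps, the
stops and starts of its dependents can be placed around them, which is
the serialization argument made earlier in the thread.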