[ClusterLabs] Ansible role to configure Pacemaker

Thu Jun 7 13:17:24 EDT 2018

On 07/06/18 11:08 -0400, Styopa Semenukha wrote:
> Thank you for your thoughts, Jan! I agree with the importance of the
> topics you raised, and I'd like to comment on them in the light of
> our project (and configuration management approach in general).
> 
> On 06/06/2018 08:26 PM, Jan Pokorný wrote:
>> On 07/06/18 02:19 +0200, Jan Pokorný wrote:
>>> While I see why Ansible is compelling, I feel it's important to
>>> challenge this trend of trying to bend/rebrand a _machine-local
>>> configuration management tool_ as a _distributed system management tool_
>>> (pacemaker is a distributed application/framework of sorts), which Ansible
>>> alone is _not_, as far as I know, hence the effort doesn't seem to be
>>> 100% sound (which really matters if reliability is the goal).
>>> 
>>> Once more, this has nothing to do with the announced project, it's
>>> just the trending fuss on this topic that indicates to me that people
>>> independently, as they keenly invent their own wheel (here: Ansible
>>> roles), get blind to the fallacy that everything must work as nicely
>>> with multi-machine shared-state scenarios as they are used to with
>>> single-host bootstrapping, without any shortcomings.
> 
> I can't entirely agree on this. The solution we're suggesting is
> built specifically to address this concern. In the taxonomy you
> linked, it would probably be type 2B, and here's why.

Thanks for the response.  Looks like there could have been a bit of
ignorance on my side wrt. the scope of your project.  I meant to
address a more general clashing pattern that I am observing, mostly
setting up a cluster from multiple nodes simultaneously.

>>> But there are, and precisely because a suboptimal tool for the
>>> task gets selected!  Just imagine what would happen if a single
>>> machine got configured independently with multiple Ansible actors
>>> (there may be mechanisms -- relatively easy within the same host --
>>> that would prevent such interferences, but assume now they are not
>>> strong enough).  What will happen?  Likely some mess-ups will occur as
>>> glorified idempotence is hard to achieve atomically.  Voila, inflicted
>>> race conditions, one by one, get exercised, until there's enough of
>>> bad luck that the rule of idempotence gets broken, just because of
>>> these processes emulating a schizophrenic (at the same time
>>> multitasking) admin.  Ouch!
> 
> This situation is actually altering the rules of the game as we
> play.  Configuration management is a technical solution, it was
> never meant to solve administrative (i.e. human-centered) problems.
> No atomicity will safeguard us from another admin deciding to reboot
> the hypervisor with my host. Idempotence is a relative concept, and
> it's relative to one person/entity. If I run the same playbook
> again, any time, any number of times, the result will be the same.
> 
> However, if another actor is involved, unsurprisingly, their mileage
> will vary, and so will mine. What happens if two admins add the same
> host to two Kubernetes/Heat/Ansible environments? That's the same
> situation. And I'm not even trying to solve this type of situation.

See the above note.  You may not have a cluster-wide multi-sourced
run of Ansible routines in mind, but others may (not sure, it may
actually be a common flow: single recipe x multiple machines), and
that's where the slippery slope starts.  Conversely, using a
single-sourced run makes your automation non-HA, which, again, may
or may not be acceptable.

>>> Now, reflect this onto the situation with possibly concurrent
>>> cluster configuration.  One cannot really expect the cluster
>>> stack to be bullet-proof against these sorts of mishandling.
>>> Single cluster administrator operating at a time?  Ideal!
>>> Few administrators presumably with separate areas of
>>> configuration interest?  Pacemaker is quite ready.
>>> Cluster configuration randomly touched from random node
>>> at random time (equivalent of said schizophrenic multitasking
>>> administrator with a single host)?  Over a sufficiently long
>>> period, chances are that something will go wrong.
>>> 
>>> The solution here is to break that randomness, configuration
>>> is modified either:
>>> 1. from a single node at a time in the cluster (plus preferably
>>>    batching all required changes into a single request)
>>> 2. mutual time-critical exclusion of triggering the changes
>>>    across the nodes
>>> 3. mutual locality-critical exclusion in the subject of the
>>>    changes initiated from particular nodes
>>> 
>>> Putting 1. and 3. aside as not very interesting (1. means
>>> a degenerate case with single point of failure, and 3. kills
>>> the universality), what we get is really a dependency on some
>>> kind of distributed lock and/or transactional system.
>>> Well, we have just discovered that what we need to automate our
>>> predestined configuration in the cluster reliably and without
>>> hurting universality (like "breaking the node symmetry") is
>>> said distributed system management ("orchestration") tool.
>>> Does Ansible have these capabilities?
> 
> Correct, all these capabilities are already there, let me explain.
> 
> Firstly, as you pointed out in #1, the CIB configuration section is
> run on a single node, Ansible's `run_once` makes sure of that.

This presumes a single, sole initiating source, right?
That might be sufficient, admittedly.  But then, there's another
problem:  how do you make sure you'll select a healthy, quorate
node to control the cluster on your/playbook's behalf?
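
To make the question concrete, here is a minimal sketch of such a
selection step.  It assumes each candidate node has already been
probed with `crm_node -q` (which prints 1 if the local partition has
quorum and 0 otherwise); the helper names and the probe mapping are
hypothetical, and the check is only a snapshot, so quorum may be lost
between the probe and the actual change.

```python
from typing import Mapping, Optional


def has_quorum(crm_node_output: str) -> bool:
    """Interpret the text printed by `crm_node -q` (1 = quorate, 0 = not)."""
    return crm_node_output.strip() == "1"


def pick_control_node(outputs: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the first node, in name order, that reported quorum.

    `outputs` maps a node name to what its `crm_node -q` printed,
    or to None when the node could not be reached at all.
    """
    for node in sorted(outputs):
        out = outputs[node]
        if out is not None and has_quorum(out):
            return node
    return None  # no healthy, quorate node found: do not touch the cluster


if __name__ == "__main__":
    # Hypothetical probe results: node1 unreachable, node2 inquorate.
    probe = {"node1": None, "node2": "0\n", "node3": "1\n"}
    print(pick_control_node(probe))
```

Returning None rather than falling back to an arbitrary node is
deliberate: with no quorate candidate, the safer move is to abort.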

Moreover, using just the cibadmin command may be fine, but putting other
CLI commands into the mix may not guarantee the full synchronicity
(ACID-like transactional behaviour) that may be expected in the case of
a single-sourced run, simply because of the generally asynchronous
arrangement of multiple parts of pacemaker.  Be warned!

> Additionally, all required changes *are* in fact batched into a
> single request: as I mentioned, changes are made to an XML dump,
> which gets verified and pushed to the cluster using the
> vendor-approved method (cibadmin --replace).

Just FYI, there's an alternative approach akin to #3, using
crm_diff and cibadmin --patch, which makes a closure over what's
to be changed, hence making it local and more likely non-interfering
with other changes made to the cluster in the interim.  But that's
just a probability play; no strong guarantees are implied.
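
The difference can be illustrated with a toy model, where the
configuration is a plain dictionary rather than CIB XML.  This is only
a sketch of the replace-versus-patch idea: `make_patch` and
`apply_patch` are hypothetical helpers, and nothing here reflects the
actual crm_diff output format or the conditions under which the
cluster rejects a patch.

```python
from typing import Dict, Optional

Config = Dict[str, str]
Patch = Dict[str, Optional[str]]  # None marks a deletion


def make_patch(original: Config, new: Config) -> Patch:
    """Record only the keys that differ between the snapshot and our edit."""
    patch: Patch = {}
    for key in original.keys() | new.keys():
        if original.get(key) != new.get(key):
            patch[key] = new.get(key)
    return patch


def apply_patch(live: Config, patch: Patch) -> Config:
    """Apply the recorded differences on top of whatever is live now."""
    result = dict(live)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    snapshot = {"vip": "10.0.0.1"}                # what we dumped
    ours = {"vip": "10.0.0.1", "web": "apache"}   # our edited copy
    live = {"vip": "10.0.0.1", "db": "pgsql"}     # changed in the interim

    replaced = dict(ours)                          # whole-config replace
    patched = apply_patch(live, make_patch(snapshot, ours))
    print("replace keeps db:", "db" in replaced)
    print("patch keeps db:", "db" in patched)
```

If both parties touch the same key, the later patch still silently
wins in this model, which is the "probability play" part.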

> Secondly, as you suggest in #2, CIB schema has this feature built
> in, it's `admin_epoch` property. The cluster will reject XML older
> than the one it runs. And our role makes sure it gets incremented
> whenever changes are made.  Therefore, if other (valid) changes have
> been made, the playbook will fail until you rerun it without
> conflicts. Pretty much like Git requiring you to rebase/merge before
> you push.

Yes, in the case of a multi-sourced (semi-distributed) run, one change
is always going to win and the other, serialized later, is going to
lose, unless it's a patch-based modification.
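
For illustration, the version check can be sketched as follows.  This
assumes the CIB root element carries `admin_epoch`, `epoch` and
`num_updates` attributes that are compared in that order as a tuple;
the function names are hypothetical and the sketch only mimics the
"reject anything older" rule, not the cluster's actual code path.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

Version = Tuple[int, int, int]


def cib_version(cib_xml: str) -> Version:
    """Read (admin_epoch, epoch, num_updates) from the CIB root element."""
    root = ET.fromstring(cib_xml)
    return (
        int(root.get("admin_epoch", "0")),
        int(root.get("epoch", "0")),
        int(root.get("num_updates", "0")),
    )


def bump_admin_epoch(cib_xml: str) -> str:
    """Return a copy of the XML with admin_epoch incremented by one."""
    root = ET.fromstring(cib_xml)
    root.set("admin_epoch", str(int(root.get("admin_epoch", "0")) + 1))
    return ET.tostring(root, encoding="unicode")


def would_be_rejected(candidate_xml: str, live_xml: str) -> bool:
    """True if the candidate is older than what the cluster runs."""
    return cib_version(candidate_xml) < cib_version(live_xml)


if __name__ == "__main__":
    live = '<cib admin_epoch="3" epoch="10" num_updates="0"/>'
    stale = '<cib admin_epoch="2" epoch="50" num_updates="4"/>'
    print(would_be_rejected(stale, live))
    print(would_be_rejected(bump_admin_epoch(live), live))
```

Note that a high `epoch` does not rescue the stale copy here, because
`admin_epoch` is compared first.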

>>> Now, one idea there might be to make the tools like pcs compensate
>>> for these shortcomings of machine-local configuration management ones.
>>> Sounds good, right?  Absolutely not, more like a bad joke!
>>> Because what else can it be, the development of orchestration-like
>>> features (with all the complexities solved once in corosync/DLM
>>> already; relaxing non-dependency on the very subject of management
>>> may not be wise) on top of regular high-level cluster management tool
>>> only[*] to bridge the gap in something that is simply subpar fit
>>> in distributed environments to begin with?
> 
> In my understanding pcs was designed to make manual configuration more
> user-friendly, not as an orchestration tool.

Exactly; both this and crmsh may add some slight improvements towards
the mentioned single-sourced synchronicity (e.g. --wait with pcs) and
elsewhere, although they can be a source of new classes of issues: for
instance, when you conveniently omit the ID of some entity, a new unique
one can get chosen, resulting in redundancy that may also have
unintended effects.  In that case, the rule of idempotence is clearly
broken -- who will realize this ahead of time is questionable, though.
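
A toy model shows the effect.  `ToyConfig` is a hypothetical stand-in
for a configuration store keyed by entity ID; it is not how pcs or
crmsh are implemented, only a sketch of why an auto-generated ID turns
a repeated run into a duplicate.

```python
import itertools
from typing import Dict, Optional


class ToyConfig:
    """Hypothetical configuration store keyed by entity ID."""

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}
        self._ids = itertools.count(1)

    def add(self, definition: str, entity_id: Optional[str] = None) -> str:
        if entity_id is None:
            # Convenience path: pick a fresh ID that is not in use yet.
            entity_id = f"auto-{next(self._ids)}"
            while entity_id in self.entries:
                entity_id = f"auto-{next(self._ids)}"
        # With an explicit ID, a repeated call overwrites the same entry.
        self.entries[entity_id] = definition
        return entity_id


if __name__ == "__main__":
    explicit = ToyConfig()
    implicit = ToyConfig()
    for _ in range(2):  # the same "playbook" run twice
        explicit.add("web prefers node1", entity_id="loc-web-node1")
        implicit.add("web prefers node1")
    print(len(explicit.entries), len(implicit.entries))
```

The second run is a no-op only when the ID is given explicitly; with
the ID omitted, every run adds another copy of the same definition.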

And still, 100% non-clashing automated and semi-distributed
configuration management can hardly be achieved without having
additional restrictions (like #1-3, plus things like double-checking
that the intended effect [and nothing more] was propagated before
proceeding further) in place.  An orchestration tool is what might
help here; otherwise one is on her own, with Ansible or not.

Speaking of that, there's a project with very interesting focal points
in the configuration management space (though it may slightly overlap
with the mission of pacemaker), like reactive behaviour, embracing the
distributed operation, and more.  The downside is that it's still
immature.  Link: https://github.com/purpleidea/mgmt

> Anyhow, I do appreciate your opinion and agree with the overall idea on
> orchestration/configuration problem. Thank you for the insights.

I just don't want people to be eventually disappointed about
compromised _high availability_ when they get the false impression
that upstream fully approves arbitrary combinations of configuration
management automation, especially machine-local tools like Ansible,
with cluster deployments, just because no concerns get raised.
(Such deployments are inherently state-sharing; this would be a
non-issue if they weren't, but then it would be "just a bunch of
machines" :-)

Discretion is needed; this is quite a minefield, and we cannot
talk about truly universal solutions unless cluster-wide
restrictions get enforced (orchestration and beyond).

-- 
Poki