[ClusterLabs] Ansible role to configure Pacemaker

Jan Pokorný jpokorny at redhat.com
Fri Jun 8 09:52:38 EDT 2018

On 07/06/18 17:57 +0100, Adam Spiers wrote:
> Jan Pokorný <jpokorny at redhat.com> wrote:
>> While I see why Ansible is compelling, I feel it's important to
>> challenge this trend of trying to bend/rebrand _machine-local
>> configuration management tool_ as _distributed system management tool_
>> (pacemaker is distributed application/framework of sorts), which Ansible
>> alone is _not_, as far as I know, hence the effort doesn't seem to be
>> 100% sound (which really matters if reliability is the goal).
> I'm not sure I understand.  Are you saying Ansible is a machine-local
> configuration management tool not a distributed system management
> tool?  Because I don't think that statement is accurate; Ansible was
> absolutely designed from the beginning for orchestrating config
> management over multiple machines (unlike Chef or Puppet).  But as a
> RH employee you must know that already, so I'm probably missing
> something ;-)

I know as little as anyone who doesn't care much about that area,
so yes, I may be inaccurate and will gladly stand corrected and
enlightened.  In part, I am playing devil's advocate here; that is
in line with the precautionary approach that I'd suggest to anyone
taking HA seriously, and at worst, I'll end up being just overly
pessimistic (feel free to shame me, then ;-)
But without the like-minded, we wouldn't be using seatbelts...

>> Once more, this has nothing to do with the announced project, it's
>> just the trending fuss on this topic that indicates me that people
>> independently, as they keenly invent their own wheel (here: Ansible
>> roles), get blind to the fallacy everything must work nicely with
>> multi machine shared-state scenarios like they are used to with
>> single host bootstrapping, without any shortcomings.
> Ansible is not intended purely for single-host bootstrapping.
> But again I'm sure you already know that, so I'm a bit confused what
> your point is here.

To answer with a counter-question: is it then fully qualified to
control distributed systems, where holistic knowledge about the
cluster partitions has to be part of the equation?

>> But there are, and precisely because not the optimal tool for the
>> task gets selected!  Just imagine what would happen if a single
>> machine got configured independently with multiple Ansible actors
>> (there may be mechanisms -- relatively easy within the same host --
>> that would prevent such interferences, but assume now they are not
>> strong enough).
> ICBW but it sounds you are imagining a problem which isn't always
> there, and even when it is there, it's not big enough to justify
> chucking away the other benefits of automating deployment of Pacemaker
> via something like Ansible.  In other words, don't throw the baby out
> with the bathwater[0].
> [0] https://en.wikipedia.org/wiki/Don%27t_throw_the_baby_out_with_the_bathwater

Sorry if it came across like that.  I am just afraid that the possible
shortcomings of such (presumably unattended) automation are not very
apparent to anyone who would just pick an alleged "stock solution for
my task", and in HA, everyone should tread especially carefully, as
mentioned.

> For example I work on a product which uses Ansible running from a
> central node to deploy clusters.  By virtue of the documented contract
> with the customer about what deployment / maintenance procedures are
> supported, we can assume that only one Ansible actor will ever run
> concurrently.  If we are worried that the customer will ignore the
> documentation and take actions we don't support, we can implement some
> kind of simple locking on the deployer node and that's plenty good
> enough.  And yes, this makes the deployer node a SPoF, but again there
> are perfectly acceptable and simple ways to mitigate that issue
> (briefly: make it easy to turn any node into the deployer).

Documenting limitations is vital, and I don't have a single bit against
solutions that underwent such scrutiny to prevent surprises.
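
To make the "simple locking on the deployer node" idea quoted above a
bit more tangible, here is a minimal sketch in Python.  It assumes a
POSIX deployer node and that every actor cooperates by taking the same
lock file; the name deployer_lock and the lock path are hypothetical
and not part of Ansible.  It only serializes actors on that one
machine and says nothing about which cluster partition is talked to.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def deployer_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block.

    Raises BlockingIOError right away if another actor already holds it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # LOCK_NB: fail fast instead of silently queueing a second run
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        # closing the descriptor also releases the lock
        os.close(fd)


if __name__ == "__main__":
    # hypothetical lock location; any path agreed on by all actors will do
    lock_path = os.path.join(tempfile.gettempdir(), "pacemaker-deploy.lock")
    try:
        with deployer_lock(lock_path):
            print("lock held, the playbook would be run here")
    except BlockingIOError:
        print("another deployment is in progress, giving up")
```

The non-blocking variant is a deliberate choice in this sketch: a
second run gets refused outright rather than replayed later against
a cluster state it has not seen.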

> So whilst the concerns you write about here are potentially
> correct from a theoretical perspective, in the real world they are
> most likely not strong enough to prevent us from being interested in
> using (say) Ansible to deploy Pacemaker.
>> What will happen?  Likely some mess-ups will occur as
>> glorified idempotence is hard to achieve atomically.  Voila, inflicted
>> race conditions, one by one, get exercised, until there's enough of
>> bad luck that the rule of idempotence gets broken, just because of
>> these processes emulating a schizophrenic (at the same time
>> multitasking) admin.  Ouch!
>> Now, reflect this onto the situation with possibly concurrent
>> cluster configuration.  One cannot really expect the cluster
>> stack to be bullet-proof against these sorts of mishandling.
>> Single cluster administrator operating at a time?  Ideal!
>> Few administrators presumably with separate areas of
>> configuration interest?  Pacemaker is quite ready.
>> Cluster configuration randomly touched from random node
>> at random time (equivalent of said schizophrenic multitasking
>> administrator with a single host)?  Chances are off in a
>> sufficiently long period when this happens.
>> The solution here is to break that randomness, configuration
>> is modified either:
>> 1. from a single node at a time in the cluster (plus preferably
>>  batching all required changes into a single request)
>> 2. mutual time-critical exclusion of triggering the changes
>>  across the nodes
>> 3. mutual locality-critical exclusion in the subject of the
>>  changes initiated from particular nodes
> It's hard to know exactly what you mean by case 3 here.

Concurrent changes that have no mutual interference whatsoever, e.g.:
- adding a node attribute that nothing depends on (yet)
- increasing number of demoted instances of particular resource

It was meant as a hypothetical solution where nodes are
assigned domains of configuration isolated like this; then
no other forms of mutual exclusion are needed with the
"configuration patching" approach.

But the fact is that almost every change will in one way or another
influence other aspects (meaning that the order matters),
hence "hypothetical".
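
As a rough illustration of what such isolated domains would have to
guarantee, the following Python sketch checks that no two nodes were
assigned overlapping pieces of configuration.  The function name and
the path-like labels are made up for this example and do not
correspond to any real CIB addressing; and, as said, textual
disjointness alone cannot prove the absence of indirect dependencies.

```python
from itertools import combinations


def conflicting_domains(domains):
    """Return (node_a, node_b, shared_items) for every overlapping pair.

    `domains` maps a node name to the set of configuration items that
    only this node is allowed to change.
    """
    conflicts = []
    for (a, items_a), (b, items_b) in combinations(sorted(domains.items()), 2):
        shared = set(items_a) & set(items_b)
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts


if __name__ == "__main__":
    # hypothetical assignment: labels are illustrative only
    assignment = {
        "node1": {"attr/node1/rack"},
        "node2": {"resource/db/instance-count"},
        "node3": {"resource/db/instance-count", "attr/node3/rack"},
    }
    print(conflicting_domains(assignment))
```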

>> Putting 1. and 3. aside as not very interesting (1. means
>> a degenerate case with single point of failure
> I don't think it has to mean that.  It's possible to ensure that
> config is only changed from one node at a time via a tool such as
> Ansible, without hardcoding that to the same node every time.

The semantics are crucial; "config changed" can mean
- a. config change initiated
- b. config change fully delivered to other local pacemaker daemons
- c. config change distributed cluster-wide to respective peer
     daemons (cib/pacemaker-based)
- d. as c. but with per-node finalization as in b.
- ...
- z. any of the previous but without effect on the quorate partition
     (which is what "cluster" refers to at that time) because of
     talking to the node in the wrong cluster partition

Of course, only d. means proper node-based mutual exclusion, and only
then would the follow-up be free to execute from an arbitrary node
(under the same regime).

And if there's no retry-elsewhere logic for when the config change
propagation won't reach c. (and/or no check that indeed the right node
is being communicated with - z.), it's a SPOF in my eyes.
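
The retry-elsewhere logic could look roughly like the Python sketch
below.  All three callbacks are placeholders to be supplied by the
caller (I am not claiming any particular pcs/crmsh/Ansible interface
here), and the change itself is assumed to be idempotent, since it
may get applied through more than one node before it is confirmed.

```python
from typing import Callable, Iterable, Optional


def push_config(
    nodes: Iterable[str],
    is_quorate: Callable[[str], bool],
    apply_change: Callable[[str], bool],
    confirmed_everywhere: Callable[[str], bool],
) -> Optional[str]:
    """Try candidate entry nodes in order until the change is confirmed.

    Returns the node through which the change was confirmed, or None
    when every candidate failed.
    """
    for node in nodes:
        try:
            if not is_quorate(node):
                # case z.: node sits in the wrong (non-quorate) partition
                continue
            if not apply_change(node):
                # the change was refused; nothing got initiated (case a.)
                continue
            if confirmed_everywhere(node):
                # approximates case d.: distributed and finalized per node
                return node
            # initiated but not confirmed: retry through another node,
            # which is why the change must be idempotent
        except OSError:
            # node unreachable, try the next candidate
            continue
    return None
```

With stub callbacks where node1 is not quorate and only node3 ever
confirms, push_config(["node1", "node2", "node3"], ...) would skip
node1, apply through node2 and node3, and return "node3".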

>> and 3. kills
>> the universality), what we get is really a dependency on some
>> kind of distributed lock and/or transactional system.
>> Well, we have just discovered that what we need to automate our
>> predestined configuration in the cluster reliably and without
>> hurting universality (like "breaking the node symmetry")
> What do you mean by node symmetry and why is it important?

Node symmetry gets broken when only a single node is predestined to be
the entry point to the whole cluster for the whole time the
configuration management executes.  The view of the shared state from
this node is not necessarily representative of the objective state of
the cluster.  But as mentioned, it may suffice to say, "this playbook
only deals with the subjective cluster shared state as viewed from the
predestined node, do not confuse it with the actual one as held
by the quorate nodes".

>> is said distributed system management ("orchestration") tool.
>> Has Ansible these capabilities?
> I'm struggling to understand exactly what you mean, but yes I think it
> probably does.
>> Now, one idea there might be to make the tools like pcs compensate
>> for these shortcomings of machine-local configuration management ones.
>> Sounds good, right?  Absolutely not, more like a bad joke!
>> Because what else can it be, the development of orchestration-like
>> features (with all the complexities solved once in corosync/DLM
>> already; relaxing non-dependency on the very subject of management
>> may not be wise) on top of regular high-level cluster management tool
>> only[*] to bridge the gap in something that is simply subpar fit
>> in distributed environments to begin with?
>> As Czech proverb puts it: think twice, act once.
> Here you seem to be assuming that an Ansible Pacemaker role would have
> to be used in a automated, fully orchestrated scenario where cluster
> config is being managed by multiple nodes in a way which requires some
> complex consensus model.  Can you give an example of why would anyone
> need to do that?

That was rather a mental over-approximation based on the lack of
explicitly stated limitations.  I can't really imagine what
combinations could be encountered, accidentally or in naive good faith.

>> [*] non-automated/human-triggered usage is generally fine as it's
>>   highly unlikely none of 1.-3. would be satisfied, so there
>>   would be next to no gain for these workflows
> OK, so maybe we are agreed after all.  But if you acknowledge that
> manually triggered usage is generally safe, then perhaps you shouldn't
> also assume that "people [...] get blind to the fallacy everything
> must work nicely with multi machine shared-state scenarios" ;-)

Perhaps the main point I want to make is:
So far, mostly fully manual (or semi-manual if crmsh/pcs is used)
deployments and configuration management have worked well, because
- timing was relaxed, hence the order of events was relatively stable
- mutual exclusion as in 1. - 3. was naturally present
- one was possibly ready to react whenever something did not fit
Now, configuration management tools enter the scene and people
want to use the very same procedures in an automated way to conquer
such a distributed system, consequently pushing timing, ordering and
possibly concurrency to the limits, and my gut feeling is that it
cannot be performed just in this naive way if it is to work reliably
(I am positive there are cases where it'll break; call them design
flaws in the infrastructure code, perhaps sometimes rightfully, but
such on-the-edge usage was likely never considered or just neglected).

And it'll become a real problem if no one knowledgeable is around to
react (which is IMHO the current automation/devops trend, incl.
solving tasks just by reusing publicly shared routines mindlessly, or
even mindfully, but without detailed limitations the outcome may
be the same).
