[Pacemaker] Concurrent runs of 'crm configure primitive' interfering

Dejan Muhamedagic dejanmm at fastmail.fm
Mon Oct 3 12:58:15 EDT 2011


On Wed, Sep 28, 2011 at 10:52:16AM -0400, Brian J. Murrell wrote:
> On 11-09-28 10:20 AM, Dejan Muhamedagic wrote:
> > Hi,
> 
> Hi,
> 
> > I'm really not sure. Need to investigate this area more.
> 
> Well, I am experimenting with cibadmin.  It's certainly not as nice and
> shiny as crm shell though.  :-)
> 
> > cibadmin talks to the cib (the process) and cib should allow
> > only one writer at a time.
> 
> Good.  That's needed of course.  But what does it do with other
> attempting writers?  Do they block until the CIB is available to write,
> or does it turn the attempted writes away with an error?

Hmm, don't know.

> > The shell keeps the changes in its memory until the user says
> > commit (or if it's a single-shot configure command). Just before
> > doing the commit, it checks (using cibadmin) if the CIB changed
> > in the meantime (i.e. since it was last loaded or refreshed
> > in crm) and if so it refuses to commit changes.
> 
> Ahhhh.

Hope that everything is OK over there :)

> > That is,
> > _unless_ it is forced to do so. So, if you use the -F option,
> > one crm instance is likely to override changes of another crm
> > instance or, for that matter, of anybody else.
> 
> But is crm writing (i.e. replacing) entire CIBs or just updating
> fragments of it, like the resources and constraints, etc. it's being
> asked to operate on by the user?

The entire CIB. It used to write only the changed elements, but
then everybody agreed that it is too complex to keep dependencies
satisfied at all times.
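
To make the commit check and the effect of forcing concrete, here is a
minimal Python sketch. It is not the crm shell's actual code: FakeCib,
ShellSession and StaleCibError are hypothetical names, and the CIB is
modelled as an in-memory dict with a single integer epoch instead of
being queried through cibadmin.

```python
import copy


class StaleCibError(Exception):
    """Raised when the CIB changed after the session loaded it."""


class FakeCib:
    """Hypothetical stand-in for the cluster CIB: an epoch plus a config dict."""

    def __init__(self):
        self.epoch = 0
        self.config = {}

    def replace(self, new_config):
        # The whole configuration is replaced, never a fragment.
        self.config = copy.deepcopy(new_config)
        self.epoch += 1


class ShellSession:
    """Hypothetical model of one crm instance holding changes in memory."""

    def __init__(self, cib):
        self.cib = cib
        self.refresh()

    def refresh(self):
        # Remember which epoch our in-memory copy is based on.
        self.loaded_epoch = self.cib.epoch
        self.config = copy.deepcopy(self.cib.config)

    def commit(self, force=False):
        # Refuse to commit if somebody else wrote since we loaded, unless forced.
        if self.cib.epoch != self.loaded_epoch and not force:
            raise StaleCibError("CIB changed since it was loaded")
        self.cib.replace(self.config)
        self.loaded_epoch = self.cib.epoch


cib = FakeCib()
a, b = ShellSession(cib), ShellSession(cib)
a.config["rsc_a"] = "primitive"
b.config["rsc_b"] = "primitive"

a.commit()
try:
    b.commit()
    refused = False
except StaleCibError:
    refused = True

# Forcing succeeds, but rsc_a is lost because b writes its whole copy.
b.commit(force=True)
```

In this model the second session is refused, and once it is forced the
first session's resource disappears, even though the two changes did
not overlap.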

> If the latter, then two crm instances that are forced to write
> non-overlapping fragments should result in both being successful, if the
> cib is locking out concurrent cibadmin writers the way it should be, yes?

Yes, but there could be a time frame when crm thinks that it can
write the configuration. However, if the epoch changed in the
meantime, then that write should fail.
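
As an illustration of what "the epoch changed" means, the sketch below
reads the version attributes from the <cib> root element and compares
them. The attribute names admin_epoch, epoch and num_updates are the
ones the CIB carries. The helper names and the choice to compare only
the first two are assumptions of mine, not necessarily what crm does.

```python
import xml.etree.ElementTree as ET


def cib_version(cib_xml):
    """Return (admin_epoch, epoch, num_updates) from the <cib> root element."""
    root = ET.fromstring(cib_xml)
    return tuple(int(root.get(name, "0"))
                 for name in ("admin_epoch", "epoch", "num_updates"))


def changed_since(loaded_xml, current_xml):
    """True if the configuration version moved on since it was loaded."""
    # Assumption: only admin_epoch and epoch are compared here, on the
    # understanding that num_updates also moves with status-only updates.
    return cib_version(loaded_xml)[:2] != cib_version(current_xml)[:2]


# Hypothetical snapshots: what the shell loaded vs. what the cluster has now.
loaded = '<cib admin_epoch="0" epoch="41" num_updates="3"/>'
current = '<cib admin_epoch="0" epoch="42" num_updates="0"/>'
stale = changed_since(loaded, current)
```

The XML strings would in practice come from a CIB query (for example
the output of cibadmin -Q). Here they are hard-coded so the comparison
can be tried without a cluster.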

> > In short, having more than one crm instance trying to modify the
> > configuration simultaneously probably won't give good results.
> 
> As long as they are making non-colliding changes, shouldn't they both be
> successful?

crm writes the whole CIB.

> > And the matter is simple: If the cluster CIB changed since the
> > crm itself accepted configuration modifications, there's no way
> > to say which changes should take precedence and there's no
> > obvious way to merge the changes coming from two different
> > sources.
> 
> Indeed, assuming they conflict.  But if they don't, there shouldn't be
> any problem with two crms working on independent resources and
> constraints, yes?

If you give me a patch which makes sure that a CIB change in
the meantime doesn't affect the change done by the user in crm,
perhaps I'll consider applying it. I guess that is possible,
since the shell keeps track of which elements changed, though
not of how they changed. Then we'd need to switch back
again to applying smaller changes to the CIB. If that is
possible at all. At any rate, it's quite an undertaking. Now,
this may be going too far... The CIB was not meant to be a real
distributed database.
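
For anybody who wants to try, the core of such a patch would be some
kind of three-way merge keyed on element ids. The following is only a
sketch of that idea, not existing crm code: merge_changes is a
hypothetical helper, configurations are plain dicts of id to
definition, and dependencies between elements (a constraint
referring to a resource, say) are ignored completely.

```python
def merge_changes(base, mine, theirs):
    """Three-way merge of configurations keyed by element id.

    base   -- the configuration as it was when the shell loaded it
    mine   -- the shell's modified copy
    theirs -- the configuration currently in the cluster
    Returns (merged, conflicts). An id is in conflict when both sides
    changed it relative to base and ended up with different values.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m          # same on both sides (or deleted on both)
        elif m == b:
            value = t          # only the other writer changed it
        elif t == b:
            value = m          # only we changed it
        else:
            conflicts.append(key)
            value = t          # keep the cluster's version; caller decides
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Two writers adding independent (made-up) resources to the same base.
base = {"vip": "IPaddr2 ip=10.0.0.1"}
mine = dict(base, fs_a="Filesystem device=/dev/sda1")
theirs = dict(base, fs_b="Filesystem device=/dev/sdb1")
merged, conflicts = merge_changes(base, mine, theirs)
```

With non-overlapping additions as above, both new elements end up in
the merged result and the conflict list stays empty. The hard part,
which this sketch skips, is checking that the merged configuration
still has all its dependencies satisfied.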

> > What's your use case?
> 
> We're using tools to drive HA configuration where those tools go out to
> the various nodes in the cluster and perform configuration tasks,
> possibly and probably in parallel, one of which is to issue the crm
> commands to configure the resources and constraints that that node will
> primarily be responsible for.

Well, that may be a good use case, but we may not be that well
equipped for such a scenario. Why not do all the changes on
one node? Shouldn't it have all the information it needs? Or are
the configuration commands issued based on some other local
state?

Thanks,

Dejan

> b.
> 


