[ClusterLabs] Interleaving clones with different number of instances per node

Ken Gaillot kgaillot at redhat.com
Thu Oct 17 16:21:34 UTC 2024


On Thu, 2024-10-17 at 17:51 +0200, Jochen wrote:
> Thanks very much for the info and the help.
> 
> For my problem I have decided to use systemd and its dependency
> management to create an "instance" service for each clone instance,
> which then starts the "main" service on a node if any instance
> service is started. Hopefully this works without any coding
> required...
> 
> I might still have a go at extending the attribute resource. One
> question though: Is incrementing and decrementing the count robust
> enough? I would think a solution that actually counts the running
> instances each time, so we don't get any drift, would be preferable.
> But what would be the best way for the agent to get this count?
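
Regarding the systemd route: if I understand the plan correctly, the
wiring would look roughly like this (unit and path names here are
made up, purely for illustration):

# r1-instance@.service -- one templated unit per clone instance
[Unit]
Description=R1 instance %i
# pull in the shared service whenever any instance starts
Wants=r2-main.service
After=r2-main.service

[Service]
ExecStart=/usr/local/bin/r1-instance %i

# r2-main.service -- runs while any R1 instance is active
[Unit]
Description=R2, needed by R1 instances
# stop automatically once nothing active wants this unit anymore
StopWhenUnneeded=yes

[Service]
ExecStart=/usr/local/bin/r2-main

With that, systemd tracks how many instances still need the main
service, so no explicit counting is needed on your side.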

Incrementing and decrementing should be sufficient in general. If an
instance crashes without decrementing the count, Pacemaker will still
run its stop action as part of recovery, which brings the count back
in line.
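
For example, a bare-bones sketch of what the start and stop actions
might do (the attribute name is arbitrary and error handling is
omitted):

# start: bump the per-node transient attribute
count=$(crm_attribute -l reboot -n r1-active --query --quiet 2>/dev/null)
crm_attribute -l reboot -n r1-active -v $(( ${count:-0} + 1 ))

# stop: decrement, dropping the attribute when the last instance stops
count=$(crm_attribute -l reboot -n r1-active --query --quiet 2>/dev/null)
if [ "${count:-0}" -le 1 ]; then
    crm_attribute -l reboot -n r1-active --delete
else
    crm_attribute -l reboot -n r1-active -v $(( count - 1 ))
fi

The read-modify-write above is racy if two instances stop at the same
time, which is one more reason to consider ordered=true on the clone
(mentioned in my earlier mail below).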

The main opportunity for trouble would be an instance started outside
Pacemaker control. Pacemaker would detect it and either stop it
(decrementing when we shouldn't) or leave it alone (not incrementing
when we should).

To count each time instead, probably the best way would be to look for
state files with instance numbers.
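
For example, if each instance's start action drops a state file and
its stop action removes it (paths are just for illustration; the
instance number is available to the agent as
OCF_RESKEY_CRM_meta_clone):

# start
touch "/run/R1-${OCF_RESKEY_CRM_meta_clone}.state"

# stop
rm -f "/run/R1-${OCF_RESKEY_CRM_meta_clone}.state"

# wherever the agent needs the number of running instances
count=$(ls /run/R1-*.state 2>/dev/null | wc -l)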

> 
> > On 17. Oct 2024, at 16:50, Ken Gaillot <kgaillot at redhat.com> wrote:
> > 
> > On Thu, 2024-10-17 at 16:34 +0200, Jochen wrote:
> > > Thanks for the help!
> > > 
> > > Before I break out my editor and start writing custom resource
> > > agents, one question: Is there a way to use a cloned
> > > ocf:pacemaker:attribute resource to set a clone-specific attribute
> > > on a node? I.e. attribute "started-0=1" and "started-1=1",
> > > depending on the clone ID? For this I would need e.g. a rule to
> > > configure a clone-specific resource parameter, or is there
> > > something like variable substitution in resource parameters?
> > 
> > No, ocf:pacemaker:attribute won't work properly as a unique clone.
> > If one instance is started and another stopped, it will get the
> > status of one of them wrong.
> > 
> > I noticed that yesterday and came up with an idea for a general
> > solution if you feel like tackling it:
> > 
> > https://projects.clusterlabs.org/T899
> > 
> > > > On 16. Oct 2024, at 16:22, Ken Gaillot <kgaillot at redhat.com>
> > > > wrote:
> > > > 
> > > > On Mon, 2024-10-14 at 18:49 +0200, Jochen wrote:
> > > > > Hi, I have two cloned resources in my cluster that have the
> > > > > following properties:
> > > > > 
> > > > > * There is a maximum of two instances of R1 in the cluster,
> > > > >   with a maximum of two per node
> > > > > * When any instance of R1 is started on a node, exactly one
> > > > >   instance of R2 should run on that node
> > > > > 
> > > > > When I configure this, and verify the configuration with
> > > > > "crm_verify -LV", I get the following error:
> > > > > 
> > > > > clone_rsc_colocation_rh)    error: Cannot interleave R2-clone
> > > > > and R1-clone because they do not support the same number of
> > > > > instances per node
> > > > > 
> > > > > How can I make this work? Any help would be greatly
> > > > > appreciated.
> > > > 
> > > > Hi,
> > > > 
> > > > I believe the number of instances has to be the same because
> > > > each instance pair on a single node is interleaved.
> > > > 
> > > > There's no direct way to configure what you want, but it might
> > > > be possible with a custom OCF agent for R1 and attribute-based
> > > > rules.
> > > > 
> > > > On start, the R1 agent could set a custom node attribute to some
> > > > value. On stop, it could check whether any other instances are
> > > > active (assuming that's possible), and if not, clear the
> > > > attribute. Then, R2 could have a location rule enabling it only
> > > > on nodes where the attribute has the desired value.
> > > > 
> > > > R2 wouldn't stop until *after* the last instance of R1 stops,
> > > > which could be a problem depending on the particulars of the
> > > > service. There might also be a race condition if two instances
> > > > are stopping at the same time, so it might be worthwhile to set
> > > > ordered=true on the clone.
> > > > 
> > > > > Current configuration is as follows:
> > > > > 
> > > > > <resources>
> > > > > <clone id="R1-clone">
> > > > >   <meta_attributes id="R1-clone-meta_attributes">
> > > > >     <nvpair name="interleave" value="true" id="R1-clone-meta_attributes-interleave"/>
> > > > >     <nvpair name="resource-stickiness" value="0" id="R1-clone-meta_attributes-resource-stickiness"/>
> > > > >     <nvpair name="clone-max" value="2" id="R1-clone-meta_attributes-clone-max"/>
> > > > >     <nvpair name="clone-node-max" value="2" id="R1-clone-meta_attributes-clone-node-max"/>
> > > > >     <nvpair name="globally-unique" value="true" id="R1-clone-meta_attributes-globally-unique"/>
> > > > >     <nvpair name="failure-timeout" value="300" id="R1-clone-meta_attributes-failure-timeout"/>
> > > > >     <nvpair name="migration-threshold" value="0" id="R1-clone-meta_attributes-migration-threshold"/>
> > > > >     <nvpair id="R1-clone-meta_attributes-target-role" name="target-role" value="Stopped"/>
> > > > >   </meta_attributes>
> > > > > </clone>
> > > > > <clone id="R2-clone">
> > > > >   <meta_attributes id="R2-clone-meta_attributes">
> > > > >     <nvpair name="interleave" value="true" id="R2-clone-meta_attributes-interleave"/>
> > > > >     <nvpair name="resource-stickiness" value="0" id="R2-clone-meta_attributes-resource-stickiness"/>
> > > > >     <nvpair name="failure-timeout" value="300" id="R2-clone-meta_attributes-failure-timeout"/>
> > > > >     <nvpair name="migration-threshold" value="0" id="R2-clone-meta_attributes-migration-threshold"/>
> > > > >     <nvpair id="R2-clone-meta_attributes-target-role" name="target-role" value="Stopped"/>
> > > > >   </meta_attributes>
> > > > > </clone>
> > > > > </resources>
> > > > > 
> > > > > <constraints>
> > > > > <rsc_order id="R1-order" kind="Mandatory" first="R1-clone" then="R2-clone"/>
> > > > > <rsc_colocation id="R2-clone-colocation" score="INFINITY" rsc="R2-clone" with-rsc="R1-clone"/>
> > > > > </constraints>
> > > > > 
> > > > > 
> > > > -- 
> > > > Ken Gaillot <kgaillot at redhat.com>
> > > > 
> > -- 
> > Ken Gaillot <kgaillot at redhat.com>
> > 
> 
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot <kgaillot at redhat.com>


