[ClusterLabs] Antw: Re: Antw: [EXT] Re: Disable all resources in a group if one or more of them fail and are unable to reactivate

Tue Feb 2 02:08:12 EST 2021

>>> Ken Gaillot <kgaillot at redhat.com> wrote on 01.02.2021 at 17:07 in
message
<74ab971aa8450a45099f27ab738fae911c7c7b8d.camel at redhat.com>:
> That's a new one to me. I'm shocked that works ... I'd expect it to be
> detected as a colocation loop and ignored.

I'd also think of it as (an ugly) work-around instead of a "solution".
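
To illustrate why the extra constraint quoted below has this effect, here is
a toy model in Python. It is only a sketch under loud assumptions: it is not
how Pacemaker's scheduler is implemented, it ignores ordering, scores and
loop detection, and the names runnable, group_constraints, "first", "middle"
and "last" are made up for the example. It treats every mandatory colocation
as "X may only be active if Y is active" and repeatedly drops resources whose
colocation target is not active.

```python
def runnable(resources, colocated_with, unplaceable):
    """Return the resources that can stay active in this toy model.

    resources      -- ordered list of resource names
    colocated_with -- dict: resource -> set of resources it must run with
                      (stands in for mandatory, INFINITY-score colocation)
    unplaceable    -- set of resources that have no available node
    """
    active = [r for r in resources if r not in unplaceable]
    changed = True
    while changed:
        changed = False
        for r in list(active):
            # Drop r if anything it must be colocated with is not active.
            if any(dep not in active for dep in colocated_with.get(r, ())):
                active.remove(r)
                changed = True
    return active


def group_constraints(members):
    """Model a group: each later member depends on the member before it."""
    return {later: {earlier} for earlier, later in zip(members, members[1:])}


members = ["first", "middle", "last"]  # hypothetical group members

# Plain group: the last member has no available node.
plain = group_constraints(members)
print(runnable(members, plain, {"last"}))       # ['first', 'middle']

# Same group plus the workaround: "first with last" as a mandatory colocation.
workaround = group_constraints(members)
workaround.setdefault("first", set()).add("last")
print(runnable(members, workaround, {"last"}))  # []
```

In the plain group the earlier members stay active, which matches Ken's
description. With the extra "first with last" dependency the first member
loses its colocation target, and every later member then falls with it. When
nothing has failed, the same model keeps all three members active.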

> 
> On Mon, 2021-02-01 at 12:15 +0100, damiano giuliani wrote:
>> Hi guys, sorry for the late answer. Today I had time to test Igor's
>> solution and it works flawlessly.
>> Creating a colocation constraint that binds the first and the last
>> group resources with an INFINITY score makes it possible to get "If at
>> least one resource in the group fails the group will fail all
>> resources."
>> 
>> Igor's explanation clarified everything for me.
>> 
>> adding this line works for me:
>> 
>> pcs constraint colocation add lta-subscription-backend-ope-s1 with s3srvnotificationdispatcher INFINITY
>> 
>> I would like to thank everyone who helped me and spent their time.
>> 
>> Have a good week!
>> 
>> Best
>> 
>> Damian
>> 
>> On Fri 29 Jan 2021 at 11:22, Ulrich Windl <
>> Ulrich.Windl at rz.uni-regensburg.de> wrote:
>> > >>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 28.01.2021 at
>> > 18:30 in
>> > message <db12df26-6cc4-bad2-8bf5-8ee3aad87533 at gmail.com>:
>> > > 27.01.2021 22:03, Ken Gaillot wrote:
>> > >> 
>> > >> With a group, later members depend on earlier members. If an earlier
>> > >> member can't run, then no members after it can run.
>> > >> 
>> > >> However we can't make the dependency go in both directions. If an
>> > >> earlier member can't run unless a later member is active, and vice
>> > >> versa, then how can anything be started?
>> > >> 
>> > >> By default, Pacemaker tries to recover failed resources on the same
>> > >> node, up to its migration-threshold (which defaults to a million
>> > >> times). Once a group member reaches its migration-threshold, Pacemaker
>> > >> will move the entire group to another node if one is available. However
>> > >> if no node is available for the failed member, then it will just remain
>> > >> stopped (along with any later members in the group), and the earlier
>> > >> members will stay active where they are.
>> > >> 
>> > >> I don't think there's any way to prevent earlier members from running
>> > >> if a later member has no available node.
>> > >> 
>> > > 
>> > > All other HA managers I am aware of have a collection of resources
>> > > (often called "application") as the scheduling unit. All resources in
>> > > one collection are automatically activated on the same node (they
>> > > may, of course, have ordering dependencies). If any required resource
>> > > in the collection fails, the partially active collection is cleaned
>> > > up: all resources activated so far are deactivated. This is indeed
>> > > virtually impossible to express in Pacemaker. The only way I can think
>> > > of is to artificially restrict the management layer to top-level
>> > > resources, but this also won't work for stopping a group of resources
>> > > (where "group" is used generically, not in the narrow Pacemaker sense)
>> > > for the reasons you explained.
>> > 
>> > I just wonder: what about adding op timeouts to a group?
>> > If the group fails to start or stop within the specified time, consider
>> > the whole group as failed...
>> > stop a failed start, and fence a failed stop...
>> > 
>> > Regards,
>> > Ulrich
>> > 
>> > > 
>> > > _______________________________________________
>> > > Manage your subscription:
>> > > https://lists.clusterlabs.org/mailman/listinfo/users 
>> > > 
>> > > ClusterLabs home: https://www.clusterlabs.org/ 
> -- 
> Ken Gaillot <kgaillot at redhat.com>