[ClusterLabs] Re: [EXT] Coming in Pacemaker 2.1.0: noncritical resources

Fri Jan 22 11:13:16 EST 2021

On Fri, 2021-01-22 at 08:58 +0100, Ulrich Windl wrote:
> Ken Gaillot <kgaillot at redhat.com> wrote on 22.01.2021 at 00:51 in
> message
> <ffb31a4b0cc5ddfd2821d7a8d63961afcdce05d1.camel at redhat.com>:
> > Hi all,
> > 
> > A recurring request we've seen from Pacemaker users is a feature
> > called "non-critical resources" in a proprietary product and
> > "independent subtrees" in the old rgmanager project.
> > 
> > An example is a large database with an occasionally used reporting
> > tool. The reporting tool is colocated or grouped with the database.
> > If the reporting tool fails enough times to meet its
> > migration-threshold, Pacemaker would traditionally move both
> > resources to another node, to be able to keep them both running.
> 
> My opinion is "beware of the bloatware": Do we really need this?
> Maybe work on a more stable foundation instead.

Yes, users have been asking for this for many years, and it's still a
common issue for people switching from other cluster software.

> Couldn't this be done with on-fail=block already? I mean: primarily
> the reporting tool should be fixed, and if it's not essential, it
> seems OK that it won't start automatically after failure.

No, that would prevent the database from moving if the report failed.
The database should still be free to move for its own reasons.

> Also one may ask: If it's not essential, why does it run in a
> cluster?

In this example, to ensure it's colocated with the important resource.

Pacemaker does provide a number of features that are useful even
without clustering: monitoring and recovery attempts, complex ordering
relationships, standby/maintenance modes, rule-based behavior, etc.

The most common uses will probably be a lot like the example, with a
larger group, e.g. volume group -> filesystem -> database -> web server
-> not-so-important intranet tool. The user wants the
ordering/colocation relationships (and some attempts at recovery) but
doesn't want the less important thing to make everything else move if
it fails a bunch of times.
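
As a rough sketch of that kind of group (not a tested configuration):
the resource IDs are made up, ocf:pacemaker:Dummy is only a stand-in
for the real agents, and the earlier group members are elided. Only the
last member carries the critical=false meta-attribute described in the
announcement quoted further down.

```xml
<group id="web-stack">
  <!-- volume group, filesystem and web server primitives omitted -->
  <primitive id="database" class="ocf" provider="pacemaker" type="Dummy"/>
  <primitive id="intranet-tool" class="ocf" provider="pacemaker" type="Dummy">
    <meta_attributes id="intranet-tool-meta">
      <!-- hypothetical IDs; critical=false keeps this member's
           failures from moving the rest of the group -->
      <nvpair id="intranet-tool-meta-critical"
              name="critical" value="false"/>
    </meta_attributes>
  </primitive>
</group>
```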

> Another alternative could be: Make the cluster define a cron job
> that starts the reporting tool if it has crashed. The cron job would
> follow the database. (Actually, I implemented a similar thing.)

Sure, but that loses other benefits like maintenance mode, and the
simplicity of one place to manage things.

> > However, the database may be essential, and take a long time to
> > stop and start, whereas the reporting tool may not be that
> > important. So, the user would rather stop the reporting tool in the
> > failure scenario than cause a database outage to move both.
> > 
> > With the upcoming Pacemaker 2.1.0, this can be controlled with two
> > new options.
> > 
> > Colocation constraints may take a new "influence" option that
> > determines whether the dependent resource influences the location
> > of the main resource, if the main resource is already active. The
> > default of true preserves the previous behavior. Setting it to
> > false makes the dependent resource stop rather than move the main
> > resource.
> > 
> > Resources may take a new "critical" meta-attribute that serves as a
> > default for "influence" in all colocation constraints involving the
> > resource as the dependent, as well as all groups involving the
> > resource.
> > 
> > In our above example, either the colocation constraint could be
> > marked with influence=false, or the reporting tool resource could
> > be given the meta-attribute critical=false, to achieve the desired
> > effect.
> 
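
For the constraint-based variant, a minimal CIB sketch could look like
the following. The constraint ID and the resource names "reporting" and
"database" are hypothetical; "rsc" is the dependent resource and
"with-rsc" is the main one.

```xml
<constraints>
  <!-- hypothetical IDs; with influence=false, a failing "reporting"
       is stopped rather than pulling an active "database" elsewhere -->
  <rsc_colocation id="reporting-with-database"
                  rsc="reporting" with-rsc="database"
                  score="INFINITY" influence="false"/>
</constraints>
```
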
> I wonder: How would the cluster behave if the colocation score is
> zero?

Colocations with a score of 0 are ignored. (This has been consistent
only as of a few releases ago; before that, they were ignored in some
respects and considered in others, which made their effect difficult to
predict.)

> 
> Regards,
> Ulrich
> 
> > 
> > A big list of all changes for 2.1.0 can be found at:
> > 
> >  https://wiki.clusterlabs.org/wiki/Pacemaker_2.1_Changes 

-- 
Ken Gaillot <kgaillot at redhat.com>