[ClusterLabs] One Failed Resource = Failover the Cluster?

Mon Jun 7 15:49:45 EDT 2021

> -----Original Message-----
> From: kgaillot at redhat.com <kgaillot at redhat.com>
> Sent: Monday, June 7, 2021 2:39 PM
> To: Strahil Nikolov <hunter86_bg at yahoo.com>; Cluster Labs - All topics
> related to open-source clustering welcomed <users at clusterlabs.org>; Eric
> Robinson <eric.robinson at psmnv.com>
> Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
>
> On Sun, 2021-06-06 at 08:26 +0000, Strahil Nikolov wrote:
> > Based on the constraint rules you have mentioned, failure of mysql
> > should not cause a failover to another node. For better insight, you
> > have to be able to reproduce the issue and share the logs with the
> > community.
>
> By default, dependent resources in a colocation will affect the placement of
> the resources they depend on.
>
> In this case, if one of the mysql instances fails and meets its migration
> threshold, all of the resources will move to another node, to maximize the
> chance of all of them being able to run.
>

Which is what I don't want to happen. I only want the cluster to fail over if one of the lower dependencies fails (drbd or filesystem). If one of the MySQL instances fails, I do not want the cluster to move everything for the sake of that one resource. That's like a teacher relocating all the students in the classroom to a new classroom because one of them lost his pencil.
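
A toy model of the two behaviors under discussion, sketched in Python. This is not Pacemaker's scheduler and makes no claim about how it scores nodes; it only encodes the rule described in this thread for a two-node cluster. The names Colocation, plan, constraints, node1 and node2 are hypothetical, chosen for illustration.

```python
# Toy model only: NOT Pacemaker's scheduler. It encodes just the rule
# described in this thread, for two nodes and "X with Y" colocations.
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Colocation:
    dependent: str          # resource that must run where `primary` runs
    primary: str
    influence: bool = True  # default: dependent affects primary's placement


def plan(colocations: List[Colocation], failed: str,
         current: str, standby: str) -> Dict[str, Optional[str]]:
    """Placement after `failed` reaches its migration-threshold on `current`.

    A value of None means the resource is left stopped.
    """
    primary_of = {c.dependent: c for c in colocations}
    resources = set(primary_of) | {c.primary for c in colocations}

    # Follow the chain failed -> its primary -> ... ; the failure drags the
    # whole set to the standby node only if every link has influence enabled.
    propagates, res = True, failed
    while res in primary_of:
        link = primary_of[res]
        if not link.influence:
            propagates = False
            break
        res = link.primary
    if propagates:
        return {r: standby for r in resources}

    def depends_on_failed(r: str) -> bool:
        while r != failed and r in primary_of:
            r = primary_of[r].primary
        return r == failed

    # Otherwise only the failed resource (and anything colocated with it)
    # stops; everything else stays where it is.
    return {r: None if depends_on_failed(r) else current for r in resources}


def constraints(mysql_influence: bool) -> List[Colocation]:
    """The colocation constraints from the original question."""
    cols = [Colocation("filesystem", "drbd"), Colocation("vip", "filesystem")]
    cols += [Colocation(f"mysql_0{i}", "filesystem", mysql_influence)
             for i in (1, 2, 3)]
    return cols
```

With constraints(True), a failure of mysql_02 sends every resource to the standby node, which is the default described above. With constraints(False), the same failure leaves mysql_02 stopped and everything else in place, while a failure of filesystem still moves the whole set.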

> >
> > Best Regards,
> > Strahil Nikolov
> >
> > > On Sat, Jun 5, 2021 at 23:33, Eric Robinson
> > > <eric.robinson at psmnv.com> wrote:
> > > > -----Original Message-----
> > > > From: Users <users-bounces at clusterlabs.org> On Behalf Of
> > > > kgaillot at redhat.com
> > > > Sent: Friday, June 4, 2021 4:49 PM
> > > > To: Cluster Labs - All topics related to open-source clustering welcomed
> > > > <users at clusterlabs.org>
> > > > Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
> > > >
> > > > On Fri, 2021-06-04 at 19:10 +0000, Eric Robinson wrote:
> > > > > Sometimes it seems like Pacemaker fails over an entire cluster when
> > > > > only one resource has failed, even though no other resources are
> > > > > dependent on it. Is that expected behavior?
> > > > >
> > > > > For example, suppose I have the following colocation constraints…
> > > > >
> > > > > filesystem with drbd master
> > > > > vip with filesystem
> > > > > mysql_01 with filesystem
> > > > > mysql_02 with filesystem
> > > > > mysql_03 with filesystem
> > > >
> > > > By default, a resource that is colocated with another resource will
> > > > influence that resource's location. This ensures that as many
> > > > resources are active as possible.
> > > >
> > > > So, if any one of the above resources fails and meets its
> > > > migration-threshold, all of the resources will move to another node
> > > > so a recovery attempt can be made for the failed resource.
> > > >
> > > > No resource will be *stopped* due to the failed resource unless it
> > > > depends on it.
> > > >
> > >
> > > Thanks, but I'm confused by your previous two paragraphs. On one
> > > hand, "if any one of the above resources fails and meets its
> > > migration-threshold, all of the resources will move to another
> > > node." Obviously moving resources requires stopping them. But then,
> > > "No resource will be *stopped* due to the failed resource unless it
> > > depends on it." Those two statements seem contradictory to me. Not
> > > trying to be argumentative. Just trying to understand.
> > >
> > > > As of the forthcoming 2.1.0 release, the new "influence" option for
> > > > colocation constraints (and "critical" resource meta-attribute)
> > > > controls whether this effect occurs. If influence is turned off (or
> > > > the resource made non-critical), then the failed resource will just
> > > > stop, and the other resources won't move to try to save it.
> > > >
> > >
> > > That sounds like the feature I'm waiting for. In the example
> > > configuration I provided, I would not want the failure of any mysql
> > > instance to cause cluster failover. I would only want the cluster to
> > > fail over if the filesystem or drbd resources failed. Basically, if a
> > > resource breaks or fails to stop, I don't want the whole cluster to
> > > fail over if nothing depends on that resource. Just let it stay down
> > > until someone can manually intervene. But if an underlying resource
> > > fails that everything else is dependent on (drbd or filesystem) then
> > > go ahead and fail over the cluster.
> > >
> > > > >
> > > > > …and the following order constraints…
> > > > >
> > > > > promote drbd, then start filesystem
> > > > > start filesystem, then start vip
> > > > > start filesystem, then start mysql_01
> > > > > start filesystem, then start mysql_02
> > > > > start filesystem, then start mysql_03
> > > > >
> > > > > Now, if something goes wrong with mysql_02, will Pacemaker try to
> > > > > fail over the whole cluster? And if mysql_02 can’t be run on either
> > > > > node, then does Pacemaker refuse to run any resources?
> > > > >
> > > > > I’m asking because I’ve seen some odd behavior like that over the
> > > > > years. Could be my own configuration mistakes, of course.
> > > > >
> > > > > -Eric
> > > > --
> > > > Ken Gaillot <kgaillot at redhat.com>
> > > >
> > > > _______________________________________________
> > > > Manage your subscription:
> > > > https://lists.clusterlabs.org/mailman/listinfo/users
> > > >
> > > > ClusterLabs home: https://www.clusterlabs.org/
> > > Disclaimer : This email and any files transmitted with it are
> > > confidential and intended solely for intended recipients. If you are
> > > not the named addressee you should not disseminate, distribute, copy
> > > or alter this email. Any views or opinions presented in this email
> > > are solely those of the author and might not represent those of
> > > Physician Select Management. Warning: Although Physician Select
> > > Management has taken reasonable precautions to ensure no viruses are
> > > present in this email, the company cannot accept responsibility for
> > > any loss or damage arising from the use of this email or
> > > attachments.
> --
> Ken Gaillot <kgaillot at redhat.com>
