[ClusterLabs] Antw: [EXT] Failed fencing monitor process (fence_vmware_soap) RHEL 8

Howard hmoneta at gmail.com
Thu Jun 18 18:13:32 EDT 2020


Thanks for all the help so far.  With your assistance, I'm very close to
a stable cluster.

Made the following changes to the vmfence stonith resource:

Meta Attrs: failure-timeout=30m migration-threshold=10
  Operations: monitor interval=60s (vmfence-monitor-interval-60s)

If I understand this correctly, the cluster will check whether the fencing
device is online every 60 seconds. It will tolerate up to 10 monitor
failures on a node and then mark that node ineligible to run the resource.
After 30 minutes the failures expire and it will start trying again.
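
To check my own reading of those two meta attributes, here is a minimal
Python sketch of the bookkeeping as I understand it. It is a toy model, not
Pacemaker code: the FailureTracker class and its method names are made up
for illustration, and it assumes the 30-minute window is measured from the
most recent failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Toy model of per-node failure accounting (hypothetical, not Pacemaker)."""
    migration_threshold: int = 10          # migration-threshold=10
    failure_timeout_s: int = 30 * 60       # failure-timeout=30m
    fail_count: int = 0
    last_failure_s: Optional[int] = None

    def _expire(self, now_s: int) -> None:
        # Assumption: failures are forgotten once failure-timeout has
        # elapsed since the most recent failure.
        if (self.last_failure_s is not None
                and now_s - self.last_failure_s >= self.failure_timeout_s):
            self.fail_count = 0
            self.last_failure_s = None

    def record_failure(self, now_s: int) -> None:
        # Called when a monitor run fails at time now_s (in seconds).
        self._expire(now_s)
        self.fail_count += 1
        self.last_failure_s = now_s

    def node_eligible(self, now_s: int) -> bool:
        # The node may run the resource while the fail count is below
        # the threshold.
        self._expire(now_s)
        return self.fail_count < self.migration_threshold


if __name__ == "__main__":
    tracker = FailureTracker()
    # Ten consecutive failed monitors, one every 60 seconds.
    for i in range(10):
        tracker.record_failure(i * 60)
    last = 9 * 60
    print("eligible right after 10th failure:", tracker.node_eligible(last))
    print("eligible 30 minutes later:", tracker.node_eligible(last + 30 * 60))
```

In this model the node stays ineligible until 30 minutes after the tenth
failure, at which point the count resets and monitoring can be tried there
again.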

On Thu, Jun 18, 2020 at 12:29 PM Ken Gaillot <kgaillot at redhat.com> wrote:

> On Thu, 2020-06-18 at 21:32 +0300, Andrei Borzenkov wrote:
> > 18.06.2020 18:24, Ken Gaillot wrote:
> > > Note that a failed start of a stonith device will not prevent the
> > > cluster from using that device for fencing. It just prevents the
> > > cluster from monitoring the device.
> > >
> >
> > My understanding is that if stonith resource cannot run anywhere, it
> > also won't be used for stonith. When failcount exceeds threshold,
> > resource is banned from node. If it happens on all nodes, resource
> > cannot run anywhere and so won't be used for stonith. Start failure
> > automatically sets failcount to INFINITY.
> >
> > Or do I misunderstand something?
>
> I had to test to confirm, but a stonith resource stopped due to
> failures can indeed be used. Only stonith resources stopped via
> location constraints (bans) or target-role=Stopped are prevented from
> being used.
> --
> Ken Gaillot <kgaillot at redhat.com>