[ClusterLabs] Antw: [EXT] Failed fencing monitor process (fence_vmware_soap) RHEL 8

Klaus Wenninger kwenning at redhat.com
Fri Jun 19 06:23:08 EDT 2020


On 6/19/20 12:13 AM, Howard wrote:
> Thanks for all the help so far.  With your assistance, I'm very close
> to stable.
>
> Made the following changes to the vmfence stonith resource:
>   
> Meta Attrs: failure-timeout=30m migration-threshold=10
>   Operations: monitor interval=60s (vmfence-monitor-interval-60s)
>
> If I understand this correctly, it will check if the fencing device is
> online every 60 seconds. It will try 10 times and then mark the node
> ineligible.  After 30 minutes it will start trying again.
>
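As a rough way to picture how those two meta attributes interact, here is a toy model in Python. It is only a sketch under simplifying assumptions, not Pacemaker code: `FailureTracker` and its method names are made up for illustration, time is passed in explicitly as seconds, and expiry is modelled as immediate, whereas the real cluster only notices an expired failure the next time the scheduler runs (for example at the cluster recheck interval).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Toy per-node failure accounting for one resource (hypothetical names)."""
    migration_threshold: int = 10       # failures before the node becomes ineligible
    failure_timeout: float = 30 * 60.0  # seconds since the last failure until expiry
    fail_count: int = 0
    last_failure: Optional[float] = None

    def _expire(self, now: float) -> None:
        # Forget old failures once failure_timeout has passed since the last one.
        if (self.last_failure is not None
                and now - self.last_failure >= self.failure_timeout):
            self.fail_count = 0
            self.last_failure = None

    def record_failure(self, now: float) -> None:
        # A failed operation (e.g. the 60s monitor) bumps the fail count.
        self._expire(now)
        self.fail_count += 1
        self.last_failure = now

    def node_eligible(self, now: float) -> bool:
        # The node may run the resource while the count is below the threshold.
        self._expire(now)
        return self.fail_count < self.migration_threshold


tracker = FailureTracker()
for i in range(10):
    tracker.record_failure(now=i * 60.0)  # one failed monitor per minute

# Ten failures reached the threshold, so the node is ineligible.
banned_at_10min = not tracker.node_eligible(now=600.0)
# 30 minutes after the last failure (at 540s) the failures have expired.
eligible_after_timeout = tracker.node_eligible(now=540.0 + 1800.0)
```

Note that the timeout in this sketch counts from the most recent failure, so a device that keeps failing never gets its count cleared.
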
> On Thu, Jun 18, 2020 at 12:29 PM Ken Gaillot <kgaillot at redhat.com> wrote:
>
>     On Thu, 2020-06-18 at 21:32 +0300, Andrei Borzenkov wrote:
>     > 18.06.2020 18:24, Ken Gaillot wrote:
>     > > Note that a failed start of a stonith device will not prevent the
>     > > cluster from using that device for fencing. It just prevents the
>     > > cluster from monitoring the device.
>     > >
>     >
>     > My understanding is that if stonith resource cannot run anywhere, it
>     > also won't be used for stonith. When failcount exceeds threshold,
>     > resource is banned from node. If it happens on all nodes, resource
>     > cannot run anywhere and so won't be used for stonith. Start failure
>     > automatically sets failcount to INFINITY.
>     >
>     > Or do I misunderstand something?
>
>     I had to test to confirm, but a stonith resource stopped due to
>     failures can indeed be used. Only stonith resources stopped via
>     location constraints (bans) or target-role=Stopped are prevented from
>     being used.
>
Unfortunately this can be a bit tricky to test: fenced updates its
device list on configuration changes, but scores also influence
whether a device is taken into that list.
So the result may also depend on when the device list was most
recently updated.
I don't know if this is relevant for this config, but unfortunately it
is something to keep in the back of one's mind for more complex fencing
setups.
This ugliness has been known for a long time, but there is no easy way
to solve it without losing part of the independence, and with that
the robustness, of the fencing subsystem.
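
To make the rule Ken describes above concrete, here is a minimal sketch in Python. It is an illustration only, not how fenced is implemented: `StopReason` and `usable_for_fencing` are hypothetical names, and the sketch deliberately ignores the timing issue described above, i.e. that the outcome can also depend on when fenced last rebuilt its device list.

```python
from enum import Enum, auto


class StopReason(Enum):
    """Why a stonith resource is not running (hypothetical labels)."""
    NOT_STOPPED = auto()          # resource is running normally
    FAILURES = auto()             # stopped because of failed operations
    LOCATION_BAN = auto()         # stopped via location constraints (bans)
    TARGET_ROLE_STOPPED = auto()  # stopped via target-role=Stopped


def usable_for_fencing(reason: StopReason) -> bool:
    # Per Ken's test: only bans and target-role=Stopped prevent use of the
    # device; a device stopped due to failures can still be used.
    # Assumption: the device list is current, which may not hold in practice.
    blocked = {StopReason.LOCATION_BAN, StopReason.TARGET_ROLE_STOPPED}
    return reason not in blocked


summary = {reason.name: usable_for_fencing(reason) for reason in StopReason}
```
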

Klaus
>
>     -- 
>     Ken Gaillot <kgaillot at redhat.com>
>
>     _______________________________________________
>     Manage your subscription:
>     https://lists.clusterlabs.org/mailman/listinfo/users
>
>     ClusterLabs home: https://www.clusterlabs.org/
>
