[ClusterLabs] fence_virt architecture? (was: Re: Still Beginner STONITH Problem)

Klaus Wenninger kwenning at redhat.com
Mon Jul 20 04:45:40 EDT 2020


On 7/20/20 10:34 AM, Andrei Borzenkov wrote:
>
>
> On Mon, Jul 20, 2020 at 10:56 AM Klaus Wenninger
> <kwenning at redhat.com> wrote:
>
>     On 7/19/20 8:55 AM, Strahil Nikolov wrote:
>     > My understanding is that fence_xvm is reaching each
>     hypervisor via multicast (otherwise why multicast?) ... yet I
>     could be simply fooling myself.
>     Not as far as I know. Otherwise external/libvirt would be able to
>     do the trick as well.
>     AFAIK it should be able to cope with multiple hosts listening on
>     the same multicast address.
>
>
> No, it cannot. fence_virtd connects back to the fence_xvm (or fence_virt)
> client (using its source address), and the actual fencing request is then
> sent over this point-to-point channel. fence_xvm accepts only the first
> connection request. So even if multiple fence_virtd instances receive the
> initial multicast, only one of them will be able to connect to the
> fence_xvm that sent it.
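
For reference, that handshake can be exercised by hand from a guest with the
fence_xvm client. A minimal sketch, assuming the stock defaults (multicast
address 225.0.0.12, port 1229, key in /etc/cluster/fence_xvm.key) and with
"guest1" being only a placeholder domain name:

    # Sent via multicast; whichever fence_virtd wins the reverse TCP
    # connection is the one that answers.
    fence_xvm -a 225.0.0.12 -p 1229 -k /etc/cluster/fence_xvm.key -o list

    # Query the state of a specific domain:
    fence_xvm -k /etc/cluster/fence_xvm.key -H guest1 -o status
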
>
> fence_virtd itself is a dumb middleman. It accepts messages from
> fence_xvm/fence_virt via the configured listener, connects back,
> receives the actual fencing request and forwards it to the configured
> backend plugin. It performs no check itself on whether the initial
> message is applicable to it; only the backend plugin knows which VMs
> it can handle. And even if it did, because that information is received
> only after the reverse connection is established, it would already be
> too late anyway.
> So if you are using multicast and your backend can only handle VMs on
> the local host, it is unpredictable which host will make the reverse
> connection. Which is why the RH KB article describes a configuration
> with a unique multicast address for each host. At which point you could
> simply use a unicast address and spare yourself all the trouble
> associated with multicast.
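
The cluster-side half of that per-host-address setup looks roughly like the
sketch below (pcs syntax). The multicast addresses, the key path and the
"vm1 runs on host1, vm2 runs on host2" layout are assumptions for
illustration, and it presumes the guests never migrate between hosts:

    # one fence_xvm stonith device per hypervisor, each pointed at that
    # host's own multicast address
    pcs stonith create fence_host1 fence_xvm \
        multicast_address=225.0.0.12 key_file=/etc/cluster/fence_xvm.key \
        pcmk_host_list=vm1
    pcs stonith create fence_host2 fence_xvm \
        multicast_address=225.0.0.13 key_file=/etc/cluster/fence_xvm.key \
        pcmk_host_list=vm2
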
>
> I do not see any use case for multicast. Either the backend can fence
> every VM (i.e. the backend is capable of contacting the host that runs
> the VM to be fenced), in which case you can simply contact the local
> fence_virtd via e.g. the vsock listener, or the backend can handle only
> local VMs (the libvirt backend), in which case you must contact the host
> where the VM is running via a unique address.
>  
>
>     But there were at least issues with that (I don't remember the
>     details, but I do remember a discussion on some mailing list), which
>     is why I had suggested, as Reid just repeated, quite at the beginning
>     of this thread to go with a setup that has 2 fencing resources and
>     2 fence_virtd services listening on different multicast
>     addresses ...
>
>
> At which point you can just use the tcp listener with normal unicast.
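
The host-side change for that is just the listener section of
fence_virt.conf. A sketch only, with the address and key path being
assumptions and fence_virt.conf(5) being the authoritative reference for
the exact attribute names:

    # /etc/fence_virt.conf on the hypervisor
    fence_virtd {
        listener = "tcp";
        backend = "libvirt";
    }

    listeners {
        tcp {
            address = "192.168.122.1";  # host address reachable by the guests
            port = "1229";
            key_file = "/etc/cluster/fence_xvm.key";
        }
    }

    backends {
        libvirt {
            uri = "qemu:///system";
        }
    }
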
>  
>
>     The cpg-configuration sounds interesting as well. Haven't used
>     it or looked into the details. Would be interested to hear about
>     how that works.
>
>
> It maintains a registry of VM locations (each fence_virtd polls the
> local hypervisor at regular intervals) and forwards fencing requests to
> the appropriate host via the corosync interconnect. It is also the only
> backend that can handle host failure: if it is known that a host has
> left the cluster, any VM on that host is considered fenced by definition.
>
> It requires that the hosts are configured in a pacemaker cluster
> themselves (to handle a host outage, the host must be properly fenced).
That definitely sounds interesting.
Are you saying that the hosts have to be pacemaker-nodes as well?
Otherwise we might be able to just add them to corosync and configure
them not to vote on quorum ...
... the same knet might then even be used to connect the bridges
on the hosts with each other on layer-2 ...
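
For reference, the host-side piece of such a cpg setup would be roughly the
following. A sketch only, untested; the exact backend options belong to
fence_virt.conf(5), and the corosync/pacemaker membership of the hosts is
configured separately:

    # /etc/fence_virt.conf on every hypervisor; all hosts also run
    # corosync (and pacemaker) so the cpg backend can track where each
    # VM lives and treat VMs on a fenced host as fenced.
    fence_virtd {
        listener = "multicast";
        backend = "cpg";
    }
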
>
>  
>
>     >
>     > If the VMs are behind NAT, I think that the simplest way to
>     STONITH is to use SBD over iSCSI.
>     > Yet, my KVM knowledge is limited and I didn't see any proof
>     that I'm right (libvirt network was in NAT mode) or wrong (VMs
>     using the host's bond in a bridged network).
>     >
>     > Best Regards,
>     > Strahil Nikolov
>     >
>     > On 19 July 2020 at 9:45:29 GMT+03:00, Andrei Borzenkov
>     <arvidjaar at gmail.com> wrote:
>     >> 18.07.2020 03:36, Reid Wahl wrote:
>     >>> I'm not sure that the libvirt backend is intended to be used
>     >>> in this way, with multiple hosts using the same multicast
>     >>> address. From the fence_virt.conf man page:
>     >>>
>     >>> ~~~
>     >>> BACKENDS
>     >>>    libvirt
>     >>>        The libvirt plugin is the simplest plugin. It is used in
>     >>>        environments where routing fencing requests between
>     >>>        multiple hosts is not required, for example by a user
>     >>>        running a cluster of virtual machines on a single
>     >>>        desktop computer.
>     >>>    libvirt-qmf
>     >>>        The libvirt-qmf plugin acts as a QMFv2 Console to the
>     >>>        libvirt-qmf daemon in order to route fencing requests
>     >>>        over AMQP to the appropriate computer.
>     >>>    cpg
>     >>>        The cpg plugin uses corosync CPG and libvirt to track
>     >>>        virtual machines and route fencing requests to the
>     >>>        appropriate computer.
>     >>> ~~~
>     >>>
>     >>> I'm not an expert on fence_xvm or libvirt. It's possible that
>     >>> this is a viable configuration with the libvirt backend.
>     >>>
>     >>> However, when users want to configure fence_xvm for multiple
>     >>> hosts with the libvirt backend, I have typically seen them
>     >>> configure multiple fence_xvm devices (one per host) and
>     >>> configure a different multicast address on each host.
>     >>>
>     >>> If you have a Red Hat account, see also:
>     >>>   - https://access.redhat.com/solutions/2386421
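
The host side of that per-host approach is the matching fence_virt.conf,
one per hypervisor, each with its own listener address. A sketch with
assumed values (bridge name, key path); only the address differs from host
to host:

    # /etc/fence_virt.conf on host1; host2 would use e.g. 225.0.0.13
    fence_virtd {
        listener = "multicast";
        backend = "libvirt";
    }

    listeners {
        multicast {
            address = "225.0.0.12";
            port = "1229";
            interface = "virbr0";   # assumption: bridge the guests sit on
            key_file = "/etc/cluster/fence_xvm.key";
        }
    }

    backends {
        libvirt {
            uri = "qemu:///system";
        }
    }
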
>     >> What's the point in using the multicast listener if every host
>     >> will have a unique multicast address and there will be a separate
>     >> stonith agent for each host using this unique address? That's not
>     >> what everyone expects when seeing "multicast" as the communication
>     >> protocol.
>     >>
>     >> This is a serious question. If the intention is to avoid TCP
>     >> overhead, why not simply use UDP with a unique address? Or is a
>     >> single multicast address still possible, and does this article
>     >> describe "what I once set up and it worked for me" rather than
>     >> "how it is designed to work"?
>     >>
>     >> Also, what is not clear: which fence_virtd instance on which host
>     >> will be contacted by the stonith agent on a cluster node? I.e.
>     >> consider
>     >>
>     >> three hosts host1, host2, host3
>     >> three VMs vm1, vm2, vm3, each active on the corresponding host
>     >>
>     >> vm1 on host1 wants to fence vm3 on host3. Will it
>     >> a) contact fence_virtd on host1, and fence_virtd on host1 will
>     >> forward the request to host3? Or
>     >> b) is it mandatory for vm1 to have connectivity to fence_virtd on
>     >> host3?
>     >>
>     >> If we combine the existence of local-only listeners (like serial
>     >> or vsock) and a distributed backend (like cpg), it strongly
>     >> suggests that vm1 -(listener)-> host1 -(backend)-> host3
>     >> -(fence)-> vm3 is possible.
>     >>
>     >> If each cluster node always directly contacts fence_virtd on the
>     >> *target* host, then the libvirt backend is still perfectly usable
>     >> for a multi-host configuration, as every fence_virtd will only
>     >> ever fence local VMs.
>     >>
>     >> Is there any high-level architecture overview (maybe a
>     >> presentation from some conference)?
