[ClusterLabs] Still Beginner STONITH Problem
Stefan Schmitz
stefan.schmitz at farmpartner-tec.com
Mon Jul 20 07:51:37 EDT 2020
On 20.07.2020 at 13:36, Klaus Wenninger wrote:
> On 7/20/20 1:10 PM, Stefan Schmitz wrote:
>> Hello,
>>
>> thank you all very much for your help so far!
>>
>> We have now managed to capture the multicast traffic originating from
>> one host when issuing the command "fence_xvm -o list" on the other
>> host. Now the tcpdump at least looks exactly the same on all 4
>> servers, hosts and guests. I cannot tell how and why this just started
>> working, but I got our datacenter techs' final report this morning
>> that there are no problems present.
>>
>>
>>
>> On 19.07.2020 at 09:32, Andrei Borzenkov wrote:
>>> external/libvirt is unrelated to fence_xvm
>>
>> Could you please explain that a bit more? Do you mean that the current
>> problem of the dysfunctional Stonith/fencing is unrelated to libvirt?
> Hadn't spotted that ... sry
> What he meant is that if you are using the fence_virtd service on
> the host(s), then the matching fencing resource is based
> on fence_xvm and not external/libvirt.
> The libvirt side is handled by the daemon running on your host.
>>
>>> fence_xvm opens TCP listening socket, sends request and waits for
>>> connection to this socket (from fence_virtd) which is used to submit
>>> actual fencing operation. Only the first connection request is handled.
>>> So first host that responds will be processed. Local host is likely
>>> always faster to respond than remote host.
>>
>> Thank you for the explanation, I get that. But what would you suggest
>> to remedy this situation? We have been using libvirt and fence_xvm
>> because of the clusterlabs wiki articles and the suggestions in this
>> mailing list. Is there anything you suggest we need to change to make
>> this Cluster finally work?
> I guess what he meant, what I've already suggested before,
> and what is also described in the linked article, is having
> totally separate configurations for each host. Whether you use
> different multicast addresses or unicast - as Andrei is suggesting,
> and which I haven't used before - probably doesn't matter.
> (Unless of course something is really blocking multicast ...)
> And you have to set up one fencing resource per host
> (fence_xvm) that has the address configured that you've set up
> on each of the hosts.
Thank you for the explanation. Sadly, I cannot access the articles. I
take it "totally separate configurations" means having a stonith
resource configured in the cluster for each host. So for now I will
delete the current resource and try to configure two new ones.
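
If I understand that correctly, it should end up looking roughly like
this (only a sketch, untested; I am assuming kvm101 is the guest on the
host whose daemon listens on 225.0.0.12 and kvm102 the one on the host
listening on 225.0.0.13, and that pcs is used - the equivalent crm
configure primitive should work just as well):

# one stonith resource per host, pointing at that host's multicast address
pcs stonith create fence_kvm101 fence_xvm \
        multicast_address=225.0.0.12 port=kvm101 \
        key_file=/etc/cluster/fence_xvm.key pcmk_host_list=kvm101
pcs stonith create fence_kvm102 fence_xvm \
        multicast_address=225.0.0.13 port=kvm102 \
        key_file=/etc/cluster/fence_xvm.key pcmk_host_list=kvm102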
>>
>>
>> On 18.07.2020 at 02:36, Reid Wahl wrote:
>>> However, when users want to configure fence_xvm for multiple hosts
>> with the libvirt backend, I have typically seen them configure
>> multiple fence_xvm devices (one per host) and configure a different
>> multicast address on each host.
>>
>> I do have a Red Hat account but not a paid subscription, which sadly
>> is needed to access the articles you have linked.
>>
>> We have installed fence_virt on both hosts since the beginning, if
>> that is what you mean by "multiple fence_xvm devices (one per host)".
>> They were, however, both configured to use the same multicast IP
>> address, which we have now changed so that each host's fence_xvm
>> installation uses a different multicast IP. Sadly this does not seem
>> to change anything in the behaviour.
>> What is interesting, though, is that I ran fence_virtd -c again and
>> changed the multicast IP to 225.0.0.13 (from .12). I killed and
>> restarted the daemon multiple times after that.
>> When I now run "fence_xvm -o list" without specifying an IP address,
>> tcpdump on the other host still shows the old IP as the originating one.
>> tcpdump on the other host:
>> Host4.54001 > 225.0.0.12.zented: [udp sum ok] UDP, length 176
>> Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr
>> 225.0.0.12 to_in { }]
>> Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr
>> 225.0.0.12 to_in { }]
>>
>> Only when I specify the other IP does it apparently really get used:
>> # fence_xvm -a 225.0.0.13 -o list
>> tcpdump on the other host:
>> Host4.46011 > 225.0.0.13.zented: [udp sum ok] UDP, length 176
>> Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr
>> 225.0.0.13 to_in { }]
>> Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr
>> 225.0.0.13 to_in { }]
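>>
>> If I read the fence_xvm man page correctly, it falls back to its
>> built-in default multicast address (225.0.0.12) whenever -a is not
>> given, so the new address apparently has to be passed explicitly on
>> the command line or configured in the fencing resource, e.g. (sketch):
>>
>> # fence_xvm -a 225.0.0.13 -o list        (query the reconfigured daemon)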
>>
>>
>>
>>
>> On 17.07.2020 at 16:49, Strahil Nikolov wrote:
>>> The simplest way to check if the libvirt's network is NAT (or not)
>> is to try to ssh from the first VM to the second one.
>> That does work without any issue. I can ssh to any server in our
>> network, host or guest, without a problem. Does that mean there is no
>> natting involved?
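>>
>> As an additional check I would look at the network definition itself;
>> as far as I understand, a bare <forward/> element defaults to
>> mode="nat", so something like this (just a sketch) should show it:
>>
>> # virsh net-dumpxml default | grep forward
>>   <forward mode='nat'>
>>
>> But since the guests are attached to br0 rather than virbr0, that
>> libvirt default network may not even be in use here.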
>>
>>
>>
>> On 17.07.2020 at 16:41, Klaus Wenninger wrote:
>>> What does the VM part of your network config look like?
>> # cat ifcfg-br0
>> DEVICE=br0
>> TYPE=Bridge
>> BOOTPROTO=static
>> ONBOOT=yes
>> IPADDR=192.168.1.13
>> NETMASK=255.255.0.0
>> GATEWAY=192.168.1.1
>> NM_CONTROLLED=no
>> IPV6_AUTOCONF=yes
>> IPV6_DEFROUTE=yes
>> IPV6_PEERDNS=yes
>> IPV6_PEERROUTES=yes
>> IPV6_FAILURE_FATAL=no
>>
>>
>>>> I am at a loss and do not know why this is NAT. I am aware of what
>>>> NAT means, but what am I supposed to reconfigure here to solve the
>>>> problem?
>>> As long as you stay within the subnet you are running on your bridge
>>> you won't get natted, but once it starts to route via the host, the
>>> libvirt default bridge will be natted.
>>> What you can do is connect the bridges on your 2 hosts via layer 2.
>>> Possible ways should be OpenVPN, knet, VLAN on your switches ...
>>> (and yes - a cable)
>>> If your guests are using DHCP you should probably configure
>>> fixed IPs for those MACs.
>> All our servers have fixed IPs; DHCP is not used anywhere in our
>> network for dynamic IP assignment.
>> Regarding the "check if VMs are natted" question: is this settled by
>> the ssh test suggested by Strahil Nikolov? Can I assume NAT is not a
>> problem here, or do we still have to take measures?
>>
>>
>>
>> kind regards
>> Stefan Schmitz
>>
>>
>>
>>
>>
>>
>>
>> On 18.07.2020 at 02:36, Reid Wahl wrote:
>>> I'm not sure that the libvirt backend is intended to be used in this
>>> way, with multiple hosts using the same multicast address. From the
>>> fence_virt.conf man page:
>>>
>>> ~~~
>>> BACKENDS
>>> libvirt
>>> The libvirt plugin is the simplest plugin. It is used in
>>> environments where routing fencing requests between multiple hosts is
>>> not required, for example by a user running a cluster of virtual
>>> machines on a single desktop computer.
>>> libvirt-qmf
>>> The libvirt-qmf plugin acts as a QMFv2 Console to the
>>> libvirt-qmf daemon in order to route fencing requests over AMQP to the
>>> appropriate computer.
>>> cpg
>>> The cpg plugin uses corosync CPG and libvirt to track virtual
>>> machines and route fencing requests to the appropriate computer.
>>> ~~~
>>>
>>> I'm not an expert on fence_xvm or libvirt. It's possible that this is a
>>> viable configuration with the libvirt backend.
>>>
>>> However, when users want to configure fence_xvm for multiple hosts with
>>> the libvirt backend, I have typically seen them configure multiple
>>> fence_xvm devices (one per host) and configure a different multicast
>>> address on each host.
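>>>
>>> In that setup only the multicast listener address in
>>> /etc/fence_virt.conf typically differs between the hosts, e.g.
>>> (a sketch; the addresses are just examples):
>>>
>>> # host 1: listeners { multicast { address = "225.0.0.12"; ... } }
>>> # host 2: listeners { multicast { address = "225.0.0.13"; ... } }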
>>>
>>> If you have a Red Hat account, see also:
>>> - https://access.redhat.com/solutions/2386421#comment-1209661
>>> - https://access.redhat.com/solutions/2386421#comment-1209801
>>>
>>> On Fri, Jul 17, 2020 at 7:49 AM Strahil Nikolov
>>> <hunter86_bg at yahoo.com> wrote:
>>>
>>> The simplest way to check if the libvirt's network is NAT (or not)
>>> is to try to ssh from the first VM to the second one.
>>>
>>> I should admit that I was lost when I tried to create a routed
>>> network in KVM, so I can't help with that.
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On 17 July 2020 at 16:56:44 GMT+03:00,
>>> "stefan.schmitz at farmpartner-tec.com"
>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>> >Hello,
>>> >
>>> >I have now managed to get "fence_xvm -a 225.0.0.12 -o list" to list
>>> >at least its local guest again. It seems fence_virtd was not working
>>> >properly anymore.
>>> >
>>> >Regarding the Network XML config
>>> >
>>> ># cat default.xml
>>> > <network>
>>> > <name>default</name>
>>> > <bridge name="virbr0"/>
>>> > <forward/>
>>> > <ip address="192.168.122.1" netmask="255.255.255.0">
>>> > <dhcp>
>>> > <range start="192.168.122.2" end="192.168.122.254"/>
>>> > </dhcp>
>>> > </ip>
>>> > </network>
>>> >
>>> >I have used "virsh net-edit default" to test other network devices
>>> >on the hosts, but this did not change anything.
>>> >
>>> >Regarding the statement
>>> >
>>> > > If it is created by libvirt - this is NAT and you will never
>>> > > receive output from the other host.
>>> >
>>> >I am at a loss and do not know why this is NAT. I am aware of what
>>> >NAT means, but what am I supposed to reconfigure here to solve the
>>> >problem?
>>> >Any help would be greatly appreciated.
>>> >Thank you in advance.
>>> >
>>> >Kind regards
>>> >Stefan Schmitz
>>> >
>>> >
>>> >On 15.07.2020 at 16:48, stefan.schmitz at farmpartner-tec.com wrote:
>>> >>
>>> >> On 15.07.2020 at 16:29, Klaus Wenninger wrote:
>>> >>> On 7/15/20 4:21 PM, stefan.schmitz at farmpartner-tec.com wrote:
>>> >>>> Hello,
>>> >>>>
>>> >>>>
>>> >>>> On 15.07.2020 at 15:30, Klaus Wenninger wrote:
>>> >>>>> On 7/15/20 3:15 PM, Strahil Nikolov wrote:
>>> >>>>>> If it is created by libvirt - this is NAT and you will
>> never
>>> >>>>>> receive output from the other host.
>>> >>>>> And twice the same subnet behind NAT is probably giving
>>> >>>>> issues at other places as well.
>>> >>>>> And if using DHCP you have to at least enforce that both
>>> >>>>> sides don't go for the same IP.
>>> >>>>> But that is still no explanation for why it doesn't work on the
>>> >>>>> same host.
>>> >>>>> Which is why I was asking for running the service on the
>>> >>>>> bridge to check if that would work at least. So that we
>>> >>>>> can go forward step by step.
>>> >>>>
>>> >>>> I just now finished trying and testing it on both hosts.
>>> >>>> I ran # fence_virtd -c on both hosts and entered different
>> network
>>> >>>> devices. On both I tried br0 and the kvm10x.0.
>>> >>> According to your libvirt-config I would have expected
>>> >>> the bridge to be virbr0.
>>> >>
>>> >> I understand that, but a "virbr0" device does not seem to exist on
>>> >> either of the two hosts.
>>> >>
>>> >> # ip link show
>>> >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
>>> >>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>> >> 2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
>>> >>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>>> >> 3: enp216s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>>> >>     link/ether ac:1f:6b:26:69:dc brd ff:ff:ff:ff:ff:ff
>>> >> 4: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
>>> >>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>>> >> 5: enp216s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>>> >>     link/ether ac:1f:6b:26:69:dd brd ff:ff:ff:ff:ff:ff
>>> >> 6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DEFAULT group default qlen 1000
>>> >>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>>> >> 7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
>>> >>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>>> >> 8: kvm101.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UNKNOWN mode DEFAULT group default qlen 1000
>>> >>     link/ether fe:16:3c:ba:10:6c brd ff:ff:ff:ff:ff:ff
>>> >>
>>> >>
>>> >>
>>> >>>>
>>> >>>> After each reconfiguration I ran "fence_xvm -a 225.0.0.12 -o list".
>>> >>>> On the second server it worked with each device. After that I
>>> >>>> reconfigured back to the normal device, bond0, on which it had
>>> >>>> not worked before - and now it worked again!
>>> >>>> # fence_xvm -a 225.0.0.12 -o list
>>> >>>> kvm102               bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>>> >>>>
>>> >>>> But not on the first server; there it did not work with any
>>> >>>> device. "fence_xvm -a 225.0.0.12 -o list" always resulted in:
>>> >>>> Timed out waiting for response
>>> >>>> Operation failed
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On 15.07.2020 at 15:15, Strahil Nikolov wrote:
>>> >>>>> If it is created by libvirt - this is NAT and you will never
>>> >>>>> receive output from the other host.
>>> >>>>>
>>> >>>> To my knowledge this is configured by libvirt. At least I am not
>>> >>>> aware of having changed or configured it in any way. Up until
>>> >>>> today I did not even know that file existed. Could you please
>>> >>>> advise on what I need to do to fix this issue?
>>> >>>>
>>> >>>> Kind regards
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>> Is pacemaker/corosync/knet btw. using the same interfaces/IPs?
>>> >>>>>
>>> >>>>> Klaus
>>> >>>>>>
>>> >>>>>> Best Regards,
>>> >>>>>> Strahil Nikolov
>>> >>>>>>
>>> >>>>>> On 15 July 2020 at 15:05:48 GMT+03:00,
>>> >>>>>> "stefan.schmitz at farmpartner-tec.com"
>>> >>>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>> >>>>>>> Hello,
>>> >>>>>>>
>>> >>>>>>> On 15.07.2020 at 13:42, Strahil Nikolov wrote:
>>> >>>>>>>> By default libvirt is using NAT and not a routed network - in
>>> >>>>>>>> such a case, vm1 won't receive data from host2.
>>> >>>>>>>> Can you provide the Networks' xml ?
>>> >>>>>>>>
>>> >>>>>>>> Best Regards,
>>> >>>>>>>> Strahil Nikolov
>>> >>>>>>>>
>>> >>>>>>> # cat default.xml
>>> >>>>>>> <network>
>>> >>>>>>> <name>default</name>
>>> >>>>>>> <bridge name="virbr0"/>
>>> >>>>>>> <forward/>
>>> >>>>>>> <ip address="192.168.122.1" netmask="255.255.255.0">
>>> >>>>>>> <dhcp>
>>> >>>>>>>              <range start="192.168.122.2" end="192.168.122.254"/>
>>> >>>>>>> </dhcp>
>>> >>>>>>> </ip>
>>> >>>>>>> </network>
>>> >>>>>>>
>>> >>>>>>> I just checked this and the file is identical on both hosts.
>>> >>>>>>>
>>> >>>>>>> kind regards
>>> >>>>>>> Stefan Schmitz
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>> On 15 July 2020 at 13:19:59 GMT+03:00, Klaus Wenninger
>>> >>>>>>>> <kwenning at redhat.com> wrote:
>>> >>>>>>>>> On 7/15/20 11:42 AM, stefan.schmitz at farmpartner-tec.com wrote:
>>> >>>>>>>>>> Hello,
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> On 15.07.2020 at 06:32, Strahil Nikolov wrote:
>>> >>>>>>>>>>> How did you configure the network on your Ubuntu 20.04
>>> >>>>>>>>>>> hosts? I tried to set up a bridged connection for the test
>>> >>>>>>>>>>> setup, but obviously I'm missing something.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Best Regards,
>>> >>>>>>>>>>> Strahil Nikolov
>>> >>>>>>>>>>>
>>> >>>>>>>>>> On the hosts (CentOS) the bridge config looks like this.
>>> >>>>>>>>>> The bridging and configuration is handled by the
>>> >>>>>>>>>> virtualization software:
>>> >>>>>>>>>>
>>> >>>>>>>>>> # cat ifcfg-br0
>>> >>>>>>>>>> DEVICE=br0
>>> >>>>>>>>>> TYPE=Bridge
>>> >>>>>>>>>> BOOTPROTO=static
>>> >>>>>>>>>> ONBOOT=yes
>>> >>>>>>>>>> IPADDR=192.168.1.21
>>> >>>>>>>>>> NETMASK=255.255.0.0
>>> >>>>>>>>>> GATEWAY=192.168.1.1
>>> >>>>>>>>>> NM_CONTROLLED=no
>>> >>>>>>>>>> IPV6_AUTOCONF=yes
>>> >>>>>>>>>> IPV6_DEFROUTE=yes
>>> >>>>>>>>>> IPV6_PEERDNS=yes
>>> >>>>>>>>>> IPV6_PEERROUTES=yes
>>> >>>>>>>>>> IPV6_FAILURE_FATAL=no
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> On 15.07.2020 at 09:50, Klaus Wenninger wrote:
>>> >>>>>>>>>>> Guess it is not easy to have your servers connected
>>> >>>>>>>>>>> physically for a try.
>>> >>>>>>>>>>> But maybe you can at least try on one host to have
>>> >>>>>>>>>>> virt_fenced & VM on the same bridge - just to see if that
>>> >>>>>>>>>>> basic pattern is working.
>>> >>>>>>>>>> I am not sure if I understand you correctly. What do you
>>> >>>>>>>>>> mean by having them on the same bridge? The bridge device is
>>> >>>>>>>>>> configured on the host by the virtualization software.
>>> >>>>>>>>> I meant to check out which bridge the interface of the VM is
>>> >>>>>>>>> enslaved to and to use that bridge as the interface in
>>> >>>>>>>>> /etc/fence_virt.conf.
>>> >>>>>>>>> Get me right - just for now - just to see if it is working
>>> >>>>>>>>> for this one host and the corresponding guest.
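>>> >>>>>>>>>
>>> >>>>>>>>> A quick way to see which bridge that is would be something
>>> >>>>>>>>> like (just a sketch; the domain name is only an example):
>>> >>>>>>>>>
>>> >>>>>>>>> # virsh domiflist kvm101     (shows the source bridge per interface)
>>> >>>>>>>>> # brctl show                 (or: ip link show master br0)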
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Well, maybe there is still somebody in the middle playing
>>> >>>>>>>>>>> IGMPv3, or the request for a certain source is needed to
>>> >>>>>>>>>>> shoot open some firewall or switch tables.
>>> >>>>>>>>>> I am still waiting for the final report from our data
>>> >>>>>>>>>> center techs. I hope that will clear up some things.
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> Additionally, I have just noticed that apparently since
>>> >>>>>>>>>> switching from IGMPv3 to IGMPv2 and back, the command
>>> >>>>>>>>>> "fence_xvm -a 225.0.0.12 -o list" is now completely broken.
>>> >>>>>>>>>> Before that switch this command at least returned the local
>>> >>>>>>>>>> VM. Now it returns:
>>> >>>>>>>>>> Timed out waiting for response
>>> >>>>>>>>>> Operation failed
>>> >>>>>>>>>>
>>> >>>>>>>>>> I am a bit confused by that, because all we did was running
>>> >>>>>>>>>> commands like "sysctl -w net.ipv4.conf.all.force_igmp_version ="
>>> >>>>>>>>>> with the different version numbers, and "cat /proc/net/igmp"
>>> >>>>>>>>>> shows that V3 is used again on every device just like
>>> >>>>>>>>>> before ...?!
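>>> >>>>>>>>>>
>>> >>>>>>>>>> For reference, what we ran was along these lines (as far as
>>> >>>>>>>>>> I understand it, 2 forces IGMPv2 and 0 restores the default
>>> >>>>>>>>>> auto behaviour):
>>> >>>>>>>>>>
>>> >>>>>>>>>> # sysctl -w net.ipv4.conf.all.force_igmp_version=2
>>> >>>>>>>>>> # sysctl -w net.ipv4.conf.default.force_igmp_version=2
>>> >>>>>>>>>> # cat /proc/net/igmp          (check which version is in use)
>>> >>>>>>>>>> # sysctl -w net.ipv4.conf.all.force_igmp_version=0
>>> >>>>>>>>>> # sysctl -w net.ipv4.conf.default.force_igmp_version=0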
>>> >>>>>>>>>>
>>> >>>>>>>>>> kind regards
>>> >>>>>>>>>> Stefan Schmitz
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>> On 14 July 2020 at 11:06:42 GMT+03:00,
>>> >>>>>>>>>>> "stefan.schmitz at farmpartner-tec.com"
>>> >>>>>>>>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>> >>>>>>>>>>>> Hello,
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On 09.07.2020 at 19:10, Strahil Nikolov wrote:
>>> >>>>>>>>>>>>> Have you run 'fence_virtd -c' ?
>>> >>>>>>>>>>>> Yes, I had run that on both hosts. The current config
>>> >>>>>>>>>>>> looks like this and is identical on both.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> cat fence_virt.conf
>>> >>>>>>>>>>>> fence_virtd {
>>> >>>>>>>>>>>> listener = "multicast";
>>> >>>>>>>>>>>> backend = "libvirt";
>>> >>>>>>>>>>>> module_path = "/usr/lib64/fence-virt";
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> listeners {
>>> >>>>>>>>>>>>         multicast {
>>> >>>>>>>>>>>>                 key_file = "/etc/cluster/fence_xvm.key";
>>> >>>>>>>>>>>> address = "225.0.0.12";
>>> >>>>>>>>>>>> interface = "bond0";
>>> >>>>>>>>>>>> family = "ipv4";
>>> >>>>>>>>>>>> port = "1229";
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> backends {
>>> >>>>>>>>>>>> libvirt {
>>> >>>>>>>>>>>> uri = "qemu:///system";
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> The situation is still that no matter on which host I
>>> >>>>>>>>>>>> issue the "fence_xvm -a 225.0.0.12 -o list" command, both
>>> >>>>>>>>>>>> guest systems receive the traffic - the local guest, but
>>> >>>>>>>>>>>> also the guest on the other host. I reckon that means the
>>> >>>>>>>>>>>> traffic is not filtered by any network device, like
>>> >>>>>>>>>>>> switches or firewalls. Since the guest on the other host
>>> >>>>>>>>>>>> receives the packets, the traffic must reach the physical
>>> >>>>>>>>>>>> server and network device and is then routed to the VM on
>>> >>>>>>>>>>>> that host.
>>> >>>>>>>>>>>> But still, the traffic is not shown on the host itself.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Further, the local firewalls on both hosts are set to let
>>> >>>>>>>>>>>> any and all traffic pass - accept to any and everything.
>>> >>>>>>>>>>>> Well, at least as far as I can see.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On 09.07.2020 at 22:34, Klaus Wenninger wrote:
>>> >>>>>>>>>>>>> makes me believe that
>>> >>>>>>>>>>>>> the whole setup doesn't look as I would have
>>> >>>>>>>>>>>>> expected (bridges on each host where the guest
>>> >>>>>>>>>>>>> has a connection to and where ethernet interfaces
>>> >>>>>>>>>>>>> that connect the 2 hosts are part of as well
>>> >>>>>>>>>>>> On each physical server the network cards are bonded to
>>> >>>>>>>>>>>> achieve failure safety (bond0). The guests are connected
>>> >>>>>>>>>>>> over a bridge (br0), but apparently our virtualization
>>> >>>>>>>>>>>> software creates its own device named after the guest
>>> >>>>>>>>>>>> (kvm101.0).
>>> >>>>>>>>>>>> There is no direct connection between the servers, but as
>>> >>>>>>>>>>>> I said earlier, the multicast traffic does reach the VMs,
>>> >>>>>>>>>>>> so I assume there is no problem with that.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On 09.07.2020 at 20:18, Vladislav Bogdanov wrote:
>>> >>>>>>>>>>>>> First, you need to ensure that your switch (or all
>>> >>>>>>>>>>>>> switches in the path) have igmp snooping enabled on host
>>> >>>>>>>>>>>>> ports (and probably interconnects along the path between
>>> >>>>>>>>>>>>> your hosts).
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Second, you need an igmp querier to be enabled somewhere
>>> >>>>>>>>>>>>> near (better to have it enabled on a switch itself).
>>> >>>>>>>>>>>>> Please verify that you see its queries on the hosts.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Next, you probably need to make your hosts use IGMPv2
>>> >>>>>>>>>>>>> (not 3), as many switches still cannot understand v3.
>>> >>>>>>>>>>>>> This is doable by sysctl; search the internet, there are
>>> >>>>>>>>>>>>> many articles.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> I have sent a query to our data center techs, who are
>>> >>>>>>>>>>>> already analyzing whether multicast traffic is blocked or
>>> >>>>>>>>>>>> hindered somewhere. So far the answer is: "multicast is
>>> >>>>>>>>>>>> explicitly allowed in the local network and no packets are
>>> >>>>>>>>>>>> filtered or dropped". I am still waiting for a final
>>> >>>>>>>>>>>> report, though.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> In the meantime I have switched from IGMPv3 to IGMPv2 on
>>> >>>>>>>>>>>> every involved server, hosts and guests, via the mentioned
>>> >>>>>>>>>>>> sysctl. The switch itself was successful according to
>>> >>>>>>>>>>>> "cat /proc/net/igmp", but sadly it did not improve the
>>> >>>>>>>>>>>> behavior. It actually led to no VM receiving the multicast
>>> >>>>>>>>>>>> traffic anymore at all.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> kind regards
>>> >>>>>>>>>>>> Stefan Schmitz
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On 09.07.2020 at 22:34, Klaus Wenninger wrote:
>>> >>>>>>>>>>>>> On 7/9/20 5:17 PM, stefan.schmitz at farmpartner-tec.com
>>> >>>>>>>>>>>>> wrote:
>>> >>>>>>>>>>>>>> Hello,
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Well, theory still holds I would say.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> I guess that the multicast traffic from the other host
>>> >>>>>>>>>>>>>>> or the guests doesn't get to the daemon on the host.
>>> >>>>>>>>>>>>>>> Can't you just simply check if there are any firewall
>>> >>>>>>>>>>>>>>> rules configured on the host kernel?
>>> >>>>>>>>>>>>>> I hope I did understand you correctly and you are
>>> >>>>>>>>>>>>>> referring to iptables?
>>> >>>>>>>>>>>>> I didn't say iptables because it might have been
>>> >>>>>>>>>>>>> nftables - but yes, that is what I was referring to.
>>> >>>>>>>>>>>>> Guess to understand the config the output is
>>> >>>>>>>>>>>>> lacking verbosity, but it makes me believe that
>>> >>>>>>>>>>>>> the whole setup doesn't look as I would have
>>> >>>>>>>>>>>>> expected (bridges on each host where the guest
>>> >>>>>>>>>>>>> has a connection to and where ethernet interfaces
>>> >>>>>>>>>>>>> that connect the 2 hosts are part of as well -
>>> >>>>>>>>>>>>> everything connected via layer 2 basically).
>>> >>>>>>>>>>>>>> Here is the output of the current rules. Besides the IP
>>> >>>>>>>>>>>>>> of the guest, the output is identical on both hosts:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> # iptables -S
>>> >>>>>>>>>>>>>> -P INPUT ACCEPT
>>> >>>>>>>>>>>>>> -P FORWARD ACCEPT
>>> >>>>>>>>>>>>>> -P OUTPUT ACCEPT
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> # iptables -L
>>> >>>>>>>>>>>>>> Chain INPUT (policy ACCEPT)
>>> >>>>>>>>>>>>>> target               prot opt source         destination
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Chain FORWARD (policy ACCEPT)
>>> >>>>>>>>>>>>>> target               prot opt source         destination
>>> >>>>>>>>>>>>>> SOLUSVM_TRAFFIC_IN   all  --  anywhere       anywhere
>>> >>>>>>>>>>>>>> SOLUSVM_TRAFFIC_OUT  all  --  anywhere       anywhere
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Chain OUTPUT (policy ACCEPT)
>>> >>>>>>>>>>>>>> target               prot opt source         destination
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_IN (1 references)
>>> >>>>>>>>>>>>>> target               prot opt source         destination
>>> >>>>>>>>>>>>>>                      all  --  anywhere       192.168.1.14
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_OUT (1 references)
>>> >>>>>>>>>>>>>> target               prot opt source         destination
>>> >>>>>>>>>>>>>>                      all  --  192.168.1.14   anywhere
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> kind regards
>>> >>>>>>>>>>>>>> Stefan Schmitz
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>>
>>> Reid Wahl, RHCA
>>> Software Maintenance Engineer, Red Hat
>>> CEE - Platform Support Delivery - ClusterHA
>>
>