[ClusterLabs] Still Beginner STONITH Problem

Mon Jul 20 07:10:58 EDT 2020

Hello,

thank you all very much for your help so far!

We have now managed to capture the multicast traffic originating from one 
host when issuing the command "fence_xvm -o list" on the other host. Now 
the tcpdump output at least looks exactly the same on all 4 servers, hosts 
and guests. I cannot tell how or why this just started working, but this 
morning I received our data center techs' final report, which states that 
there are no problems present.

On 19.07.2020 at 09:32, Andrei Borzenkov wrote:
 >external/libvirt is unrelated to fence_xvm

Could you please explain that a bit more? Do you mean that the current 
problem of the dysfunctional Stonith/fencing is unrelated to libvirt?

 >fence_xvm opens TCP listening socket, sends request and waits for
 >connection to this socket (from fence_virtd) which is used to submit
 >actual fencing operation. Only the first connection request is handled.
 >So first host that responds will be processed. Local host is likely
 >always faster to respond than remote host.

Thank you for the explanation, I get that. But what would you suggest to 
remedy this situation? We have been using libvirt and fence_xvm because 
of the clusterlabs wiki articles and the suggestions in this mailing 
list. Is there anything you suggest we need to change to make this 
Cluster finally work?
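
The "first responder wins" behaviour described in the quote above can be illustrated with a small simulation. This is only a Python sketch under stated assumptions: it does not speak the real fence_xvm protocol, the names responder and first_responder are made up for illustration, and the delays are arbitrary stand-ins for a local and a remote fence_virtd.

```python
import socket
import threading
import time


def responder(name, port, delay):
    """Stand-in for a fence_virtd instance that answers after `delay` seconds."""
    time.sleep(delay)
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2) as conn:
            conn.sendall(name.encode())
    except OSError:
        # The requester has already stopped listening; this answer is lost.
        pass


def first_responder(delays):
    """Listen on a TCP socket and handle only the first incoming connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
        listener.listen(len(delays))
        port = listener.getsockname()[1]
        threads = [
            threading.Thread(target=responder, args=(name, port, delay))
            for name, delay in delays.items()
        ]
        for thread in threads:
            thread.start()
        conn, _ = listener.accept()  # only the first connection is served
        with conn:
            winner = conn.recv(64).decode()
    for thread in threads:
        thread.join()
    return winner


if __name__ == "__main__":
    # Hypothetical delays: the local daemon answers at once, the remote one later.
    print(first_responder({"local-host": 0.0, "remote-host": 0.3}))
```

With these delays the local responder connects first, so its answer is the only one that gets handled; whatever the slower responder would have reported is never read.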

On 18.07.2020 at 02:36, Reid Wahl wrote:
 >However, when users want to configure fence_xvm for multiple hosts 
 >with the libvirt backend, I have typically seen them configure multiple 
 >fence_xvm devices (one per host) and configure a different multicast 
 >address on each host.

I do have a Red Hat account but not a paid subscription, which sadly 
is needed to access the articles you have linked.

We have had fence_virt installed on both hosts since the beginning, if that 
is what you mean by "multiple fence_xvm devices (one per host)". They 
were, however, both configured to use the same multicast IP address, which 
we have now changed so that each host's fence_xvm installation uses a 
different multicast IP. Sadly, this does not seem to change anything in 
the behaviour.
What is interesting, though, is that I ran fence_xvm -c again and changed 
the multicast IP to 225.0.0.13 (from .12). I killed and restarted the 
daemon multiple times after that.
When I now run # fence_xvm -o list without specifying an IP address, 
tcpdump on the other host still shows the old multicast IP being used.
tcpdump on the other host:
     Host4.54001 > 225.0.0.12.zented: [udp sum ok] UDP, length 176
     Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 225.0.0.12 to_in { }]
     Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 225.0.0.12 to_in { }]

Only when I explicitly specify the other IP does it actually get used:
# fence_xvm -a 225.0.0.13 -o list
tcpdump on the other host:
     Host4.46011 > 225.0.0.13.zented: [udp sum ok] UDP, length 176
     Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 225.0.0.13 to_in { }]
     Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 225.0.0.13 to_in { }]
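
To compare which multicast group a request actually went to, captures like the ones above can be checked mechanically. The following is a minimal Python sketch and not part of fence_virt or tcpdump; the helper name multicast_groups is made up for illustration. It only collects the IPv4 multicast addresses that appear in tcpdump text output.

```python
import ipaddress
import re

# Matches dotted-quad candidates such as "225.0.0.12", also inside "225.0.0.12.zented".
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def multicast_groups(capture_text):
    """Return the set of IPv4 multicast addresses mentioned in tcpdump text output."""
    groups = set()
    for candidate in IPV4_RE.findall(capture_text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not a valid IPv4 address, e.g. an octet above 255
        if address.is_multicast:
            groups.add(str(address))
    return groups


if __name__ == "__main__":
    # Sample lines copied from the capture taken without the -a option.
    sample = (
        "Host4.54001 > 225.0.0.12.zented: [udp sum ok] UDP, length 176\n"
        "Host4 > igmp.mcast.net: igmp v3 report, 1 group record(s) "
        "[gaddr 225.0.0.12 to_in { }]\n"
    )
    print(multicast_groups(sample))
```

Running this over the capture from each test makes it easy to see whether a given fence_xvm call used the old or the new group.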

On 17.07.2020 at 16:49, Strahil Nikolov wrote:
 >The simplest way to check if the libvirt's network is NAT (or not)  is 
 >to try to ssh from the first VM to the second one.
That works without any issue. I can ssh to any server in our 
network, host or guest, without a problem. Does that mean there is no 
NAT involved?

On 17.07.2020 at 16:41, Klaus Wenninger wrote:
 >How does your VM part of the network-config look like?
# cat ifcfg-br0
DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.13
NETMASK=255.255.0.0
GATEWAY=192.168.1.1
NM_CONTROLLED=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no

 >> I am at a loss an do not know why this is NAT. I am aware what NAT
 >> means, but what am I supposed to reconfigure here to dolve the problem?
 >As long as you stay within the subnet you are running on your bridge
 >you won't get natted but once it starts to route via the host the libvirt
 >default bridge will be natted.
 >What you can do is connect the bridges on your 2 hosts via layer 2.
 >Possible ways should be OpenVPN, knet, VLAN on your switches ...
 >(and yes - a cable  )
 >If your guests are using DHCP you should probably configure
 >fixed IPs for those MACs.
All our servers have fixed IPs; DHCP is not used anywhere in our network 
for dynamic IP assignment.
Regarding the "check if VMs are natted" question, is this settled by the 
ssh test suggested by Strahil Nikolov? Can I assume NAT is not a problem 
here, or do we still have to take measures?
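
As a rough cross-check of the "stay within the subnet" condition quoted above, the addresses mentioned in this thread can be tested against the br0 netmask. This is only a Python sketch: the helper name same_subnet is made up for illustration, and being in the same subnet is a necessary condition only. It does not prove that the machines share a layer 2 segment or that no NAT is involved.

```python
import ipaddress


def same_subnet(local_ip, netmask, other_ip):
    """Return True if other_ip lies in the subnet defined by local_ip and netmask."""
    network = ipaddress.ip_interface(f"{local_ip}/{netmask}").network
    return ipaddress.ip_address(other_ip) in network


if __name__ == "__main__":
    # IPADDR and NETMASK as in the ifcfg-br0 file above. The peers are the
    # br0 address of the other host and the guest address from the iptables
    # output, both posted earlier in this thread.
    for peer in ("192.168.1.21", "192.168.1.14"):
        print(peer, same_subnet("192.168.1.13", "255.255.0.0", peer))
```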

Kind regards
Stefan Schmitz

On 18.07.2020 at 02:36, Reid Wahl wrote:
 > I'm not sure that the libvirt backend is intended to be used in this
 > way, with multiple hosts using the same multicast address. From the
 > fence_virt.conf man page:
 >
 > ~~~
 > BACKENDS
 >     libvirt
 >         The  libvirt  plugin  is  the  simplest  plugin.  It is used in
 > environments where routing fencing requests between multiple hosts is
 > not required, for example by a user running a cluster of virtual
 >         machines on a single desktop computer.
 >     libvirt-qmf
 >         The libvirt-qmf plugin acts as a QMFv2 Console to the
 > libvirt-qmf daemon in order to route fencing requests over AMQP to the
 > appropriate computer.
 >     cpg
 >         The cpg plugin uses corosync CPG and libvirt to track virtual
 > machines and route fencing requests to the appropriate computer.
 > ~~~
 >
 > I'm not an expert on fence_xvm or libvirt. It's possible that this is a
 > viable configuration with the libvirt backend.
 >
 > However, when users want to configure fence_xvm for multiple hosts with
 > the libvirt backend, I have typically seen them configure multiple
 > fence_xvm devices (one per host) and configure a different multicast
 > address on each host.
 >
 > If you have a Red Hat account, see also:
 >    - https://access.redhat.com/solutions/2386421#comment-1209661
 >    - https://access.redhat.com/solutions/2386421#comment-1209801
 >
 > On Fri, Jul 17, 2020 at 7:49 AM Strahil Nikolov <hunter86_bg at yahoo.com
 > <mailto:hunter86_bg at yahoo.com>> wrote:
 >
 >     The simplest way to check if the libvirt's network is NAT (or not)
 >     is to try to ssh from the first VM to the second one.
 >
 >     I should admit that I was  lost when I tried  to create a routed
 >     network in KVM, so I can't help with that.
 >
 >     Best Regards,
 >     Strahil Nikolov
 >
 >     На 17 юли 2020 г. 16:56:44 GMT+03:00,
 >     "stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>"
 >     <stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>> написа:
 >      >Hello,
 >      >
 >      >I have now managed to get # fence_xvm -a 225.0.0.12 -o list to
 >     list at
 >      >least its local Guest again. It seems the fence_virtd was not 
working
 >      >properly anymore.
 >      >
 >      >Regarding the Network XML config
 >      >
 >      ># cat default.xml
 >      >  <network>
 >      >      <name>default</name>
 >      >      <bridge name="virbr0"/>
 >      >      <forward/>
 >      >      <ip address="192.168.122.1" netmask="255.255.255.0">
 >      >        <dhcp>
 >      >          <range start="192.168.122.2" end="192.168.122.254"/>
 >      >        </dhcp>
 >      >      </ip>
 >      >  </network>
 >      >
 >      >I have used "virsh net-edit default" to test other network 
Devices on
 >      >the hosts but this did not change anything.
 >      >
 >      >Regarding the statement
 >      >
 >      > > If it is created by libvirt - this is NAT and you will never
 >      > > receive  output  from the other  host.
 >      >
 >      >I am at a loss an do not know why this is NAT. I am aware what NAT
 >      >means, but what am I supposed to reconfigure here to dolve the
 >     problem?
 >      >Any help would be greatly appreciated.
 >      >Thank you in advance.
 >      >
 >      >Kind regards
 >      >Stefan Schmitz
 >      >
 >      >
 >      >Am 15.07.2020 um 16:48 schrieb stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>:
 >      >>
 >      >> Am 15.07.2020 um 16:29 schrieb Klaus Wenninger:
 >      >>> On 7/15/20 4:21 PM, stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com> wrote:
 >      >>>> Hello,
 >      >>>>
 >      >>>>
 >      >>>> Am 15.07.2020 um 15:30 schrieb Klaus Wenninger:
 >      >>>>> On 7/15/20 3:15 PM, Strahil Nikolov wrote:
 >      >>>>>> If it is created by libvirt - this is NAT and you will never
 >      >>>>>> receive  output  from the other  host.
 >      >>>>> And twice the same subnet behind NAT is probably giving
 >      >>>>> issues at other places as well.
 >      >>>>> And if using DHCP you have to at least enforce that both sides
 >      >>>>> don't go for the same IP at least.
 >      >>>>> But all no explanation why it doesn't work on the same host.
 >      >>>>> Which is why I was asking for running the service on the
 >      >>>>> bridge to check if that would work at least. So that we
 >      >>>>> can go forward step by step.
 >      >>>>
 >      >>>> I just now finished trying and testing it on both hosts.
 >      >>>> I ran # fence_virtd -c on both hosts and entered different 
network
 >      >>>> devices. On both I tried br0 and the kvm10x.0.
 >      >>> According to your libvirt-config I would have expected
 >      >>> the bridge to be virbr0.
 >      >>
 >      >> I understand that, but an "virbr0" Device does not seem to 
exist on
 >      >any
 >      >> of the two hosts.
 >      >>
 >      >> # ip link show
 >      >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state 
UNKNOWN
 >      >mode
 >      >> DEFAULT group default qlen 1000
 >      >>      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 >      >> 2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 
qdisc mq
 >      >> master bond0 state UP mode DEFAULT group default qlen 1000
 >      >>      link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
 >      >> 3: enp216s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
state DOWN
 >      >mode
 >      >> DEFAULT group default qlen 1000
 >      >>      link/ether ac:1f:6b:26:69:dc brd ff:ff:ff:ff:ff:ff
 >      >> 4: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 
qdisc mq
 >      >> master bond0 state UP mode DEFAULT group default qlen 1000
 >      >>      link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
 >      >> 5: enp216s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
state DOWN
 >      >mode
 >      >> DEFAULT group default qlen 1000
 >      >>      link/ether ac:1f:6b:26:69:dd brd ff:ff:ff:ff:ff:ff
 >      >> 6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc
 >      >> noqueue master br0 state UP mode DEFAULT group default qlen 1000
 >      >>      link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
 >      >> 7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
 >      >state
 >      >> UP mode DEFAULT group default qlen 1000
 >      >>      link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
 >      >> 8: kvm101.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
 >      >pfifo_fast
 >      >> master br0 state UNKNOWN mode DEFAULT group default qlen 1000
 >      >>      link/ether fe:16:3c:ba:10:6c brd ff:ff:ff:ff:ff:ff
 >      >>
 >      >>
 >      >>
 >      >>>>
 >      >>>> After each reconfiguration I ran #fence_xvm -a 225.0.0.12 
-o list
 >      >>>> On the second server it worked with each device. After that I
 >      >>>> reconfigured back to the normal device, bond0, on which it 
did not
 >      >>>> work anymore, it worked now again!
 >      >>>> #  fence_xvm -a 225.0.0.12 -o list
 >      >>>> kvm102
 >      >bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
 >      >>>>
 >      >>>> But anyhow not on the first server, it did not work with any
 >      >device.
 >      >>>> #  fence_xvm -a 225.0.0.12 -o list always resulted in
 >      >>>> Timed out waiting for response
 >      >>>> Operation failed
 >      >>>>
 >      >>>>
 >      >>>>
 >      >>>> Am 15.07.2020 um 15:15 schrieb Strahil Nikolov:
 >      >>>>> If it is created by libvirt - this is NAT and you will never
 >      >receive
 >      >>>> output  from the other  host.
 >      >>>>>
 >      >>>> To my knowledge this is configured by libvirt. At least I 
am not
 >      >aware
 >      >>>> having changend or configured it in any way. Up until today 
I did
 >      >not
 >      >>>> even know that file existed. Could you please advise on what I
 >     need
 >      >to
 >      >>>> do to fix this issue?
 >      >>>>
 >      >>>> Kind regards
 >      >>>>
 >      >>>>
 >      >>>>
 >      >>>>
 >      >>>>> Is pacemaker/corosync/knet btw. using the same interfaces/IPs?
 >      >>>>>
 >      >>>>> Klaus
 >      >>>>>>
 >      >>>>>> Best Regards,
 >      >>>>>> Strahil Nikolov
 >      >>>>>>
 >      >>>>>> На 15 юли 2020 г. 15:05:48 GMT+03:00,
 >      >>>>>> "stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>"
 >      >>>>>> <stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>> написа:
 >      >>>>>>> Hello,
 >      >>>>>>>
 >      >>>>>>> Am 15.07.2020 um 13:42 Strahil Nikolov wrote:
 >      >>>>>>>> By default libvirt is using NAT and not routed network - in
 >      >such
 >      >>>>>>> case, vm1 won't receive data from host2.
 >      >>>>>>>> Can you provide the Networks' xml ?
 >      >>>>>>>>
 >      >>>>>>>> Best Regards,
 >      >>>>>>>> Strahil Nikolov
 >      >>>>>>>>
 >      >>>>>>> # cat default.xml
 >      >>>>>>> <network>
 >      >>>>>>>     <name>default</name>
 >      >>>>>>>     <bridge name="virbr0"/>
 >      >>>>>>>     <forward/>
 >      >>>>>>>     <ip address="192.168.122.1" netmask="255.255.255.0">
 >      >>>>>>>       <dhcp>
 >      >>>>>>>         <range start="192.168.122.2" end="192.168.122.254"/>
 >      >>>>>>>       </dhcp>
 >      >>>>>>>     </ip>
 >      >>>>>>> </network>
 >      >>>>>>>
 >      >>>>>>> I just checked this and the file is identical on both hosts.
 >      >>>>>>>
 >      >>>>>>> kind regards
 >      >>>>>>> Stefan Schmitz
 >      >>>>>>>
 >      >>>>>>>
 >      >>>>>>>> На 15 юли 2020 г. 13:19:59 GMT+03:00, Klaus Wenninger
 >      >>>>>>> <kwenning at redhat.com <mailto:kwenning at redhat.com>> написа:
 >      >>>>>>>>> On 7/15/20 11:42 AM, stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com> wrote:
 >      >>>>>>>>>> Hello,
 >      >>>>>>>>>>
 >      >>>>>>>>>>
 >      >>>>>>>>>> Am 15.07.2020 um 06:32 Strahil Nikolov wrote:
 >      >>>>>>>>>>> How  did you configure the network on your ubuntu 20.04
 >      >Hosts ? I
 >      >>>>>>>>>>> tried  to setup bridged connection for the test 
setup , but
 >      >>>>>>>>> obviously
 >      >>>>>>>>>>> I'm missing something.
 >      >>>>>>>>>>>
 >      >>>>>>>>>>> Best Regards,
 >      >>>>>>>>>>> Strahil Nikolov
 >      >>>>>>>>>>>
 >      >>>>>>>>>> on the hosts (CentOS) the bridge config looks like 
that.The
 >      >>>>>>> bridging
 >      >>>>>>>>>> and configuration is handled by the virtualization 
software:
 >      >>>>>>>>>>
 >      >>>>>>>>>> # cat ifcfg-br0
 >      >>>>>>>>>> DEVICE=br0
 >      >>>>>>>>>> TYPE=Bridge
 >      >>>>>>>>>> BOOTPROTO=static
 >      >>>>>>>>>> ONBOOT=yes
 >      >>>>>>>>>> IPADDR=192.168.1.21
 >      >>>>>>>>>> NETMASK=255.255.0.0
 >      >>>>>>>>>> GATEWAY=192.168.1.1
 >      >>>>>>>>>> NM_CONTROLLED=no
 >      >>>>>>>>>> IPV6_AUTOCONF=yes
 >      >>>>>>>>>> IPV6_DEFROUTE=yes
 >      >>>>>>>>>> IPV6_PEERDNS=yes
 >      >>>>>>>>>> IPV6_PEERROUTES=yes
 >      >>>>>>>>>> IPV6_FAILURE_FATAL=no
 >      >>>>>>>>>>
 >      >>>>>>>>>>
 >      >>>>>>>>>>
 >      >>>>>>>>>> Am 15.07.2020 um 09:50 Klaus Wenninger wrote:
 >      >>>>>>>>>>> Guess it is not easy to have your servers connected
 >      >physically
 >      >>>>>>>>>>> for
 >      >>>>>>>>> a
 >      >>>>>>>>>> try.
 >      >>>>>>>>>>> But maybe you can at least try on one host to have
 >      >virt_fenced &
 >      >>>>>>> VM
 >      >>>>>>>>>>> on the same bridge - just to see if that basic 
pattern is
 >      >>>>>>>>>>> working.
 >      >>>>>>>>>> I am not sure if I understand you correctly. What do 
you by
 >      >having
 >      >>>>>>>>>> them on the same bridge? The bridge device is 
configured on
 >      >the
 >      >>>>>>> host
 >      >>>>>>>>>> by the virtualization software.
 >      >>>>>>>>> I meant to check out which bridge the interface of the 
VM is
 >      >>>>>>> enslaved
 >      >>>>>>>>> to and to use that bridge as interface in
 >      >/etc/fence_virt.conf.
 >      >>>>>>>>> Get me right - just for now - just to see if it is
 >     working for
 >      >this
 >      >>>>>>> one
 >      >>>>>>>>> host and the corresponding guest.
 >      >>>>>>>>>>
 >      >>>>>>>>>>> Well maybe still sbdy in the middle playing IGMPv3 
or the
 >      >request
 >      >>>>>>>>> for
 >      >>>>>>>>>>> a certain source is needed to shoot open some 
firewall or
 >      >>>>>>>>> switch-tables.
 >      >>>>>>>>>> I am still waiting for the final report from our Data
 >     Center
 >      >>>>>>>>>> techs.
 >      >>>>>>> I
 >      >>>>>>>>>> hope that will clear up somethings.
 >      >>>>>>>>>>
 >      >>>>>>>>>>
 >      >>>>>>>>>> Additionally  I have just noticed that apparently since
 >      >switching
 >      >>>>>>>>> from
 >      >>>>>>>>>> IGMPv3 to IGMPv2 and back the command "fence_xvm -a
 >      >225.0.0.12 -o
 >      >>>>>>>>>> list" is no completely broken.
 >      >>>>>>>>>> Before that switch this command at least returned the 
local
 >      >VM.
 >      >>>>>>>>>> Now
 >      >>>>>>>>> it
 >      >>>>>>>>>> returns:
 >      >>>>>>>>>> Timed out waiting for response
 >      >>>>>>>>>> Operation failed
 >      >>>>>>>>>>
 >      >>>>>>>>>> I am a bit confused by that, because all we did was 
running
 >      >>>>>>> commands
 >      >>>>>>>>>> like "sysctl -w net.ipv4.conf.all.force_igmp_version 
=" with
 >      >the
 >      >>>>>>>>>> different Version umbers and #cat /proc/net/igmp 
shows that
 >      >V3 is
 >      >>>>>>>>> used
 >      >>>>>>>>>> again on every device just like before...?!
 >      >>>>>>>>>>
 >      >>>>>>>>>> kind regards
 >      >>>>>>>>>> Stefan Schmitz
 >      >>>>>>>>>>
 >      >>>>>>>>>>
 >      >>>>>>>>>>> На 14 юли 2020 г. 11:06:42 GMT+03:00,
 >      >>>>>>>>>>> "stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>"
 >      >>>>>>>>>>> <stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>> написа:
 >      >>>>>>>>>>>> Hello,
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> Am 09.07.2020 um 19:10 Strahil Nikolov wrote:
 >      >>>>>>>>>>>>> Have  you  run 'fence_virtd  -c' ?
 >      >>>>>>>>>>>> Yes I had run that on both Hosts. The current 
config looks
 >      >like
 >      >>>>>>>>> that
 >      >>>>>>>>>>>> and
 >      >>>>>>>>>>>> is identical on both.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> cat fence_virt.conf
 >      >>>>>>>>>>>> fence_virtd {
 >      >>>>>>>>>>>>             listener = "multicast";
 >      >>>>>>>>>>>>             backend = "libvirt";
 >      >>>>>>>>>>>>             module_path = "/usr/lib64/fence-virt";
 >      >>>>>>>>>>>> }
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> listeners {
 >      >>>>>>>>>>>>             multicast {
 >      >>>>>>>>>>>>                     key_file =
 >      >"/etc/cluster/fence_xvm.key";
 >      >>>>>>>>>>>>                     address = "225.0.0.12";
 >      >>>>>>>>>>>>                     interface = "bond0";
 >      >>>>>>>>>>>>                     family = "ipv4";
 >      >>>>>>>>>>>>                     port = "1229";
 >      >>>>>>>>>>>>             }
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> }
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> backends {
 >      >>>>>>>>>>>>             libvirt {
 >      >>>>>>>>>>>>                     uri = "qemu:///system";
 >      >>>>>>>>>>>>             }
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> }
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> The situation is still that no matter on what host 
I issue
 >      >the
 >      >>>>>>>>>>>> "fence_xvm -a 225.0.0.12 -o list" command, both guest
 >      >systems
 >      >>>>>>>>> receive
 >      >>>>>>>>>>>> the traffic. The local guest, but also the guest on the
 >      >other
 >      >>>>>>> host.
 >      >>>>>>>>> I
 >      >>>>>>>>>>>> reckon that means the traffic is not filtered by any
 >      >network
 >      >>>>>>>>> device,
 >      >>>>>>>>>>>> like switches or firewalls. Since the guest on the 
other
 >      >host
 >      >>>>>>>>> receives
 >      >>>>>>>>>>>> the packages, the traffic must reach te physical
 >     server and
 >      >>>>>>>>>>>> networkdevice and is then routed to the VM on that 
host.
 >      >>>>>>>>>>>> But still, the traffic is not shown on the host itself.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> Further the local firewalls on both hosts are set 
to let
 >      >each
 >      >>>>>>>>>>>> and
 >      >>>>>>>>> every
 >      >>>>>>>>>>>> traffic pass. Accept to any and everything. Well at 
least
 >      >as far
 >      >>>>>>> as
 >      >>>>>>>>> I
 >      >>>>>>>>>>>> can see.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> Am 09.07.2020 um 22:34 Klaus Wenninger wrote:
 >      >>>>>>>>>>>>> makes me believe that
 >      >>>>>>>>>>>>> the whole setup doesn't lookas I would have
 >      >>>>>>>>>>>>> expected (bridges on each host where theguest
 >      >>>>>>>>>>>>> has a connection to and where ethernet interfaces
 >      >>>>>>>>>>>>> that connect the 2 hosts are part of as well
 >      >>>>>>>>>>>> On each physical server the networkcards are bonded to
 >      >achieve
 >      >>>>>>>>> failure
 >      >>>>>>>>>>>> safety (bond0). The guest are connected over a 
bridge(br0)
 >      >but
 >      >>>>>>>>>>>> apparently our virtualization softrware creates an own
 >      >device
 >      >>>>>>> named
 >      >>>>>>>>>>>> after the guest (kvm101.0).
 >      >>>>>>>>>>>> There is no direct connection between the servers, but
 >     as I
 >      >said
 >      >>>>>>>>>>>> earlier, the multicast traffic does reach the VMs so I
 >      >assume
 >      >>>>>>> there
 >      >>>>>>>>> is
 >      >>>>>>>>>>>> no problem with that.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> Am 09.07.2020 um 20:18 Vladislav Bogdanov wrote:
 >      >>>>>>>>>>>>> First, you need to ensure that your switch (or all
 >      >switches in
 >      >>>>>>> the
 >      >>>>>>>>>>>>> path) have igmp snooping enabled on host ports (and
 >      >probably
 >      >>>>>>>>>>>>> interconnects along the path between your hosts).
 >      >>>>>>>>>>>>>
 >      >>>>>>>>>>>>> Second, you need an igmp querier to be enabled 
somewhere
 >      >near
 >      >>>>>>>>> (better
 >      >>>>>>>>>>>>> to have it enabled on a switch itself). Please verify
 >     that
 >      >you
 >      >>>>>>> see
 >      >>>>>>>>>>>> its
 >      >>>>>>>>>>>>> queries on hosts.
 >      >>>>>>>>>>>>>
 >      >>>>>>>>>>>>> Next, you probably need to make your hosts to use 
IGMPv2
 >      >>>>>>>>>>>>> (not 3)
 >      >>>>>>>>> as
 >      >>>>>>>>>>>>> many switches still can not understand v3. This is 
doable
 >      >by
 >      >>>>>>>>> sysctl,
 >      >>>>>>>>>>>>> find on internet, there are many articles.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> I have send an query to our Data center Techs who are
 >      >analyzing
 >      >>>>>>>>> this
 >      >>>>>>>>>>>> and
 >      >>>>>>>>>>>> were already on it analyzing if multicast Traffic is
 >      >somewhere
 >      >>>>>>>>> blocked
 >      >>>>>>>>>>>> or hindered. So far the answer is, "multicast ist
 >     explictly
 >      >>>>>>> allowed
 >      >>>>>>>>> in
 >      >>>>>>>>>>>> the local network and no packets are filtered or 
dropped".
 >      >I am
 >      >>>>>>>>> still
 >      >>>>>>>>>>>> waiting for a final report though.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> In the meantime I have switched IGMPv3 to IGMPv2 on 
every
 >      >>>>>>> involved
 >      >>>>>>>>>>>> server, hosts and guests via the mentioned sysctl. The
 >      >switching
 >      >>>>>>>>> itself
 >      >>>>>>>>>>>> was successful, according to "cat /proc/net/igmp" but
 >     sadly
 >      >did
 >      >>>>>>> not
 >      >>>>>>>>>>>> better the behavior. It actually led to that no VM
 >     received
 >      >the
 >      >>>>>>>>>>>> multicast traffic anymore too.
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> kind regards
 >      >>>>>>>>>>>> Stefan Schmitz
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>>
 >      >>>>>>>>>>>> Am 09.07.2020 um 22:34 schrieb Klaus Wenninger:
 >      >>>>>>>>>>>>> On 7/9/20 5:17 PM, stefan.schmitz at farmpartner-tec.com
 >     <mailto:stefan.schmitz at farmpartner-tec.com>
 >      >wrote:
 >      >>>>>>>>>>>>>> Hello,
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>>> Well, theory still holds I would say.
 >      >>>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>>> I guess that the multicast-traffic from the 
other host
 >      >>>>>>>>>>>>>>> or the guestsdoesn't get to the daemon on the host.
 >      >>>>>>>>>>>>>>> Can't you just simply check if there are any 
firewall
 >      >>>>>>>>>>>>>>> rules configuredon the host kernel?
 >      >>>>>>>>>>>>>> I hope I did understand you corretcly and you are
 >      >referring to
 >      >>>>>>>>>>>> iptables?
 >      >>>>>>>>>>>>> I didn't say iptables because it might have been
 >      >>>>>>>>>>>>> nftables - but yesthat is what I was referring to.
 >      >>>>>>>>>>>>> Guess to understand the config the output is
 >      >>>>>>>>>>>>> lacking verbositybut it makes me believe that
 >      >>>>>>>>>>>>> the whole setup doesn't lookas I would have
 >      >>>>>>>>>>>>> expected (bridges on each host where theguest
 >      >>>>>>>>>>>>> has a connection to and where ethernet interfaces
 >      >>>>>>>>>>>>> that connect the 2 hosts are part of as well -
 >      >>>>>>>>>>>>> everythingconnected via layer 2 basically).
 >      >>>>>>>>>>>>>> Here is the output of the current rules. Besides 
the IP
 >      >of the
 >      >>>>>>>>> guest
 >      >>>>>>>>>>>>>> the output is identical on both hosts:
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> # iptables -S
 >      >>>>>>>>>>>>>> -P INPUT ACCEPT
 >      >>>>>>>>>>>>>> -P FORWARD ACCEPT
 >      >>>>>>>>>>>>>> -P OUTPUT ACCEPT
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> # iptables -L
 >      >>>>>>>>>>>>>> Chain INPUT (policy ACCEPT)
 >      >>>>>>>>>>>>>> target     prot opt source               destination
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> Chain FORWARD (policy ACCEPT)
 >      >>>>>>>>>>>>>> target     prot opt source               destination
 >      >>>>>>>>>>>>>> SOLUSVM_TRAFFIC_IN  all  --  anywhere
 >      >anywhere
 >      >>>>>>>>>>>>>> SOLUSVM_TRAFFIC_OUT  all  --  anywhere
 >      >anywhere
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> Chain OUTPUT (policy ACCEPT)
 >      >>>>>>>>>>>>>> target     prot opt source               destination
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_IN (1 references)
 >      >>>>>>>>>>>>>> target     prot opt source               destination
 >      >>>>>>>>>>>>>>                 all  --  anywhere
 >      >192.168.1.14
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_OUT (1 references)
 >      >>>>>>>>>>>>>> target     prot opt source               destination
 >      >>>>>>>>>>>>>>                 all  --  192.168.1.14 
anywhere
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>> kind regards
 >      >>>>>>>>>>>>>> Stefan Schmitz
 >      >>>>>>>>>>>>>>
 >      >>>>>>>>>>>>>>
 >      >>>>>> _______________________________________________
 >      >>>>>> Manage your subscription:
 >      >>>>>> https://lists.clusterlabs.org/mailman/listinfo/users
 >      >>>>>>
 >      >>>>>> ClusterLabs home: https://www.clusterlabs.org/
 >      >>>>>
 >      >>>>
 >      >>>
 >      >> _______________________________________________
 >      >> Manage your subscription:
 >      >> https://lists.clusterlabs.org/mailman/listinfo/users
 >      >>
 >      >> ClusterLabs home: https://www.clusterlabs.org/
 >      >_______________________________________________
 >      >Manage your subscription:
 >      >https://lists.clusterlabs.org/mailman/listinfo/users
 >      >
 >      >ClusterLabs home: https://www.clusterlabs.org/
 >     _______________________________________________
 >     Manage your subscription:
 >     https://lists.clusterlabs.org/mailman/listinfo/users
 >
 >     ClusterLabs home: https://www.clusterlabs.org/
 >
 >
 >
 > --
 > Regards,
 >
 > Reid Wahl, RHCA
 > Software Maintenance Engineer, Red Hat
 > CEE - Platform Support Delivery - ClusterHA