[ClusterLabs] Still Beginner STONITH Problem
Strahil Nikolov
hunter86_bg at yahoo.com
Fri Jul 17 10:49:12 EDT 2020
The simplest way to check whether libvirt's network is NAT (or not) is to try to ssh from the first VM to the second one.
I should admit that I was lost when I tried to create a routed network in KVM, so I can't help with that.
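Another quick way is to look at what the network definition itself says - going from memory of the libvirt defaults here, so treat it as a sketch:

# virsh net-dumpxml default

A bare <forward/> element with no mode attribute means mode="nat"; only mode="route" or mode="bridge" would give you a routed or bridged network.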
Best Regards,
Strahil Nikolov
On 17 July 2020 16:56:44 GMT+03:00, "stefan.schmitz at farmpartner-tec.com" <stefan.schmitz at farmpartner-tec.com> wrote:
>Hello,
>
>I have now managed to get "# fence_xvm -a 225.0.0.12 -o list" to list at
>least its local guest again. It seems fence_virtd was no longer working
>properly.
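(For reference, a minimal health check of the daemon itself - assuming the usual fence_virtd.service unit name from the fence-virt packages:

# systemctl status fence_virtd
# systemctl restart fence_virtd

If the daemon is not running, fence_xvm times out exactly like that.)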
>
>Regarding the Network XML config
>
># cat default.xml
> <network>
>   <name>default</name>
>   <bridge name="virbr0"/>
>   <forward/>
>   <ip address="192.168.122.1" netmask="255.255.255.0">
>     <dhcp>
>       <range start="192.168.122.2" end="192.168.122.254"/>
>     </dhcp>
>   </ip>
> </network>
>
>I have used "virsh net-edit default" to test other network devices on
>the hosts, but this did not change anything.
>
>Regarding the statement
>
> > If it is created by libvirt - this is NAT and you will never
> > receive output from the other host.
>
>I am at a loss and do not know why this is NAT. I know what NAT
>means, but what am I supposed to reconfigure here to solve the problem?
>Any help would be greatly appreciated.
>Thank you in advance.
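One possible direction - only a sketch, and it assumes you want the guests to sit directly on the existing br0 host bridge instead of behind libvirt's NAT - would be a bridged network definition along these lines (the file name "bridged.xml" and the network name "hostbridge" are just placeholders):

# cat bridged.xml
<network>
  <name>hostbridge</name>
  <forward mode="bridge"/>
  <bridge name="br0"/>
</network>

# virsh net-define bridged.xml
# virsh net-start hostbridge
# virsh net-autostart hostbridge

The guests' interface definitions would then have to point at that network (or directly at br0) instead of "default".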
>
>Kind regards
>Stefan Schmitz
>
>
>On 15.07.2020 at 16:48, stefan.schmitz at farmpartner-tec.com wrote:
>>
>> On 15.07.2020 at 16:29, Klaus Wenninger wrote:
>>> On 7/15/20 4:21 PM, stefan.schmitz at farmpartner-tec.com wrote:
>>>> Hello,
>>>>
>>>>
>>>> On 15.07.2020 at 15:30, Klaus Wenninger wrote:
>>>>> On 7/15/20 3:15 PM, Strahil Nikolov wrote:
>>>>>> If it is created by libvirt - this is NAT and you will never
>>>>>> receive output from the other host.
>>>>> And twice the same subnet behind NAT is probably giving
>>>>> issues at other places as well.
>>>>> And if you are using DHCP you have to at least enforce that
>>>>> both sides don't go for the same IP.
>>>>> But none of that explains why it doesn't work on the same host.
>>>>> Which is why I was asking for running the service on the
>>>>> bridge, to check if that would work at least. So that we
>>>>> can go forward step by step.
>>>>
>>>> I just now finished trying and testing it on both hosts.
>>>> I ran # fence_virtd -c on both hosts and entered different network
>>>> devices. On both I tried br0 and the kvm10x.0.
>>> According to your libvirt-config I would have expected
>>> the bridge to be virbr0.
>>
>> I understand that, but a "virbr0" device does not seem to exist on
>> either of the two hosts.
>>
>> # ip link show
>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>> mode DEFAULT group default qlen 1000
>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> 2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
>> master bond0 state UP mode DEFAULT group default qlen 1000
>>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>> 3: enp216s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>> mode DEFAULT group default qlen 1000
>>     link/ether ac:1f:6b:26:69:dc brd ff:ff:ff:ff:ff:ff
>> 4: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
>> master bond0 state UP mode DEFAULT group default qlen 1000
>>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>> 5: enp216s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
>> mode DEFAULT group default qlen 1000
>>     link/ether ac:1f:6b:26:69:dd brd ff:ff:ff:ff:ff:ff
>> 6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc
>> noqueue master br0 state UP mode DEFAULT group default qlen 1000
>>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>> 7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
>> state UP mode DEFAULT group default qlen 1000
>>     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
>> 8: kvm101.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
>> pfifo_fast master br0 state UNKNOWN mode DEFAULT group default qlen 1000
>>     link/ether fe:16:3c:ba:10:6c brd ff:ff:ff:ff:ff:ff
>>
>>
>>
>>>>
>>>> After each reconfiguration I ran "fence_xvm -a 225.0.0.12 -o list".
>>>> On the second server it worked with each device. After that I
>>>> reconfigured back to the normal device, bond0, on which it had
>>>> previously stopped working, and now it worked again!
>>>> # fence_xvm -a 225.0.0.12 -o list
>>>> kvm102                   bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>>>>
>>>> On the first server, however, it did not work with any device.
>>>> "fence_xvm -a 225.0.0.12 -o list" always resulted in:
>>>> Timed out waiting for response
>>>> Operation failed
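One way to narrow down the failing host - a suggestion only, assuming your fence-virt build has the usual foreground/debug switches - is to stop the service and run the daemon in the foreground with debugging while the list command is issued from the other side:

# systemctl stop fence_virtd
# fence_virtd -F -d9

If the multicast request reaches the daemon you should see it in the debug output; if nothing shows up at all, the packet never made it to that host.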
>>>>
>>>>
>>>>
>>>> On 15.07.2020 at 15:15, Strahil Nikolov wrote:
>>>>> If it is created by libvirt - this is NAT and you will never
>>>>> receive output from the other host.
>>>>>
>>>> To my knowledge this is configured by libvirt. At least I am not
>>>> aware of having changed or configured it in any way. Up until today
>>>> I did not even know that file existed. Could you please advise on
>>>> what I need to do to fix this issue?
>>>>
>>>> Kind regards
>>>>
>>>>
>>>>
>>>>
>>>>> Is pacemaker/corosync/knet btw. using the same interfaces/IPs?
>>>>>
>>>>> Klaus
>>>>>>
>>>>>> Best Regards,
>>>>>> Strahil Nikolov
>>>>>>
>>>>>> On 15 July 2020 15:05:48 GMT+03:00,
>>>>>> "stefan.schmitz at farmpartner-tec.com"
>>>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> On 15.07.2020 at 13:42, Strahil Nikolov wrote:
>>>>>>>> By default libvirt is using NAT and not a routed network - in
>>>>>>>> such a case, vm1 won't receive data from host2.
>>>>>>>> Can you provide the network's XML?
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Strahil Nikolov
>>>>>>>>
>>>>>>> # cat default.xml
>>>>>>> <network>
>>>>>>>   <name>default</name>
>>>>>>>   <bridge name="virbr0"/>
>>>>>>>   <forward/>
>>>>>>>   <ip address="192.168.122.1" netmask="255.255.255.0">
>>>>>>>     <dhcp>
>>>>>>>       <range start="192.168.122.2" end="192.168.122.254"/>
>>>>>>>     </dhcp>
>>>>>>>   </ip>
>>>>>>> </network>
>>>>>>>
>>>>>>> I just checked this and the file is identical on both hosts.
>>>>>>>
>>>>>>> kind regards
>>>>>>> Stefan Schmitz
>>>>>>>
>>>>>>>
>>>>>>>> On 15 July 2020 13:19:59 GMT+03:00, Klaus Wenninger
>>>>>>>> <kwenning at redhat.com> wrote:
>>>>>>>>> On 7/15/20 11:42 AM, stefan.schmitz at farmpartner-tec.com wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 15.07.2020 at 06:32, Strahil Nikolov wrote:
>>>>>>>>>>> How did you configure the network on your Ubuntu 20.04
>>>>>>>>>>> hosts? I tried to set up a bridged connection for the test
>>>>>>>>>>> setup, but obviously I'm missing something.
>>>>>>>>>>>
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Strahil Nikolov
>>>>>>>>>>>
>>>>>>>>>> On the hosts (CentOS) the bridge config looks like this. The
>>>>>>>>>> bridging and configuration is handled by the virtualization
>>>>>>>>>> software:
>>>>>>>>>>
>>>>>>>>>> # cat ifcfg-br0
>>>>>>>>>> DEVICE=br0
>>>>>>>>>> TYPE=Bridge
>>>>>>>>>> BOOTPROTO=static
>>>>>>>>>> ONBOOT=yes
>>>>>>>>>> IPADDR=192.168.1.21
>>>>>>>>>> NETMASK=255.255.0.0
>>>>>>>>>> GATEWAY=192.168.1.1
>>>>>>>>>> NM_CONTROLLED=no
>>>>>>>>>> IPV6_AUTOCONF=yes
>>>>>>>>>> IPV6_DEFROUTE=yes
>>>>>>>>>> IPV6_PEERDNS=yes
>>>>>>>>>> IPV6_PEERROUTES=yes
>>>>>>>>>> IPV6_FAILURE_FATAL=no
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 15.07.2020 at 09:50, Klaus Wenninger wrote:
>>>>>>>>>>> Guess it is not easy to have your servers connected
>>>>>>>>>>> physically for a try.
>>>>>>>>>>> But maybe you can at least try on one host to have
>>>>>>>>>>> virt_fenced & VM on the same bridge - just to see if that
>>>>>>>>>>> basic pattern is working.
>>>>>>>>>> I am not sure if I understand you correctly. What do you mean
>>>>>>>>>> by having them on the same bridge? The bridge device is
>>>>>>>>>> configured on the host by the virtualization software.
>>>>>>>>> I meant to check out which bridge the interface of the VM is
>>>>>>>>> enslaved to and to use that bridge as interface in
>>>>>>>>> /etc/fence_virt.conf.
>>>>>>>>> Get me right - just for now - just to see if it is working for
>>>>>>>>> this one host and the corresponding guest.
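In case it helps, two quick ways to see which bridge a guest's interface is actually enslaved to (plain virsh/iproute2; "kvm101" is only a guess at the domain name, based on the kvm101.0 interface seen earlier):

# virsh domiflist kvm101
# ip -br link show master br0

The first prints the guest's interfaces and the bridge/network they are attached to, the second lists every device enslaved to br0.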
>>>>>>>>>>
>>>>>>>>>>> Well, maybe there is still somebody in the middle playing
>>>>>>>>>>> IGMPv3, or the request for a certain source is needed to
>>>>>>>>>>> shoot open some firewall or switch tables.
>>>>>>>>>> I am still waiting for the final report from our data center
>>>>>>>>>> techs. I hope that will clear up some things.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Additionally I have just noticed that, apparently since
>>>>>>>>>> switching from IGMPv3 to IGMPv2 and back, the command
>>>>>>>>>> "fence_xvm -a 225.0.0.12 -o list" is now completely broken.
>>>>>>>>>> Before that switch this command at least returned the local VM.
>>>>>>>>>> Now it returns:
>>>>>>>>>> Timed out waiting for response
>>>>>>>>>> Operation failed
>>>>>>>>>>
>>>>>>>>>> I am a bit confused by that, because all we did was run
>>>>>>>>>> commands like "sysctl -w net.ipv4.conf.all.force_igmp_version="
>>>>>>>>>> with the different version numbers, and "cat /proc/net/igmp"
>>>>>>>>>> shows that V3 is used again on every device, just like
>>>>>>>>>> before...?!
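For reference, the commands used for the switching were roughly the following (per-interface sysctls exist as well; bond0 here stands in for whichever interface carries the multicast traffic):

# sysctl -w net.ipv4.conf.all.force_igmp_version=2
# sysctl -w net.ipv4.conf.bond0.force_igmp_version=2
# cat /proc/net/igmp

Setting the value back to 0 restores the default behaviour (IGMPv3 with fallback), and /proc/net/igmp shows the version actually in use per device.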
>>>>>>>>>>
>>>>>>>>>> kind regards
>>>>>>>>>> Stefan Schmitz
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 14 July 2020 11:06:42 GMT+03:00,
>>>>>>>>>>> "stefan.schmitz at farmpartner-tec.com"
>>>>>>>>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 09.07.2020 at 19:10, Strahil Nikolov wrote:
>>>>>>>>>>>>> Have you run 'fence_virtd -c' ?
>>>>>>>>>>>> Yes, I had run that on both hosts. The current config looks
>>>>>>>>>>>> like this and is identical on both:
>>>>>>>>>>>>
>>>>>>>>>>>> cat fence_virt.conf
>>>>>>>>>>>> fence_virtd {
>>>>>>>>>>>>     listener = "multicast";
>>>>>>>>>>>>     backend = "libvirt";
>>>>>>>>>>>>     module_path = "/usr/lib64/fence-virt";
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> listeners {
>>>>>>>>>>>>     multicast {
>>>>>>>>>>>>         key_file = "/etc/cluster/fence_xvm.key";
>>>>>>>>>>>>         address = "225.0.0.12";
>>>>>>>>>>>>         interface = "bond0";
>>>>>>>>>>>>         family = "ipv4";
>>>>>>>>>>>>         port = "1229";
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> backends {
>>>>>>>>>>>>     libvirt {
>>>>>>>>>>>>         uri = "qemu:///system";
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The situation is still that no matter on which host I issue
>>>>>>>>>>>> the "fence_xvm -a 225.0.0.12 -o list" command, both guest
>>>>>>>>>>>> systems receive the traffic - the local guest, but also the
>>>>>>>>>>>> guest on the other host. I reckon that means the traffic is
>>>>>>>>>>>> not filtered by any network device, like switches or
>>>>>>>>>>>> firewalls. Since the guest on the other host receives the
>>>>>>>>>>>> packets, the traffic must reach the physical server and
>>>>>>>>>>>> network device and is then routed to the VM on that host.
>>>>>>>>>>>> But still, the traffic is not shown on the host itself.
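A simple way to verify that on the host is tcpdump on the interface from fence_virt.conf, using the multicast address and port configured there - a sketch:

# tcpdump -i bond0 -n udp port 1229 and host 225.0.0.12

run while "fence_xvm -a 225.0.0.12 -o list" is issued on the other host; repeating it with "-i br0" shows whether the packets only ever appear on the bridge.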
>>>>>>>>>>>>
>>>>>>>>>>>> Furthermore, the local firewalls on both hosts are set to
>>>>>>>>>>>> let any and all traffic pass - accept anything and
>>>>>>>>>>>> everything, at least as far as I can see.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 09.07.2020 at 22:34, Klaus Wenninger wrote:
>>>>>>>>>>>>> makes me believe that
>>>>>>>>>>>>> the whole setup doesn't look as I would have
>>>>>>>>>>>>> expected (bridges on each host where the guest
>>>>>>>>>>>>> has a connection to and where ethernet interfaces
>>>>>>>>>>>>> that connect the 2 hosts are part of as well
>>>>>>>>>>>> On each physical server the network cards are bonded to
>>>>>>>>>>>> achieve failure safety (bond0). The guests are connected over
>>>>>>>>>>>> a bridge (br0), but apparently our virtualization software
>>>>>>>>>>>> creates its own device named after the guest (kvm101.0).
>>>>>>>>>>>> There is no direct connection between the servers, but as I
>>>>>>>>>>>> said earlier, the multicast traffic does reach the VMs, so I
>>>>>>>>>>>> assume there is no problem with that.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 09.07.2020 at 20:18, Vladislav Bogdanov wrote:
>>>>>>>>>>>>> First, you need to ensure that your switch (or all switches
>>>>>>>>>>>>> in the path) have IGMP snooping enabled on host ports (and
>>>>>>>>>>>>> probably interconnects along the path between your hosts).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Second, you need an IGMP querier to be enabled somewhere
>>>>>>>>>>>>> near (better to have it enabled on a switch itself). Please
>>>>>>>>>>>>> verify that you see its queries on the hosts.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Next, you probably need to make your hosts use IGMPv2 (not
>>>>>>>>>>>>> v3), as many switches still cannot understand v3. This is
>>>>>>>>>>>>> doable by sysctl; search the internet, there are many
>>>>>>>>>>>>> articles.
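On the Linux side, the bridge's own snooping/querier state can be checked (and a querier enabled as a test) via sysfs - a sketch, assuming the host bridge is br0:

# cat /sys/class/net/br0/bridge/multicast_snooping
# cat /sys/class/net/br0/bridge/multicast_querier
# echo 1 > /sys/class/net/br0/bridge/multicast_querier

With snooping enabled but no querier anywhere on the segment, multicast group memberships eventually time out and the traffic stops being forwarded.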
>>>>>>>>>>>>
>>>>>>>>>>>> I have sent a query to our data center techs, who were
>>>>>>>>>>>> already analyzing whether multicast traffic is blocked or
>>>>>>>>>>>> hindered somewhere. So far the answer is, "multicast is
>>>>>>>>>>>> explicitly allowed in the local network and no packets are
>>>>>>>>>>>> filtered or dropped". I am still waiting for a final report
>>>>>>>>>>>> though.
>>>>>>>>>>>>
>>>>>>>>>>>> In the meantime I have switched from IGMPv3 to IGMPv2 on
>>>>>>>>>>>> every involved server, hosts and guests, via the mentioned
>>>>>>>>>>>> sysctl. The switch itself was successful according to
>>>>>>>>>>>> "cat /proc/net/igmp", but sadly it did not improve the
>>>>>>>>>>>> behavior. It actually led to no VM receiving the multicast
>>>>>>>>>>>> traffic anymore either.
>>>>>>>>>>>>
>>>>>>>>>>>> kind regards
>>>>>>>>>>>> Stefan Schmitz
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 09.07.2020 at 22:34, Klaus Wenninger wrote:
>>>>>>>>>>>>> On 7/9/20 5:17 PM, stefan.schmitz at farmpartner-tec.com wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well, theory still holds I would say.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess that the multicast-traffic from the other host
>>>>>>>>>>>>>>> or the guests doesn't get to the daemon on the host.
>>>>>>>>>>>>>>> Can't you just simply check if there are any firewall
>>>>>>>>>>>>>>> rules configured on the host kernel?
>>>>>>>>>>>>>> I hope I understood you correctly and you are referring to
>>>>>>>>>>>>>> iptables?
>>>>>>>>>>>>> I didn't say iptables because it might have been
>>>>>>>>>>>>> nftables - but yes, that is what I was referring to.
>>>>>>>>>>>>> Guess to understand the config the output is
>>>>>>>>>>>>> lacking verbosity, but it makes me believe that
>>>>>>>>>>>>> the whole setup doesn't look as I would have
>>>>>>>>>>>>> expected (bridges on each host where the guest
>>>>>>>>>>>>> has a connection to and where ethernet interfaces
>>>>>>>>>>>>> that connect the 2 hosts are part of as well -
>>>>>>>>>>>>> everything connected via layer 2 basically).
>>>>>>>>>>>>>> Here is the output of the current rules. Apart from the IP
>>>>>>>>>>>>>> of the guest, the output is identical on both hosts:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # iptables -S
>>>>>>>>>>>>>> -P INPUT ACCEPT
>>>>>>>>>>>>>> -P FORWARD ACCEPT
>>>>>>>>>>>>>> -P OUTPUT ACCEPT
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # iptables -L
>>>>>>>>>>>>>> Chain INPUT (policy ACCEPT)
>>>>>>>>>>>>>> target               prot opt source        destination
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chain FORWARD (policy ACCEPT)
>>>>>>>>>>>>>> target               prot opt source        destination
>>>>>>>>>>>>>> SOLUSVM_TRAFFIC_IN   all  --  anywhere      anywhere
>>>>>>>>>>>>>> SOLUSVM_TRAFFIC_OUT  all  --  anywhere      anywhere
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chain OUTPUT (policy ACCEPT)
>>>>>>>>>>>>>> target               prot opt source        destination
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_IN (1 references)
>>>>>>>>>>>>>> target               prot opt source        destination
>>>>>>>>>>>>>>                      all  --  anywhere      192.168.1.14
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_OUT (1 references)
>>>>>>>>>>>>>> target               prot opt source        destination
>>>>>>>>>>>>>>                      all  --  192.168.1.14  anywhere
>>>>>>>>>>>>>>
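Beyond the filter rules, it may also be worth checking whether the host has actually joined the fence_xvm multicast group on the expected interface - a quick check, using the address from fence_virt.conf:

# ip maddr show dev bond0

If fence_virtd is listening there, 225.0.0.12 should appear in that list.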
>>>>>>>>>>>>>> kind regards
>>>>>>>>>>>>>> Stefan Schmitz
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>
>>>>
>>>
>_______________________________________________
>Manage your subscription:
>https://lists.clusterlabs.org/mailman/listinfo/users
>
>ClusterLabs home: https://www.clusterlabs.org/