[ClusterLabs] Still Beginner STONITH Problem

stefan.schmitz at farmpartner-tec.com
Wed Jul 15 10:48:34 EDT 2020


On 15.07.2020 at 16:29 Klaus Wenninger wrote:
> On 7/15/20 4:21 PM, stefan.schmitz at farmpartner-tec.com wrote:
>> Hello,
>>
>>
>> On 15.07.2020 at 15:30 Klaus Wenninger wrote:
>>> On 7/15/20 3:15 PM, Strahil Nikolov wrote:
>>>> If it is created by libvirt - this is NAT and you will never
>>>> receive output from the other host.
>>> And twice the same subnet behind NAT is probably giving
>>> issues at other places as well.
>>> And if using DHCP you have to at least enforce that both sides
>>> don't go for the same IP.
>>> But none of that explains why it doesn't work on the same host.
>>> Which is why I was asking for running the service on the
>>> bridge to check if that would work at least. So that we
>>> can go forward step by step.
>>
>> I just now finished trying and testing it on both hosts.
>> I ran # fence_virtd -c on both hosts and entered different network
>> devices. On both I tried br0 and the kvm10x.0.
> According to your libvirt-config I would have expected
> the bridge to be virbr0.

I understand that, but a "virbr0" device does not seem to exist on 
either of the two hosts.

# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode 
DEFAULT group default qlen 1000
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq 
master bond0 state UP mode DEFAULT group default qlen 1000
     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
3: enp216s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode 
DEFAULT group default qlen 1000
     link/ether ac:1f:6b:26:69:dc brd ff:ff:ff:ff:ff:ff
4: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq 
master bond0 state UP mode DEFAULT group default qlen 1000
     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
5: enp216s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode 
DEFAULT group default qlen 1000
     link/ether ac:1f:6b:26:69:dd brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc 
noqueue master br0 state UP mode DEFAULT group default qlen 1000
     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state 
UP mode DEFAULT group default qlen 1000
     link/ether 0c:c4:7a:fb:30:1a brd ff:ff:ff:ff:ff:ff
8: kvm101.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
master br0 state UNKNOWN mode DEFAULT group default qlen 1000
     link/ether fe:16:3c:ba:10:6c brd ff:ff:ff:ff:ff:ff
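
If I understand the libvirt docs correctly, virbr0 only appears once the
"default" network from default.xml has actually been started, so maybe
checking the network status is the quicker test here. Just a sketch,
assuming a stock libvirt install:

# virsh net-list --all
# virsh net-start default
# virsh net-autostart default

net-list --all should show whether the "default" network is defined and
active; net-start/net-autostart would only be needed if it is defined
but not running.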



>>
>> After each reconfiguration I ran # fence_xvm -a 225.0.0.12 -o list
>> On the second server it worked with every device. After that I
>> reconfigured back to the normal device, bond0, on which it had stopped
>> working before - and now it works again!
>> #  fence_xvm -a 225.0.0.12 -o list
>> kvm102                           bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>>
>> On the first server, however, it did not work with any device.
>> #  fence_xvm -a 225.0.0.12 -o list always resulted in
>> Timed out waiting for response
>> Operation failed
>>
>>
>>
>> On 15.07.2020 at 15:15 Strahil Nikolov wrote:
>>> If it is created by libvirt - this is NAT and you will never receive
>>> output from the other host.
>>>
>> To my knowledge this is configured by libvirt. At least I am not aware
>> of having changed or configured it in any way. Up until today I did not
>> even know that file existed. Could you please advise on what I need to
>> do to fix this issue?
>>
>> Kind regards
>>
>>
>>
>>
>>> Is pacemaker/corosync/knet btw. using the same interfaces/IPs?
>>>
>>> Klaus
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>>>
>>>> On 15 July 2020 at 15:05:48 GMT+03:00,
>>>> "stefan.schmitz at farmpartner-tec.com"
>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>>>> Hello,
>>>>>
>>>>> On 15.07.2020 at 13:42 Strahil Nikolov wrote:
>>>>>> By default libvirt is using NAT and not a routed network - in such a
>>>>>> case, vm1 won't receive data from host2.
>>>>>> Can you provide the network's XML?
>>>>>>
>>>>>> Best Regards,
>>>>>> Strahil Nikolov
>>>>>>
>>>>> # cat default.xml
>>>>> <network>
>>>>>     <name>default</name>
>>>>>     <bridge name="virbr0"/>
>>>>>     <forward/>
>>>>>     <ip address="192.168.122.1" netmask="255.255.255.0">
>>>>>       <dhcp>
>>>>>         <range start="192.168.122.2" end="192.168.122.254"/>
>>>>>       </dhcp>
>>>>>     </ip>
>>>>> </network>
>>>>>
>>>>> I just checked this and the file is identical on both hosts.
>>>>>
>>>>> kind regards
>>>>> Stefan Schmitz
>>>>>
>>>>>
>>>>>> On 15 July 2020 at 13:19:59 GMT+03:00, Klaus Wenninger
>>>>>> <kwenning at redhat.com> wrote:
>>>>>>> On 7/15/20 11:42 AM, stefan.schmitz at farmpartner-tec.com wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>>
>>>>>>>> On 15.07.2020 at 06:32 Strahil Nikolov wrote:
>>>>>>>>> How did you configure the network on your Ubuntu 20.04 hosts? I
>>>>>>>>> tried to set up a bridged connection for the test setup, but
>>>>>>>>> obviously I'm missing something.
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> Strahil Nikolov
>>>>>>>>>
>>>>>>>> On the hosts (CentOS) the bridge config looks like this. The bridging
>>>>>>>> and configuration is handled by the virtualization software:
>>>>>>>>
>>>>>>>> # cat ifcfg-br0
>>>>>>>> DEVICE=br0
>>>>>>>> TYPE=Bridge
>>>>>>>> BOOTPROTO=static
>>>>>>>> ONBOOT=yes
>>>>>>>> IPADDR=192.168.1.21
>>>>>>>> NETMASK=255.255.0.0
>>>>>>>> GATEWAY=192.168.1.1
>>>>>>>> NM_CONTROLLED=no
>>>>>>>> IPV6_AUTOCONF=yes
>>>>>>>> IPV6_DEFROUTE=yes
>>>>>>>> IPV6_PEERDNS=yes
>>>>>>>> IPV6_PEERROUTES=yes
>>>>>>>> IPV6_FAILURE_FATAL=no
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 15.07.2020 at 09:50 Klaus Wenninger wrote:
>>>>>>>>> Guess it is not easy to have your servers connected physically for a
>>>>>>>>> try.
>>>>>>>>> But maybe you can at least try on one host to have virt_fenced & VM
>>>>>>>>> on the same bridge - just to see if that basic pattern is working.
>>>>>>>> I am not sure if I understand you correctly. What do you mean by
>>>>>>>> having them on the same bridge? The bridge device is configured on
>>>>>>>> the host by the virtualization software.
>>>>>>> I meant to check out which bridge the interface of the VM is enslaved
>>>>>>> to and to use that bridge as interface in /etc/fence_virt.conf.
>>>>>>> Get me right - just for now - just to see if it is working for this
>>>>>>> one host and the corresponding guest.
>>>>>>>>
>>>>>>>>> Well, maybe there is still somebody in the middle playing IGMPv3, or
>>>>>>>>> the request for a certain source is needed to shoot open some firewall
>>>>>>>>> or switch-tables.
>>>>>>>> I am still waiting for the final report from our Data Center techs. I
>>>>>>>> hope that will clear up some things.
>>>>>>>>
>>>>>>>>
>>>>>>>> Additionally I have just noticed that, apparently since switching from
>>>>>>>> IGMPv3 to IGMPv2 and back, the command "fence_xvm -a 225.0.0.12 -o
>>>>>>>> list" is now completely broken.
>>>>>>>> Before that switch this command at least returned the local VM. Now it
>>>>>>>> returns:
>>>>>>>> Timed out waiting for response
>>>>>>>> Operation failed
>>>>>>>>
>>>>>>>> I am a bit confused by that, because all we did was run commands
>>>>>>>> like "sysctl -w net.ipv4.conf.all.force_igmp_version=" with the
>>>>>>>> different version numbers, and # cat /proc/net/igmp shows that V3 is
>>>>>>>> used again on every device, just like before...?!
>>>>>>>>
>>>>>>>> kind regards
>>>>>>>> Stefan Schmitz
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 14 July 2020 at 11:06:42 GMT+03:00,
>>>>>>>>> "stefan.schmitz at farmpartner-tec.com"
>>>>>>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 09.07.2020 at 19:10 Strahil Nikolov wrote:
>>>>>>>>>>> Have you run 'fence_virtd -c'?
>>>>>>>>>> Yes, I had run that on both hosts. The current config looks like this
>>>>>>>>>> and is identical on both.
>>>>>>>>>>
>>>>>>>>>> cat fence_virt.conf
>>>>>>>>>> fence_virtd {
>>>>>>>>>>             listener = "multicast";
>>>>>>>>>>             backend = "libvirt";
>>>>>>>>>>             module_path = "/usr/lib64/fence-virt";
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> listeners {
>>>>>>>>>>             multicast {
>>>>>>>>>>                     key_file = "/etc/cluster/fence_xvm.key";
>>>>>>>>>>                     address = "225.0.0.12";
>>>>>>>>>>                     interface = "bond0";
>>>>>>>>>>                     family = "ipv4";
>>>>>>>>>>                     port = "1229";
>>>>>>>>>>             }
>>>>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> backends {
>>>>>>>>>>             libvirt {
>>>>>>>>>>                     uri = "qemu:///system";
>>>>>>>>>>             }
>>>>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The situation is still that no matter on which host I issue the
>>>>>>>>>> "fence_xvm -a 225.0.0.12 -o list" command, both guest systems receive
>>>>>>>>>> the traffic - the local guest, but also the guest on the other host.
>>>>>>>>>> I reckon that means the traffic is not filtered by any network device,
>>>>>>>>>> like switches or firewalls. Since the guest on the other host receives
>>>>>>>>>> the packets, the traffic must reach the physical server and its
>>>>>>>>>> network device and is then routed to the VM on that host.
>>>>>>>>>> But still, the traffic is not shown on the host itself.
>>>>>>>>>>
>>>>>>>>>> Furthermore, the local firewalls on both hosts are set to let all
>>>>>>>>>> traffic pass - accept anything and everything. Well, at least as far
>>>>>>>>>> as I can see.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 09.07.2020 at 22:34 Klaus Wenninger wrote:
>>>>>>>>>>> makes me believe that
>>>>>>>>>>> the whole setup doesn't look as I would have
>>>>>>>>>>> expected (bridges on each host where the guest
>>>>>>>>>>> has a connection to and where ethernet interfaces
>>>>>>>>>>> that connect the 2 hosts are part of as well
>>>>>>>>>> On each physical server the network cards are bonded to achieve
>>>>>>>>>> failure safety (bond0). The guests are connected over a bridge (br0),
>>>>>>>>>> but apparently our virtualization software creates its own device
>>>>>>>>>> named after the guest (kvm101.0).
>>>>>>>>>> There is no direct connection between the servers, but as I said
>>>>>>>>>> earlier, the multicast traffic does reach the VMs, so I assume there
>>>>>>>>>> is no problem with that.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 09.07.2020 at 20:18 Vladislav Bogdanov wrote:
>>>>>>>>>>> First, you need to ensure that your switch (or all switches in the
>>>>>>>>>>> path) have igmp snooping enabled on host ports (and probably on
>>>>>>>>>>> interconnects along the path between your hosts).
>>>>>>>>>>>
>>>>>>>>>>> Second, you need an igmp querier to be enabled somewhere near (better
>>>>>>>>>>> to have it enabled on a switch itself). Please verify that you see
>>>>>>>>>>> its queries on the hosts.
>>>>>>>>>>>
>>>>>>>>>>> Next, you probably need to make your hosts use IGMPv2 (not 3), as
>>>>>>>>>>> many switches still can not understand v3. This is doable by sysctl;
>>>>>>>>>>> search the internet, there are many articles.
>>>>>>>>>>
>>>>>>>>>> I have sent a query to our data center techs, who are already
>>>>>>>>>> analyzing whether multicast traffic is somewhere blocked or hindered.
>>>>>>>>>> So far the answer is, "multicast is explicitly allowed in the local
>>>>>>>>>> network and no packets are filtered or dropped". I am still waiting
>>>>>>>>>> for a final report though.
>>>>>>>>>>
>>>>>>>>>> In the meantime I have switched from IGMPv3 to IGMPv2 on every
>>>>>>>>>> involved server, hosts and guests, via the mentioned sysctl. The
>>>>>>>>>> switching itself was successful according to "cat /proc/net/igmp",
>>>>>>>>>> but sadly it did not improve the behavior. It actually led to no VM
>>>>>>>>>> receiving the multicast traffic anymore at all.
>>>>>>>>>>
>>>>>>>>>> kind regards
>>>>>>>>>> Stefan Schmitz
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 09.07.2020 at 22:34 Klaus Wenninger wrote:
>>>>>>>>>>> On 7/9/20 5:17 PM, stefan.schmitz at farmpartner-tec.com wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>>> Well, theory still holds I would say.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I guess that the multicast-traffic from the other host
>>>>>>>>>>>>> or the guests doesn't get to the daemon on the host.
>>>>>>>>>>>>> Can't you just simply check if there are any firewall
>>>>>>>>>>>>> rules configured on the host kernel?
>>>>>>>>>>>> I hope I did understand you correctly and you are referring to
>>>>>>>>>>>> iptables?
>>>>>>>>>>> I didn't say iptables because it might have been
>>>>>>>>>>> nftables - but yes, that is what I was referring to.
>>>>>>>>>>> Guess to understand the config the output is
>>>>>>>>>>> lacking verbosity, but it makes me believe that
>>>>>>>>>>> the whole setup doesn't look as I would have
>>>>>>>>>>> expected (bridges on each host where the guest
>>>>>>>>>>> has a connection to and where ethernet interfaces
>>>>>>>>>>> that connect the 2 hosts are part of as well -
>>>>>>>>>>> everything connected via layer 2 basically).
>>>>>>>>>>>> Here is the output of the current rules. Besides the IP of the
>>>>>>>>>>>> guest, the output is identical on both hosts:
>>>>>>>>>>>>
>>>>>>>>>>>> # iptables -S
>>>>>>>>>>>> -P INPUT ACCEPT
>>>>>>>>>>>> -P FORWARD ACCEPT
>>>>>>>>>>>> -P OUTPUT ACCEPT
>>>>>>>>>>>>
>>>>>>>>>>>> # iptables -L
>>>>>>>>>>>> Chain INPUT (policy ACCEPT)
>>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>>
>>>>>>>>>>>> Chain FORWARD (policy ACCEPT)
>>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>> SOLUSVM_TRAFFIC_IN  all  --  anywhere             anywhere
>>>>>>>>>>>> SOLUSVM_TRAFFIC_OUT  all  --  anywhere             anywhere
>>>>>>>>>>>>
>>>>>>>>>>>> Chain OUTPUT (policy ACCEPT)
>>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>>
>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_IN (1 references)
>>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>>                 all  --  anywhere             192.168.1.14
>>>>>>>>>>>>
>>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_OUT (1 references)
>>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>>                 all  --  192.168.1.14         anywhere
>>>>>>>>>>>>
>>>>>>>>>>>> kind regards
>>>>>>>>>>>> Stefan Schmitz
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>
>>
> 
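
PS: For the next test I plan to simply watch for the multicast requests
on the first host while running the list command from the second one.
Just a rough sketch, with the interface and port taken from our
fence_virt.conf:

# tcpdump -i bond0 -n udp port 1229
# cat /proc/net/igmp

If the requests to 225.0.0.12:1229 never show up on bond0, the problem
should be somewhere in front of the host; if they do show up but
fence_virtd does not answer, it would point to something local.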

