[ClusterLabs] Still Beginner STONITH Problem

Klaus Wenninger kwenning at redhat.com
Wed Jul 15 10:29:24 EDT 2020


On 7/15/20 4:21 PM, stefan.schmitz at farmpartner-tec.com wrote:
> Hello,
>
>
> Am 15.07.2020 um 15:30 schrieb Klaus Wenninger:
>> On 7/15/20 3:15 PM, Strahil Nikolov wrote:
>>> If it is created by libvirt - this is NAT and you will never
>>> receive output from the other host.
>> And having the same subnet twice behind NAT is probably giving
>> issues at other places as well.
>> And if using DHCP you have to at least enforce that both sides
>> don't go for the same IP.
>> But none of that explains why it doesn't work on the same host.
>> Which is why I was asking you to run the service on the
>> bridge, to check if that would work at least, so that we
>> can go forward step by step.
>
> I just now finished trying and testing it on both hosts.
> I ran # fence_virtd -c on both hosts and entered different network
> devices. On both I tried br0 and the kvm10x.0.
According to your libvirt config I would have expected
the bridge to be virbr0.
>
> After each reconfiguration I ran # fence_xvm -a 225.0.0.12 -o list
> On the second server it worked with each device. After that I
> reconfigured it back to the normal device, bond0, on which it had not
> worked before, and now it worked again!
> #  fence_xvm -a 225.0.0.12 -o list
> kvm102                           bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>
> But not on the first server; there it did not work with any device.
> #  fence_xvm -a 225.0.0.12 -o list always resulted in
> Timed out waiting for response
> Operation failed
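>
> (One way to narrow that down on the failing host would be to watch for
> the incoming multicast requests while running the list command from
> either server - a quick sketch, assuming fence_virtd is configured on
> bond0 as before:
>
> # tcpdump -i bond0 -n host 225.0.0.12 and udp port 1229
>
> If nothing shows up there, the request never reaches the interface the
> daemon listens on.)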
>
>
>
> On 15.07.2020 at 15:15, Strahil Nikolov wrote:
> > If it is created by libvirt - this is NAT and you will never receive
> > output from the other host.
> >
> To my knowledge this is configured by libvirt. At least I am not aware
> of having changed or configured it in any way. Up until today I did not
> even know that file existed. Could you please advise on what I need to
> do to fix this issue?
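>
> (If moving the guests off the NATed "default" network onto the existing
> bridge turns out to be the fix, a minimal sketch of a bridged libvirt
> network reusing br0 might look like this - the name "host-bridge" is
> just an example:
>
> <network>
>   <name>host-bridge</name>
>   <forward mode="bridge"/>
>   <bridge name="br0"/>
> </network>
>
> It could then be defined and started with
> # virsh net-define host-bridge.xml
> # virsh net-start host-bridge
> # virsh net-autostart host-bridge
> and the guests' interfaces switched over to it instead of "default".)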
>
> Kind regards
>
>
>
>
>> Is pacemaker/corosync/knet btw. using the same interfaces/IPs?
>>
>> Klaus
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On 15 July 2020 at 15:05:48 GMT+03:00,
>>> "stefan.schmitz at farmpartner-tec.com"
>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>>> Hello,
>>>>
>>>> On 15.07.2020 at 13:42, Strahil Nikolov wrote:
>>>>> By default libvirt is using NAT and not a routed network - in such
>>>>> case, vm1 won't receive data from host2.
>>>>> Can you provide the network's XML?
>>>>>
>>>>> Best Regards,
>>>>> Strahil Nikolov
>>>>>
>>>> # cat default.xml
>>>> <network>
>>>>    <name>default</name>
>>>>    <bridge name="virbr0"/>
>>>>    <forward/>
>>>>    <ip address="192.168.122.1" netmask="255.255.255.0">
>>>>      <dhcp>
>>>>        <range start="192.168.122.2" end="192.168.122.254"/>
>>>>      </dhcp>
>>>>    </ip>
>>>> </network>
>>>>
>>>> I just checked this and the file is identical on both hosts.
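>>>>
>>>> (The file on disk may differ from what libvirt actually has active,
>>>> so a quick sanity check on both hosts could be:
>>>>
>>>> # virsh net-dumpxml default
>>>> # virsh net-info default
>>>>
>>>> The first prints the running definition, the second shows whether
>>>> the network is active and autostarted.)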
>>>>
>>>> kind regards
>>>> Stefan Schmitz
>>>>
>>>>
>>>>> On 15 July 2020 at 13:19:59 GMT+03:00, Klaus Wenninger
>>>>> <kwenning at redhat.com> wrote:
>>>>>> On 7/15/20 11:42 AM, stefan.schmitz at farmpartner-tec.com wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>>
>>>>>>> On 15.07.2020 at 06:32, Strahil Nikolov wrote:
>>>>>>>> How did you configure the network on your Ubuntu 20.04 hosts? I
>>>>>>>> tried to set up a bridged connection for the test setup, but
>>>>>>>> obviously I'm missing something.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Strahil Nikolov
>>>>>>>>
>>>>>>> on the hosts (CentOS) the bridge config looks like this. The
>>>>>>> bridging and configuration is handled by the virtualization
>>>>>>> software:
>>>>>>>
>>>>>>> # cat ifcfg-br0
>>>>>>> DEVICE=br0
>>>>>>> TYPE=Bridge
>>>>>>> BOOTPROTO=static
>>>>>>> ONBOOT=yes
>>>>>>> IPADDR=192.168.1.21
>>>>>>> NETMASK=255.255.0.0
>>>>>>> GATEWAY=192.168.1.1
>>>>>>> NM_CONTROLLED=no
>>>>>>> IPV6_AUTOCONF=yes
>>>>>>> IPV6_DEFROUTE=yes
>>>>>>> IPV6_PEERDNS=yes
>>>>>>> IPV6_PEERROUTES=yes
>>>>>>> IPV6_FAILURE_FATAL=no
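>>>>>>>
>>>>>>> If bond0 is meant to be a port of br0, the corresponding
>>>>>>> ifcfg-bond0 would typically just carry a BRIDGE=br0 line - a
>>>>>>> minimal sketch, with placeholder bonding options:
>>>>>>>
>>>>>>> DEVICE=bond0
>>>>>>> TYPE=Bond
>>>>>>> BONDING_MASTER=yes
>>>>>>> BONDING_OPTS="mode=active-backup miimon=100"
>>>>>>> ONBOOT=yes
>>>>>>> BRIDGE=br0
>>>>>>> NM_CONTROLLED=no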
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 15.07.2020 at 09:50, Klaus Wenninger wrote:
>>>>>>>> Guess it is not easy to have your servers connected physically
>>>>>>>> for a try.
>>>>>>>> But maybe you can at least try on one host to have virt_fenced &
>>>>>>>> the VM on the same bridge - just to see if that basic pattern is
>>>>>>>> working.
>>>>>>> I am not sure if I understand you correctly. What do you mean by
>>>>>>> having them on the same bridge? The bridge device is configured on
>>>>>>> the host by the virtualization software.
>>>>>> I meant to check out which bridge the interface of the VM is
>>>>>> enslaved to and to use that bridge as the interface in
>>>>>> /etc/fence_virt.conf.
>>>>>> To be clear - just for now - just to see if it is working for this
>>>>>> one host and the corresponding guest.
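>>>>>>
>>>>>> A quick way to see that, assuming the guest is the kvm102 from the
>>>>>> list output, would be something like:
>>>>>>
>>>>>> # virsh domiflist kvm102
>>>>>> # brctl show
>>>>>>
>>>>>> The first shows which bridge (or tap device) the guest's NIC is
>>>>>> attached to, the second which ports each bridge on the host has.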
>>>>>>>
>>>>>>>> Well, maybe there is still somebody in the middle playing IGMPv3,
>>>>>>>> or the request for a certain source is needed to shoot open some
>>>>>>>> firewall or switch tables.
>>>>>>> I am still waiting for the final report from our Data Center
>>>>>>> techs. I hope that will clear up some things.
>>>>>>>
>>>>>>>
>>>>>>> Additionally I have just noticed that apparently, since switching
>>>>>>> from IGMPv3 to IGMPv2 and back, the command
>>>>>>> "fence_xvm -a 225.0.0.12 -o list" is now completely broken.
>>>>>>> Before that switch this command at least returned the local VM.
>>>>>>> Now it returns:
>>>>>>> Timed out waiting for response
>>>>>>> Operation failed
>>>>>>>
>>>>>>> I am a bit confused by that, because all we did was run commands
>>>>>>> like "sysctl -w net.ipv4.conf.all.force_igmp_version =" with the
>>>>>>> different version numbers, and # cat /proc/net/igmp shows that V3
>>>>>>> is used again on every device just like before...?!
>>>>>>>
>>>>>>> kind regards
>>>>>>> Stefan Schmitz
>>>>>>>
>>>>>>>
>>>>>>>> On 14 July 2020 at 11:06:42 GMT+03:00,
>>>>>>>> "stefan.schmitz at farmpartner-tec.com"
>>>>>>>> <stefan.schmitz at farmpartner-tec.com> wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 09.07.2020 at 19:10, Strahil Nikolov wrote:
>>>>>>>>>> Have you run 'fence_virtd -c'?
>>>>>>>>> Yes, I had run that on both hosts. The current config looks like
>>>>>>>>> this and is identical on both.
>>>>>>>>>
>>>>>>>>> cat fence_virt.conf
>>>>>>>>> fence_virtd {
>>>>>>>>>            listener = "multicast";
>>>>>>>>>            backend = "libvirt";
>>>>>>>>>            module_path = "/usr/lib64/fence-virt";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> listeners {
>>>>>>>>>            multicast {
>>>>>>>>>                    key_file = "/etc/cluster/fence_xvm.key";
>>>>>>>>>                    address = "225.0.0.12";
>>>>>>>>>                    interface = "bond0";
>>>>>>>>>                    family = "ipv4";
>>>>>>>>>                    port = "1229";
>>>>>>>>>            }
>>>>>>>>>
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> backends {
>>>>>>>>>            libvirt {
>>>>>>>>>                    uri = "qemu:///system";
>>>>>>>>>            }
>>>>>>>>>
>>>>>>>>> }
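>>>>>>>>>
>>>>>>>>> (A check that might be worth doing with this config, assuming
>>>>>>>>> fence_virtd is running, is whether the host has actually joined
>>>>>>>>> the multicast group on bond0:
>>>>>>>>>
>>>>>>>>> # ip maddr show dev bond0 | grep 225.0.0.12
>>>>>>>>>
>>>>>>>>> If the group does not show up there, the daemon is not listening
>>>>>>>>> where the requests arrive.)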
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The situation is still that no matter on which host I issue the
>>>>>>>>> "fence_xvm -a 225.0.0.12 -o list" command, both guest systems
>>>>>>>>> receive the traffic - the local guest, but also the guest on the
>>>>>>>>> other host. I reckon that means the traffic is not filtered by
>>>>>>>>> any network device, like switches or firewalls. Since the guest
>>>>>>>>> on the other host receives the packets, the traffic must reach
>>>>>>>>> the physical server and network device and is then routed to the
>>>>>>>>> VM on that host.
>>>>>>>>> But still, the traffic is not shown on the host itself.
>>>>>>>>>
>>>>>>>>> Further, the local firewalls on both hosts are set to let all
>>>>>>>>> traffic pass - ACCEPT for anything and everything. Well, at
>>>>>>>>> least as far as I can see.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 09.07.2020 at 22:34, Klaus Wenninger wrote:
>>>>>>>>>> makes me believe that
>>>>>>>>>> the whole setup doesn't look as I would have
>>>>>>>>>> expected (bridges on each host where the guest
>>>>>>>>>> has a connection to and where ethernet interfaces
>>>>>>>>>> that connect the 2 hosts are part of as well
>>>>>>>>> On each physical server the network cards are bonded to achieve
>>>>>>>>> failure safety (bond0). The guests are connected over a bridge
>>>>>>>>> (br0), but apparently our virtualization software creates its
>>>>>>>>> own device named after the guest (kvm101.0).
>>>>>>>>> There is no direct connection between the servers, but as I said
>>>>>>>>> earlier, the multicast traffic does reach the VMs, so I assume
>>>>>>>>> there is no problem with that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 09.07.2020 at 20:18, Vladislav Bogdanov wrote:
>>>>>>>>>> First, you need to ensure that your switch (or all switches in
>>>>>>>>>> the path) have IGMP snooping enabled on host ports (and
>>>>>>>>>> probably on interconnects along the path between your hosts).
>>>>>>>>>>
>>>>>>>>>> Second, you need an IGMP querier to be enabled somewhere near
>>>>>>>>>> (better to have it enabled on a switch itself). Please verify
>>>>>>>>>> that you see its queries on the hosts.
>>>>>>>>>>
>>>>>>>>>> Next, you probably need to make your hosts use IGMPv2 (not 3),
>>>>>>>>>> as many switches still cannot understand v3. This is doable by
>>>>>>>>>> sysctl; you can find many articles about it on the internet.
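>>>>>>>>>>
>>>>>>>>>> (A minimal sketch of that, assuming bond0 is the interface
>>>>>>>>>> fence_virtd listens on - force IGMPv2, then check what the
>>>>>>>>>> kernel reports and what actually goes over the wire:
>>>>>>>>>>
>>>>>>>>>> # sysctl -w net.ipv4.conf.all.force_igmp_version=2
>>>>>>>>>> # sysctl -w net.ipv4.conf.bond0.force_igmp_version=2
>>>>>>>>>> # cat /proc/net/igmp
>>>>>>>>>> # tcpdump -i bond0 -n igmp
>>>>>>>>>>
>>>>>>>>>> The tcpdump should show membership reports from hosts and
>>>>>>>>>> guests and, if a querier is present, periodic queries.)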
>>>>>>>>>
>>>>>>>>> I have sent a query to our data center techs, who were already
>>>>>>>>> analyzing whether multicast traffic is somewhere blocked or
>>>>>>>>> hindered. So far the answer is, "multicast is explicitly allowed
>>>>>>>>> in the local network and no packets are filtered or dropped". I
>>>>>>>>> am still waiting for a final report though.
>>>>>>>>>
>>>>>>>>> In the meantime I have switched from IGMPv3 to IGMPv2 on every
>>>>>>>>> involved server, hosts and guests, via the mentioned sysctl. The
>>>>>>>>> switching itself was successful, according to "cat
>>>>>>>>> /proc/net/igmp", but sadly it did not improve the behavior. It
>>>>>>>>> actually led to no VM receiving the multicast traffic anymore
>>>>>>>>> either.
>>>>>>>>>
>>>>>>>>> kind regards
>>>>>>>>> Stefan Schmitz
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 09.07.2020 at 22:34, Klaus Wenninger wrote:
>>>>>>>>>> On 7/9/20 5:17 PM, stefan.schmitz at farmpartner-tec.com wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>>> Well, the theory still holds, I would say.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess that the multicast traffic from the other host
>>>>>>>>>>>> or the guests doesn't get to the daemon on the host.
>>>>>>>>>>>> Can't you just simply check if there are any firewall
>>>>>>>>>>>> rules configured on the host kernel?
>>>>>>>>>>> I hope I understood you correctly and you are referring to
>>>>>>>>>>> iptables?
>>>>>>>>>> I didn't say iptables because it might have been
>>>>>>>>>> nftables - but yes, that is what I was referring to.
>>>>>>>>>> Guess to understand the config the output is
>>>>>>>>>> lacking verbosity, but it makes me believe that
>>>>>>>>>> the whole setup doesn't look as I would have
>>>>>>>>>> expected (bridges on each host where the guest
>>>>>>>>>> has a connection to and where ethernet interfaces
>>>>>>>>>> that connect the 2 hosts are part of as well -
>>>>>>>>>> everything connected via layer 2 basically).
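>>>>>>>>>>
>>>>>>>>>> (To rule out nftables as well, assuming the nft tool is
>>>>>>>>>> installed, a quick check would be:
>>>>>>>>>>
>>>>>>>>>> # nft list ruleset
>>>>>>>>>>
>>>>>>>>>> An empty or accept-only ruleset means nothing is filtered
>>>>>>>>>> there either.)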
>>>>>>>>>>> Here is the output of the current rules. Besides the IP of the
>>>>>>>>>>> guest, the output is identical on both hosts:
>>>>>>>>>>>
>>>>>>>>>>> # iptables -S
>>>>>>>>>>> -P INPUT ACCEPT
>>>>>>>>>>> -P FORWARD ACCEPT
>>>>>>>>>>> -P OUTPUT ACCEPT
>>>>>>>>>>>
>>>>>>>>>>> # iptables -L
>>>>>>>>>>> Chain INPUT (policy ACCEPT)
>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>
>>>>>>>>>>> Chain FORWARD (policy ACCEPT)
>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>> SOLUSVM_TRAFFIC_IN  all  --  anywhere             anywhere
>>>>>>>>>>> SOLUSVM_TRAFFIC_OUT  all  --  anywhere             anywhere
>>>>>>>>>>>
>>>>>>>>>>> Chain OUTPUT (policy ACCEPT)
>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>
>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_IN (1 references)
>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>                all  --  anywhere             192.168.1.14
>>>>>>>>>>>
>>>>>>>>>>> Chain SOLUSVM_TRAFFIC_OUT (1 references)
>>>>>>>>>>> target     prot opt source               destination
>>>>>>>>>>>                all  --  192.168.1.14         anywhere
>>>>>>>>>>>
>>>>>>>>>>> kind regards
>>>>>>>>>>> Stefan Schmitz
>>>>>>>>>>>
>>>>>>>>>>>
>>> _______________________________________________
>>> Manage your subscription:
>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>
>>> ClusterLabs home: https://www.clusterlabs.org/
>>
>


