[ClusterLabs] Still Beginner STONITH Problem
stefan.schmitz at farmpartner-tec.com
Mon Jul 6 04:10:20 EDT 2020
Hello,
>> # fence_xvm -o list
>> kvm102 bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>This should show both VMs, so getting to that point will likely solve
>your problem. fence_xvm relies on multicast, there could be some
>obscure network configuration to get that working on the VMs.
Thank you for pointing me in that direction. We have tried to solve that
but with no success. We were using the howto provided here:
https://wiki.clusterlabs.org/wiki/Guest_Fencing
The problem is that it specifically states that the tutorial does not yet
support the case where guests are running on multiple hosts. There are
some short hints about what might be necessary, but working through
those sadly did not work, nor were there any clues that would help
us find a solution ourselves. So now we are completely stuck here.
Does anyone have the same configuration with guest VMs on multiple hosts? And
how did you manage to get that to work? What do we need to do to resolve
this? Is there maybe even someone who would be willing to take a closer
look at our server? Any help would be greatly appreciated!
Kind regards
Stefan Schmitz
On 03.07.2020 at 02:39, Ken Gaillot wrote:
> On Thu, 2020-07-02 at 17:18 +0200, stefan.schmitz at farmpartner-tec.com
> wrote:
>> Hello,
>>
>> I hope someone can help with this problem. We are (still) trying to get
>> STONITH working to achieve a running active/active HA cluster, but sadly
>> to no avail.
>>
>> There are 2 CentOS hosts. On each one there is an Ubuntu VM. The
>> Ubuntu VMs are the ones which should form the HA cluster.
>>
>> The current status is this:
>>
>> # pcs status
>> Cluster name: pacemaker_cluster
>> WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
>> Stack: corosync
>> Current DC: server2ubuntu1 (version 1.1.18-2b07d5c5a9) - partition with quorum
>> Last updated: Thu Jul 2 17:03:53 2020
>> Last change: Thu Jul 2 14:33:14 2020 by root via cibadmin on server4ubuntu1
>>
>> 2 nodes configured
>> 13 resources configured
>>
>> Online: [ server2ubuntu1 server4ubuntu1 ]
>>
>> Full list of resources:
>>
>> stonith_id_1 (stonith:external/libvirt): Stopped
>> Master/Slave Set: r0_pacemaker_Clone [r0_pacemaker]
>> Masters: [ server4ubuntu1 ]
>> Slaves: [ server2ubuntu1 ]
>> Master/Slave Set: WebDataClone [WebData]
>> Masters: [ server2ubuntu1 server4ubuntu1 ]
>> Clone Set: dlm-clone [dlm]
>> Started: [ server2ubuntu1 server4ubuntu1 ]
>> Clone Set: ClusterIP-clone [ClusterIP] (unique)
>> ClusterIP:0 (ocf::heartbeat:IPaddr2): Started server2ubuntu1
>> ClusterIP:1 (ocf::heartbeat:IPaddr2): Started server4ubuntu1
>> Clone Set: WebFS-clone [WebFS]
>> Started: [ server4ubuntu1 ]
>> Stopped: [ server2ubuntu1 ]
>> Clone Set: WebSite-clone [WebSite]
>> Started: [ server4ubuntu1 ]
>> Stopped: [ server2ubuntu1 ]
>>
>> Failed Actions:
>> * stonith_id_1_start_0 on server2ubuntu1 'unknown error' (1): call=201,
>>   status=Error, exitreason='',
>>   last-rc-change='Thu Jul 2 14:37:35 2020', queued=0ms, exec=3403ms
>> * r0_pacemaker_monitor_60000 on server2ubuntu1 'master' (8): call=203,
>>   status=complete, exitreason='',
>>   last-rc-change='Thu Jul 2 14:38:39 2020', queued=0ms, exec=0ms
>> * stonith_id_1_start_0 on server4ubuntu1 'unknown error' (1): call=202,
>>   status=Error, exitreason='',
>>   last-rc-change='Thu Jul 2 14:37:39 2020', queued=0ms, exec=3411ms
>>
>>
>> The stonith resource is stopped and does not seem to work.
>> On both hosts the command
>> # fence_xvm -o list
>> kvm102 bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>
> This should show both VMs, so getting to that point will likely solve
> your problem. fence_xvm relies on multicast, there could be some
> obscure network configuration to get that working on the VMs.
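A rough way to check the multicast side on each hypervisor is to see whether the bridge used by the VMs has joined the fence_virtd multicast group. The sketch below is only an illustration under assumptions: 225.0.0.12 is assumed to be the group address in use (it should be verified against fence_virt.conf on the hosts), br0 is a placeholder interface name, and has_mcast_group is a hypothetical helper, not part of fence-virt.

```shell
#!/bin/sh
# Assumed multicast group of fence_virtd; verify it in fence_virt.conf on the hosts.
MCAST_ADDR="${MCAST_ADDR:-225.0.0.12}"

# Hypothetical helper: reads `ip maddr show` output on stdin and
# succeeds only if the given IPv4 group appears as an "inet" entry.
has_mcast_group() {
    awk -v g="$1" '$1 == "inet" && $2 == g { found = 1 } END { exit !found }'
}

# Usage on a hypervisor (br0 is a placeholder for the bridge the VMs use):
#   ip maddr show dev br0 | has_mcast_group "$MCAST_ADDR" && echo "group joined"
# Usage inside a guest, naming the group explicitly:
#   fence_xvm -a "$MCAST_ADDR" -o list
```

If a host has not joined the group on the interface that faces the guests, requests from a guest on the other host are unlikely to reach its fence_virtd, which would match the single-VM list shown above.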
>
>> returns the local VM. Apparently it connects through the virtualization
>> interface, because it returns the VM name, not the hostname of the
>> client VM. I do not know if this is how it is supposed to work?
>
> Yes, fence_xvm knows only about the VM names.
>
> To get pacemaker to be able to use it for fencing the cluster nodes,
> you have to add a pcmk_host_map parameter to the fencing resource. It
> looks like pcmk_host_map="nodename1:vmname1;nodename2:vmname2;..."
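As a minimal sketch of the pcmk_host_map format described here, the snippet below joins node:vm pairs with semicolons and prints a pcs command for review rather than running it. Several names are assumptions: kvm102 is taken from the fence_xvm output in this thread, but which cluster node it belongs to is a guess; kvm104 is a made-up name for the second VM; fence_vms is a hypothetical resource id; build_host_map is a helper defined only for this example.

```shell
#!/bin/sh
# Hypothetical helper: join "node:vm" pairs with ';' into a pcmk_host_map value.
build_host_map() {
    map=""
    for pair in "$@"; do
        map="${map:+$map;}$pair"
    done
    printf '%s\n' "$map"
}

# ASSUMPTION: the node-to-VM pairing below is illustrative only.
# kvm102 appears in this thread; kvm104 is an invented placeholder.
HOST_MAP=$(build_host_map "server2ubuntu1:kvm102" "server4ubuntu1:kvm104")

# Print the command instead of executing it, so it can be checked first.
echo "pcs stonith create fence_vms fence_xvm pcmk_host_map=\"$HOST_MAP\""
```

The left side of each pair is the node name as Pacemaker knows it, and the right side is the VM name as reported by fence_xvm -o list.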
>
>> In the local network, all traffic is allowed. No firewall is locally
>> active; only the connections leaving the local network are firewalled.
>> Hence there are no connection problems between the hosts and clients.
>> For example, we can successfully connect from the clients to the hosts:
>>
>> # nc -z -v -u 192.168.1.21 1229
>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>> Ncat: Connected to 192.168.1.21:1229.
>> Ncat: UDP packet sent successfully
>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>
>> # nc -z -v -u 192.168.1.13 1229
>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>> Ncat: Connected to 192.168.1.13:1229.
>> Ncat: UDP packet sent successfully
>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
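Note that UDP is connectionless, so "Connected" and "UDP packet sent successfully" in the ncat output only mean that a datagram was sent; both runs report 0 bytes received, and they use unicast rather than multicast. A more direct check is to capture traffic on the hypervisors while a guest runs fence_xvm -o list. The sketch below only prints a tcpdump command for that purpose; br0 is a placeholder interface name and fence_capture_cmd is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that watches one interface
# for UDP traffic on the fencing port (defaults to 1229, as used above).
fence_capture_cmd() {
    printf 'tcpdump -n -i %s udp port %s\n' "$1" "${2:-1229}"
}

# br0 is a placeholder. Run the printed command as root on each hypervisor,
# then run `fence_xvm -o list` in a guest and note which hosts see a packet.
fence_capture_cmd br0
```

If only the local hypervisor sees the request, the multicast packet is probably not being forwarded between the two hosts.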
>>
>>
>> On the Ubuntu VMs we created and configured the stonith resource
>> according to the howto provided here:
>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/pdf/Clusters_from_Scratch/Pacemaker-1.1-Clusters_from_Scratch-en-US.pdf
>>
>> The actual line we used:
>> # pcs -f stonith_cfg stonith create stonith_id_1 external/libvirt \
>>     hostlist="Host4,host2" \
>>     hypervisor_uri="qemu+ssh://192.168.1.21/system"
>>
>>
>> But as you can see in the pcs status output, stonith is stopped and
>> exits with an unknown error.
>>
>> Can somebody please advise on how to proceed or what additional
>> information is needed to solve this problem?
>> Any help would be greatly appreciated! Thank you in advance.
>>
>> Kind regards
>> Stefan Schmitz