[ClusterLabs] Cannot ping a secondary address apart from the server which it is assigned to (on Azure)

Kyle O'Donnell kyleo at 0b10.mx
Thu Oct 28 09:36:50 EDT 2021


Could you use VXLAN to create an overlay network, put a cluster-managed floating IP on that overlay, and make it a dependency of the service that manages the floating IP in Azure? I haven't fully thought this through and it might be a tad hacky, but it feels like it should work.
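
To make the overlay part a bit more concrete, here is a rough sketch of a unicast VXLAN interface built with iproute2. The VNI, the interface name vxlan0, the node addresses and the overlay range are all placeholders I made up for illustration; only the 172.16.31.0/24 subnet comes from the original post. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
VNI=42                       # VXLAN network identifier (assumed)
UNDERLAY_DEV=eth0            # NIC that carries the Azure primary address
LOCAL_IP=172.16.31.4         # this node's primary address (assumed)
PEERS="172.16.31.5"          # primary addresses of the other nodes (assumed)
OVERLAY_ADDR=10.99.0.1/24    # this node's address on the overlay (assumed)
RUN=${RUN-echo}              # prints the commands; run with RUN= as root to apply

vxlan_overlay() {
    # Unicast VXLAN: no multicast group, peers are listed explicitly.
    $RUN ip link add vxlan0 type vxlan id "$VNI" dev "$UNDERLAY_DEV" \
        local "$LOCAL_IP" dstport 4789
    # An all-zero FDB entry per peer tells the kernel where to flood
    # broadcast and unknown-unicast frames (ARP included).
    for peer in $PEERS; do
        $RUN bridge fdb append 00:00:00:00:00:00 dev vxlan0 dst "$peer"
    done
    $RUN ip addr add "$OVERLAY_ADDR" dev vxlan0
    $RUN ip link set vxlan0 up
}

vxlan_overlay
```

Each node would need its own LOCAL_IP, PEERS and OVERLAY_ADDR. Because the encapsulated traffic travels between primary addresses, ARP for an overlay address should no longer depend on what the Azure fabric knows about secondary addresses.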

I wrote something a while back that uses VXLAN to create a layer 2 network in GCP and AWS: https://github.com/TassatGroup/manage_vxlan
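
On the cluster side, the dependency I have in mind could look roughly like the following. This is an untested sketch: the resource names, the overlay address, the vxlan0 interface and the azure-move-ip systemd unit are hypothetical stand-ins, and the last of these represents whatever mechanism you end up using to reassign the address in Azure. By default the script only prints the pcs commands.

```shell
#!/bin/sh
# Sketch only: resource names, the overlay address, vxlan0 and the
# azure-move-ip unit are hypothetical, not taken from the thread.
RUN=${RUN-echo}              # prints the commands; run with RUN= as root to apply

overlay_dependency() {
    # Floating IP that lives on the overlay interface instead of eth0.
    $RUN pcs resource create overlay-ip ocf:heartbeat:IPaddr2 \
        ip=10.99.0.100 cidr_netmask=24 nic=vxlan0 op monitor interval=30s
    # Placeholder for whatever reassigns the floating IP on the Azure side.
    $RUN pcs resource create azure-ip-mover systemd:azure-move-ip
    # Keep both on the same node and start the overlay IP first.
    $RUN pcs constraint colocation add azure-ip-mover with overlay-ip
    $RUN pcs constraint order start overlay-ip then start azure-ip-mover
}

overlay_dependency
```

The colocation and ordering constraints are what make the overlay IP a dependency: the Azure-side service only starts on the node that already holds the overlay address.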

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, October 28th, 2021 at 08:43, Paul Warwicker <paul.warwicker at gmail.com> wrote:

> Hello,
>
> I originally posted this in the Azure forums but have had no replies, so I am trying here in case anyone has encountered it.
>
> I am trying to set up a High Availability Cluster in Azure using CentOS 8, Pacemaker and Corosync. Everything is deployed using Terraform.
>
> For our application, we need to migrate a floating IP address, shared storage and our daemon between nodes. These resources are grouped into a service, which migrates between nodes as required. We are also using a private DNS zone and there is no firewall on either server. There is a DNS entry for the floating IP, and it is resolvable by both servers and the client.
>
> The problem is that the floating IP address is only pingable from the server which has it assigned as a secondary address. All other nodes in the same subnet get the error Destination Host Unreachable, although pings to the primary address succeed. All the IP addresses are in the same subnet (172.16.31.0/24). Auto-registration is enabled for the servers and the client which make up the test environment. The floating address was a somewhat arbitrary choice, but it is in that same subnet and would not otherwise be allocated. I mention auto-registration because the floating IP is not auto-registered.
>
> If I migrate the service to the other server node, the roles are reversed: the server which could not ping the address can now do so, and the server which could, cannot.
>
> Any insight would be welcome.
>
> Additional detail considering the audience:
>
> pcs host auth -u hacluster -p ******** haswmfs-lin-vm-000 haswmfs-lin-vm-001
> pcs cluster setup haswmfs haswmfs-lin-vm-000 haswmfs-lin-vm-001
> pcs cluster enable --all
> pcs cluster start --all
> sleep 30
> pcs property set stonith-enabled=false
> pcs resource create haswmfs-fs ocf:heartbeat:Filesystem device=/dev/sdc directory=/mnt/smallworld fstype=xfs
> pcs resource create haswmfs-daemon lsb:smallworld_GIS
> pcs resource create haswmfs-ip ocf:heartbeat:IPaddr2 ip=${var.virtual_ip} cidr_netmask=24 nic=eth0 iflabel=haswmfs op monitor interval=30s
> pcs resource group add haswmfs-service haswmfs-ip haswmfs-daemon haswmfs-fs
> fence_azure_arm -l ${var.app_id} -p ${data.external.service_principal.result.password} --resourceGroup ${var.resource_group_name} --tenantId ${data.azurerm_client_config.current.tenant_id} --subscriptionId ${data.azurerm_client_config.current.subscription_id} -o list
> pcs stonith create haswmfs-fence fence_azure_arm login=${var.app_id} passwd=${data.external.service_principal.result.password} resourceGroup=${var.resource_group_name} tenantId=${data.azurerm_client_config.current.tenant_id} subscriptionId=${data.azurerm_client_config.current.subscription_id} pcmk_host_list="haswmfs-lin-vm-000 haswmfs-lin-vm-001" power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4 pcmk_reboot_action=reboot # op monitor interval=60s
> pcs property set stonith-enabled=true
> pcs property config --all | egrep "stonith|quorum"
> sleep 60
> pcs cluster verify --full # should be blank if ok
> crm_verify -LV # should be blank if ok
> pcs cluster config
> pcs resource config haswmfs-service
> pcs stonith config
> pcs status
>
> As mentioned in the original post, everything is failing over as expected and the only issue is the pinging of the virtual IP.
>
> Thanks
>
> -paul