[ClusterLabs] deployment of Pacemaker in GCE

Fri Mar 8 02:59:58 EST 2024

On 07/03/24 02:28 +0000, Ali Shahbazifakhr via Users wrote:
>Hello,
>
>I am reaching out to inquire about the usage of Pacemaker on Google Compute Engine (GCE), specifically in conjunction with Managed Instance Groups (MIG). Our team is currently exploring options for implementing high availability and failover solutions within our infrastructure on GCE, and we believe that Pacemaker may be a viable option for achieving this.
>
>Could you kindly provide some insights into how Pacemaker is utilized within the GCE environment, particularly in scenarios involving Managed Instance Groups? We are interested in understanding the design considerations and best practices for implementing Pacemaker with MIG instances.
>
>Additionally, if there are any documentation resources available that explain the design and implementation of Pacemaker with MIG instances on GCE, we would greatly appreciate it if you could point us in the right direction.

I don't have any experience with MIG, but from a quick look it seems
it can replace instances and/or autoscale. I would suggest not letting
it replace the nodes (Pacemaker already takes care of badly behaving
nodes). If you use the autoscale functionality, you will have to run
"pcs host auth <hostname>" and "pcs cluster node add <hostname>" to add
a node, and "pcs cluster node remove <hostname>" to remove one.
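The add/remove steps above could be wrapped roughly as follows. This is only a sketch: the function names, the DRY_RUN switch and the node name are made up for illustration, and how MIG would trigger such a script on scale-out/scale-in is not covered here. Note that "pcs host auth" asks for the hacluster credentials unless they are supplied another way.

```shell
#!/bin/sh
# Sketch: join or remove a cluster node when the instance group scales.
# Commands are only printed by default; set DRY_RUN=0 to execute them.
# run, join_node and leave_node are hypothetical helper names.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Authenticate the new host against pcsd, then add it to the cluster.
# --start and --enable start the cluster services on the new node now
# and at boot.
join_node() {
    node="$1"
    run pcs host auth "$node" &&
    run pcs cluster node add "$node" --start --enable
}

# Remove a node from the cluster before its instance is deleted.
leave_node() {
    node="$1"
    run pcs cluster node remove "$node"
}
```

For example, "join_node node3" prints the two commands it would run for a host called node3.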

You can use fence_gce to fence (reboot) badly behaving nodes:
https://github.com/ClusterLabs/fence-agents/blob/main/agents/gce/fence_gce.py
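A fence device per node might be created along these lines. This is a sketch under assumptions: the resource name, zone and project values are placeholders, and the parameter names (port for the instance name, zone, project) should be checked against the agent's metadata before use.

```shell
#!/bin/sh
# Sketch: create one fence_gce stonith device for a given node.
# Commands are only printed by default; set DRY_RUN=0 to execute them.
# run and create_fence_device are hypothetical helper names.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = node / GCE instance name, $2 = zone, $3 = GCP project ID
create_fence_device() {
    node="$1"; zone="$2"; project="$3"
    # port is the instance to fence; pcmk_host_list tells Pacemaker
    # which cluster node this device is responsible for.
    run pcs stonith create "fence-$node" fence_gce \
        port="$node" zone="$zone" project="$project" \
        pcmk_host_list="$node"
}
```

Calling "create_fence_device node1 us-central1-a my-project" prints the pcs command for a node named node1; all three values are illustrative.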

The gcp-* resource agents handle IPs, routes, disks, and load balancers:
https://github.com/ClusterLabs/resource-agents/tree/main/heartbeat
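As one example of the gcp-* agents, a floating alias IP could be set up roughly like this. It is a sketch, not a tested configuration: the resource name and address are placeholders, and the alias_ip parameter name is an assumption to verify in the metadata section of gcp-vpc-move-vip.

```shell
#!/bin/sh
# Sketch: create an alias-IP resource with the gcp-vpc-move-vip agent.
# Commands are only printed by default; set DRY_RUN=0 to execute them.
# run and create_vip are hypothetical helper names.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = resource name, $2 = alias IP in CIDR notation
create_vip() {
    name="$1"; alias_ip="$2"
    run pcs resource create "$name" ocf:heartbeat:gcp-vpc-move-vip \
        alias_ip="$alias_ip" op monitor interval=60s
}
```

"create_vip cluster-vip 10.0.0.100/32" prints the command for an example address; the other gcp-* agents take different parameters.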

There are metadata/desc sections in the code of the agents, so you can
find all the info without having to install the packages.

If you're new to Pacemaker this is a good introduction:
https://www.clusterlabs.org/pacemaker/doc/2.1/Clusters_from_Scratch/singlehtml/

For software that doesn't have a resource agent you can let Pacemaker
handle it via its systemd or init services/scripts, or make your own
agent if you need e.g. additional monitoring to check that the service
is still alive:
https://github.com/ClusterLabs/resource-agents/blob/main/doc/dev-guides/ra-dev-guide.asc
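To give an idea of what a custom agent looks like, here is a heavily simplified sketch for a hypothetical daemon called myapp. The paths, names and the pidfile-based check are assumptions for illustration. A real agent normally sources the ocf-shellfuncs helpers, must also implement the meta-data action, and should wait for the process to exit in stop; the guide linked above covers those parts.

```shell
#!/bin/sh
# Minimal OCF-style agent sketch for a hypothetical daemon "myapp".
# Exit codes follow the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# Hypothetical locations; adjust for the real service.
PIDFILE="${PIDFILE:-/run/myapp.pid}"
DAEMON="${DAEMON:-/usr/local/bin/myapp}"

# monitor: running only if the pidfile exists and the process is alive.
# This is where additional health checks would go.
myapp_monitor() {
    [ -f "$PIDFILE" ] || return $OCF_NOT_RUNNING
    kill -0 "$(cat "$PIDFILE")" 2>/dev/null || return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}

# start: succeed immediately if already running, otherwise launch.
myapp_start() {
    myapp_monitor && return $OCF_SUCCESS
    "$DAEMON" &
    echo $! > "$PIDFILE"
    myapp_monitor || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

# stop: an already stopped service counts as success.
myapp_stop() {
    myapp_monitor || return $OCF_SUCCESS
    kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
    return $OCF_SUCCESS
}

# Dispatch on the action name that Pacemaker passes as first argument.
agent_main() {
    case "$1" in
        start)   myapp_start ;;
        stop)    myapp_stop ;;
        monitor) myapp_monitor ;;
        *)       return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
# A real agent file would end with: agent_main "$@"
```

If the service already ships a systemd unit and needs no extra monitoring, no agent is required; something like "pcs resource create myapp systemd:myapp" (unit name hypothetical) is enough.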

Oyvind
>
>Looking forward to your response.
>
>
>Ali Shahbazi
>Specialist Enterprise Architecture | IoT Industrial, Solutions System Engineering |
