[ClusterLabs] [EXTERNE] Re: Centreon HA Cluster - VIP issue

Ken Gaillot kgaillot at redhat.com
Mon Sep 18 09:46:45 EDT 2023


On Fri, 2023-09-15 at 09:32 +0000, Adil BOUAZZAOUI wrote:
> Hi Ken,
> 
> Any update please?
> 
> The idea is clear; I just need to know more information about this
> 2-cluster setup:
> 
> 1. Arbitrator:
> 1.1. Only one arbitrator is needed for everything: should I use the
> quorum setup provided by Centreon in the official documentation? Or should
> I use the booth ticket manager instead?

I would use booth for distributed data centers. The Centreon setup is
appropriate for a cluster within a single data center or data centers
on the same campus with a low-latency link.
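
If you do go with booth, pcs can drive most of the setup. A rough
sketch, with placeholder addresses for the two sites' booth IPs and the
arbitrator (substitute your own):

# On one node of the first cluster:
pcs booth setup sites 172.30.9.245 172.30.10.245 arbitrators 203.0.113.10
pcs booth ticket add centreon-ticket
pcs booth sync
# Copy /etc/booth/booth.conf and /etc/booth/booth.key to a node in the
# other cluster and to the arbitrator, then in each cluster:
pcs booth create ip 172.30.9.245    # that site's own booth address
# On the arbitrator (booth installed, no Pacemaker needed):
pcs booth start
pcs booth enable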

> 1.2. Is fencing configured separately? Or is it configured during the
> booth ticket manager installation?

You'll have to configure fencing in each cluster separately.
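
For example (purely illustrative -- the fence agent, addresses, and
credentials here are made up; use whatever fence hardware each site
actually has):

# In the DC1 cluster, something along these lines:
pcs stonith create fence-dc1-node1 fence_ipmilan \
    ip=172.30.9.250 username=admin password=secret \
    pcmk_host_list=dc1-node1
# ...and the equivalent in the DC2 cluster with its own devices.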

> 
> 2. Floating IP:
> 2.1. It doesn't hurt if both Floating IPs are running at the same
> time, right?

Correct.

> 
> 3. Fail over:
> 3.1. How do we update the DNS to point to the appropriate IP?
> 3.2. We're running our own DNS servers, so how do we configure a booth
> ticket for just the DNS resource?

You can have more than one ticket. On the Pacemaker side, tickets are
tied to resources with rsc_ticket constraints (though you'll probably
be using a higher-level tool that abstracts that).
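
With pcs, for example, granting a ticket and tying a resource to it
looks roughly like this (the ticket and resource names are placeholders):

pcs constraint ticket add centreon-ticket centreon-vip loss-policy=stop
pcs booth ticket grant centreon-ticket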

How to update the DNS depends on which server you're using -- just
follow its documentation for making changes. You can use the
ocf:pacemaker:Dummy agent as a model and modify its start action to make
the DNS change (in addition to creating the dummy state file). The
monitor action can check whether the dummy state file is present and
whether DNS is returning the desired info. Stop would just remove the
dummy state file.
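
As a very rough sketch of those actions -- assuming a BIND-style server
updated via nsupdate, and that the agent sources the usual OCF shell
functions; the record name, key file, and addresses are made up:

# start: point the record at this site, then create the state file
nsupdate -k /etc/dns-failover.key <<EOF
update delete monitoring.example.com. A
update add monitoring.example.com. 30 A 172.30.9.240
send
EOF
touch "${HA_RSCTMP}/dns-site.state"

# monitor: state file present and DNS returning the expected address
[ -f "${HA_RSCTMP}/dns-site.state" ] &&
    dig +short monitoring.example.com | grep -q '^172\.30\.9\.240$'

# stop: just remove the state file
rm -f "${HA_RSCTMP}/dns-site.state"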

> 4. MariaDB replication:
> 4.1. How can Centreon MariaDB replicate between the 2 clusters?

Native MySQL replication should work fine for that.
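
For example, standard MariaDB GTID replication set up from a shell
(host, user, and password are placeholders; check Centreon's docs in
case they have a preferred procedure):

# On the DC2 database, after loading a dump taken from the DC1 primary:
mysql <<'EOF'
CHANGE MASTER TO
  MASTER_HOST='dc1-db.example.com',
  MASTER_USER='repl',
  MASTER_PASSWORD='secret',
  MASTER_USE_GTID=slave_pos;
START SLAVE;
EOF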

> 5. Centreon:
> 5.1. Will this setup (2 clusters, 2 floating IPs, 1 booth manager)
> work for our Centreon project? 

I don't have any experience with that, but it sounds fine.

> 
> 
> 
> Regards
> Adil Bouazzaoui
> 
> 
> Adil BOUAZZAOUI
> Infrastructure & Technologies Engineer
> GSM         : +212 703 165 758
> E-mail  : adil.bouazzaoui at tmandis.ma
> 
> 
> -----Original Message-----
> From: Adil BOUAZZAOUI
> Sent: Friday, September 8, 2023 5:15 PM
> To: Ken Gaillot <kgaillot at redhat.com>; Adil Bouazzaoui <adilb574 at gmail.com>
> Cc: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
> Subject: RE: [EXTERNE] Re: [ClusterLabs] Centreon HA Cluster - VIP issue
> 
> Hi Ken,
> 
> Thank you for the update and the clarification.
> The idea is clear; I just need to know more information about this
> 2-cluster setup:
> 
> 1. Arbitrator:
> 1.1. Only one arbitrator is needed for everything: should I use the
> quorum setup provided by Centreon in the official documentation? Or should
> I use the booth ticket manager instead?
> 1.2. Is fencing configured separately? Or is it configured during the
> booth ticket manager installation?
> 
> 2. Floating IP:
> 2.1. It doesn't hurt if both Floating IPs are running at the same
> time, right?
> 
> 3. Fail over:
> 3.1. How do we update the DNS to point to the appropriate IP?
> 3.2. We're running our own DNS servers, so how do we configure a booth
> ticket for just the DNS resource?
> 
> 4. MariaDB replication:
> 4.1. How can Centreon MariaDB replicate between the 2 clusters?
> 
> 5. Centreon:
> 5.1. Will this setup (2 clusters, 2 floating IPs, 1 booth manager)
> work for our Centreon project? 
> 
> 
> 
> Regards
> Adil Bouazzaoui
> 
> 
> Adil BOUAZZAOUI
> Infrastructure & Technologies Engineer
> GSM         : +212 703 165 758
> E-mail  : adil.bouazzaoui at tmandis.ma
> 
> 
> -----Original Message-----
> From: Ken Gaillot [mailto:kgaillot at redhat.com]
> Sent: Tuesday, September 5, 2023 10:00 PM
> To: Adil Bouazzaoui <adilb574 at gmail.com>
> Cc: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>; Adil BOUAZZAOUI <adil.bouazzaoui at tmandis.ma>
> Subject: [EXTERNE] Re: [ClusterLabs] Centreon HA Cluster - VIP issue
> 
> On Tue, 2023-09-05 at 21:13 +0100, Adil Bouazzaoui wrote:
> > Hi Ken,
> > 
> > thank you a big time for the feedback; much appreciated.
> > 
> > I suppose we go with a new Scenario 3: Setup 2 Clusters across 
> > different DCs connected by booth; so could you please clarify
> > below 
> > points to me so i can understand better and start working on the
> > architecture:
> > 
> > 1- in case of separate clusters connected by booth: should each 
> > cluster have a quorum device for the Master/slave elections?
> 
> Hi,
> 
> Only one arbitrator is needed for everything.
> 
> Since each cluster in this case has two nodes, Corosync will use the
> "two_node" configuration to determine quorum. When first starting the
> cluster, both nodes must come up before quorum is obtained. After
> that, only one node is required to keep quorum -- which means that
> fencing is essential to prevent split-brain.
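
For reference, pcs configures this automatically when it sets up a
two-node cluster; in each cluster's corosync.conf it looks like:

quorum {
    provider: corosync_votequorum
    two_node: 1
}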
> 
> > 2- separate floating IPs at each cluster: please check the
> > attached 
> > diagram and let me know if this is exactly what you mean?
> 
> Yes, that looks good.
> 
> > 3- To fail over, you update the DNS to point to the appropriate IP:
> > can you suggest any guide to work on so we can have the DNS
> > updated 
> > automatically?
> 
> Unfortunately I don't know of any. If your DNS provider offers an API
> of some kind, you can write a resource agent that uses it. If you're
> running your own DNS servers, the agent has to update the zone files
> appropriately and reload.
> 
> Depending on what your services are, it might be sufficient to use a
> booth ticket for just the DNS resource, and let everything else stay
> running all the time. For example it doesn't hurt anything for both
> sites' floating IPs to stay up.
> 
> > Regards
> > Adil Bouazzaoui
> > 
> > On Tue, Sep 5, 2023 at 16:48, Ken Gaillot <kgaillot at redhat.com>
> > wrote:
> > > Hi,
> > > 
> > > The scenario you describe is still a challenging one for HA.
> > > 
> > > A single cluster requires low latency and reliable communication.
> > > A 
> > > cluster within a single data center or spanning data centers on
> > > the 
> > > same campus can be reliable (and appears to be what Centreon has
> > > in 
> > > mind), but it sounds like you're looking for geographical 
> > > redundancy.
> > > 
> > > A single cluster isn't appropriate for that. Instead, separate 
> > > clusters connected by booth would be preferable. Each cluster
> > > would 
> > > have its own nodes and fencing. Booth tickets would control
> > > which 
> > > cluster could run resources.
> > > 
> > > Whatever design you use, it is pointless to put a quorum tie- 
> > > breaker at one of the data centers. If that data center becomes 
> > > unreachable, the other one can't recover resources. The tie-
> > > breaker 
> > > (qdevice for a single cluster or a booth arbitrator for multiple
> > > clusters) can be very lightweight, so it can run in a public
> > > cloud 
> > > for example, if a third site is not available.
> > > 
> > > The IP issue is separate. For that, you will need separate
> > > floating 
> > > IPs at each cluster, on that cluster's network. To fail over,
> > > you 
> > > update the DNS to point to the appropriate IP. That is a tricky 
> > > problem without a universal automated solution. Some people
> > > update 
> > > the DNS manually after being alerted of a failover. You could
> > > write 
> > > a custom resource agent to update the DNS automatically. Either
> > > way 
> > > you'll need low TTLs on the relevant records.
> > > 
> > > On Sun, 2023-09-03 at 11:59 +0000, Adil BOUAZZAOUI wrote:
> > > > Hello,
> > > >  
> > > > My name is Adil; I'm working for Tman company. We are testing the
> > > > Centreon HA cluster to monitor our infrastructure for 13 companies;
> > > > for now we are using the 100 IT license to test the platform, and
> > > > once everything is working fine we can purchase a license suitable
> > > > for our case.
> > > >  
> > > > We're stuck at scenario 2: setting up the Centreon HA cluster with
> > > > Master & Slave in different datacenters.
> > > > For scenario 1 -- the cluster with Master & Slave and the VIP address
> > > > on the same network (VLAN) -- it is working fine.
> > > >  
> > > > Scenario 1: Cluster on same network (same DC) ==> works fine
> > > > Master in DC 1 VLAN 1: 172.30.9.230/24
> > > > Slave in DC 1 VLAN 1: 172.30.9.231/24
> > > > VIP in DC 1 VLAN 1: 172.30.9.240/24
> > > > Quorum in DC 1 LAN: 192.168.253.230/24
> > > > Poller in DC 1 LAN: 192.168.253.231/24
> > > >  
> > > > Scenario 2: Cluster on different networks (2 separate DCs connected
> > > > with VPN) ==> still not working
> > > > Master in DC 1 VLAN 1: 172.30.9.230/24
> > > > Slave in DC 2 VLAN 2: 172.30.10.230/24
> > > > VIP: example 102.84.30.XXX. We used a public static IP from our
> > > > internet service provider; we thought that using an IP from a site
> > > > network won't work -- if the site goes down, the VIP won't be
> > > > reachable!
> > > > Quorum: 192.168.253.230/24
> > > > Poller: 192.168.253.231/24
> > > >  
> > > >  
> > > > Our goal is to have Master & Slave nodes on different sites, so when
> > > > Site A goes down, we keep monitoring with the slave.
> > > > The problem is that we don't know how to set up the VIP address, nor
> > > > what kind of VIP address will work, or how the VIP address can work
> > > > in this scenario, or whether there is anything else that can replace
> > > > the VIP address to make things work.
> > > > Also, can we use a backup poller, so if poller 1 on Site A goes
> > > > down, poller 2 on Site B can take the lead?
> > > >  
> > > > We looked everywhere (The Watch, YouTube, Reddit, GitHub...), and we
> > > > still couldn't find a workaround!
> > > >  
> > > > The guide we used to deploy the 2-node cluster:
> > > > https://docs.centreon.com/docs/installation/installation-of-centreon-ha/overview/
> > > >  
> > > > Attached are the 2-DC architecture example and most of the
> > > > required screenshots/config.
> > > >  
> > > >  
> > > > We appreciate your support.
> > > > Thank you in advance.
> > > >  
> > > >  
> > > >  
> > > > Regards
> > > > Adil Bouazzaoui
> > > >  
> > > > Adil BOUAZZAOUI
> > > > Infrastructure & Technologies Engineer
> > > > GSM         : +212 703 165 758
> > > > E-mail  : adil.bouazzaoui at tmandis.ma
> > > >  
> > > >  
> --
> Ken Gaillot <kgaillot at redhat.com>
> 
-- 
Ken Gaillot <kgaillot at redhat.com>


