[ClusterLabs] Help required for N+1 redundancy setup

Thu Dec 3 06:23:29 EST 2015

Ken,

One more question: if I have to propagate configuration changes between the
nodes, is CPG (closed process group) the right way?
For example:
Active Node1 has config A=1, B=2
Active Node2 has config A=3, B=4
Standby Node needs to have configuration for all the nodes such that
whichever goes down, it comes up with those values.
Here configuration is not static but can be updated at run-time.
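
To make the requirement concrete, here is a rough sketch of the state the standby would have to keep. It only models the data, not the transport: it does not use CPG or any Corosync API, and the class and method names (StandbyConfigStore, apply_update, config_for) are hypothetical, made up for illustration.

```python
from typing import Dict


class StandbyConfigStore:
    """Keeps the latest known configuration of every active node."""

    def __init__(self) -> None:
        self._configs: Dict[str, Dict[str, int]] = {}

    def apply_update(self, node: str, key: str, value: int) -> None:
        # Called whenever an active node reports a run-time change.
        # How the update reaches the standby is out of scope here.
        self._configs.setdefault(node, {})[key] = value

    def config_for(self, failed_node: str) -> Dict[str, int]:
        # Return a copy so the caller cannot mutate the stored state.
        return dict(self._configs.get(failed_node, {}))


store = StandbyConfigStore()
for node, key, value in [("node1", "A", 1), ("node1", "B", 2),
                         ("node2", "A", 3), ("node2", "B", 4)]:
    store.apply_update(node, key, value)

store.apply_update("node2", "B", 5)  # a run-time change on node2
```

If node2 then fails, config_for("node2") gives the standby the values it needs to come up with, including the run-time change.
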
BTW, I'm a little confused about the difference between OpenAIS and Corosync.
For my purpose I should be able to use either, right?
Thanks.

On Tue, Dec 1, 2015 at 9:04 PM, Ken Gaillot <kgaillot at redhat.com> wrote:

> On 12/01/2015 05:31 AM, Nikhil Utane wrote:
> > Hi,
> >
> > I am evaluating whether it is feasible to use Pacemaker + Corosync to add
> > support for clustering/redundancy into our product.
>
> Most definitely
>
> > Our objectives:
> > 1) Support N+1 redundancy, i.e. N Active and (up to) 1 Standby.
>
> You can do this with location constraints and scores. See:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_deciding_which_nodes_a_resource_can_run_on
>
> Basically, you give the standby node a lower score than the other nodes.
>
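
A toy model of the score idea, to check my understanding. This is not Pacemaker's actual placement algorithm, and the resource names, node names and score values are all assumptions chosen for illustration. Each resource prefers its own node (higher score) and falls back to the standby (lower score) only when the preferred node is offline.

```python
from typing import Dict, Optional, Set

# Hypothetical location scores: resource -> {node: score}.
SCORES: Dict[str, Dict[str, int]] = {
    "app1": {"node1": 100, "standby": 50},
    "app2": {"node2": 100, "standby": 50},
}


def choose_node(resource: str, online: Set[str]) -> Optional[str]:
    """Pick the online node with the highest score for this resource."""
    candidates = {node: score
                  for node, score in SCORES[resource].items()
                  if node in online}
    if not candidates:
        return None  # no allowed node is online
    return max(candidates, key=lambda node: candidates[node])


all_up = {"node1", "node2", "standby"}
node1_down = {"node2", "standby"}
```

With every node online, choose_node("app1", all_up) picks node1; with node1 offline it picks the standby, while app2 stays on node2. The model says nothing about what happens if two actives fail at once.
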
> > 2) Each node has some different configuration parameters.
> > 3) Whenever any active node goes down, the standby node comes up with the
> > same configuration that the active had.
>
> How you solve this requirement depends on the specifics of your
> situation. Ideally, you can use OCF resource agents that take the
> configuration location as a parameter. You may have to write your own,
> if none is available for your services.
>
> > 4) There is no one single process/service for which we need redundancy,
> > rather it is the entire system (multiple processes running together).
>
> This is trivially implemented using either groups or ordering and
> colocation constraints.
>
> Order constraint = start service A before starting service B (and stop
> in reverse order)
>
> Colocation constraint = keep services A and B on the same node
>
> Group = shortcut to specify several services that need to start/stop in
> order and be kept together
>
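
As I read it, a group behaves roughly like the sketch below: members start in the listed order, stop in reverse order, and all run on the same node. This only models the semantics described above; the member names are hypothetical and no Pacemaker interface is involved.

```python
from typing import List, Tuple

Action = Tuple[str, str, str]  # (operation, service, node)


def start_plan(group: List[str], node: str) -> List[Action]:
    # Ordering: start members in the order they are listed.
    # Colocation: every member is placed on the same node.
    return [("start", member, node) for member in group]


def stop_plan(group: List[str], node: str) -> List[Action]:
    # Stopping happens in the reverse of the start order.
    return [("stop", member, node) for member in reversed(group)]


# Hypothetical group: a virtual IP, then two application processes.
group = ["vip", "service_a", "service_b"]
```

For a failover from node1 to the standby, the sequence would be stop_plan(group, "node1") followed by start_plan(group, "standby").
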
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231363875392
>
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#group-resources
>
>
> > 5) I would also want to be notified when any active<->standby state
> > transition happens as I would want to take some steps at the application
> > level.
>
> There are multiple approaches.
>
> If you don't mind compiling your own packages, the latest master branch
> (which will be part of the upcoming 1.1.14 release) has built-in
> notification capability. See:
> http://blog.clusterlabs.org/blog/2015/reliable-notifications/
>
> Otherwise, you can use SNMP or e-mail if your packages were compiled
> with those options, or you can use the ocf:pacemaker:ClusterMon resource
> agent:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231308442928
>
> > I went through the documents/blogs, but all of them had examples only for
> > the 1 active and 1 standby use case, and only for a standard service like
> > httpd.
>
> Pacemaker is incredibly versatile, and the use cases are far too varied
> to cover more than a small subset. Those simple examples show the basic
> building blocks, and can usually point you to the specific features you
> need to investigate further.
>
> > One additional question: if I have multiple actives, does that mean a
> > virtual IP configuration cannot be used? Is it possible for the N actives
> > to have different IP addresses, with the standby taking over the IP
> > address of the failed node whenever it becomes active?
>
> Yes, there are a few approaches here, too.
>
> The simplest is to assign a virtual IP to each active, and include it in
> your group of resources. The whole group will fail over to the standby
> node if the original goes down.
>
> If you want a single virtual IP that is used by all your actives, one
> alternative is to clone the ocf:heartbeat:IPaddr2 resource. When cloned,
> that resource agent will use iptables' CLUSTERIP functionality, which
> relies on multicast Ethernet addresses (not to be confused with
> multicast IP). Since multicast Ethernet has limitations, this is not
> often used in production.
>
> A more complicated method is to use a virtual IP in combination with a
> load-balancer such as haproxy. Pacemaker can manage haproxy and the real
> services, and haproxy manages distributing requests to the real services.
>
> > Thanks in advance.
> > Nikhil
>
> A last word of advice: Fencing (aka STONITH) is important for proper
> recovery from difficult failure conditions. Without it, it is possible
> to have data loss or corruption in a split-brain situation.