<div dir="ltr">Thank you, Ken, for such a detailed response. I truly appreciate it. Cheers.<div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Dec 1, 2015 at 9:04 PM, Ken Gaillot <span dir="ltr">&lt;<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 12/01/2015 05:31 AM, Nikhil Utane wrote:<br>

> Hi,<br>

><br>

> I am evaluating whether it is feasible to use Pacemaker + Corosync to add<br>

> support for clustering/redundancy into our product.<br>

<br>

</span>Most definitely<br>

<span class=""><br>

> Our objectives:<br>

> 1) Support N+1 redundancy, i.e. N active and (up to) 1 standby.<br>

<br>

</span>You can do this with location constraints and scores. See:<br>

<a href="http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_deciding_which_nodes_a_resource_can_run_on" rel="noreferrer" target="_blank">http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_deciding_which_nodes_a_resource_can_run_on</a><br>

<br>

Basically, you give the standby node a lower score than the other nodes.<br>
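<br>
As a minimal sketch of that scoring idea with pcs, assuming two hypothetical active resources (app1, app2), their home nodes (node1, node2), and a spare node named standby. The scores 100 and 50 are arbitrary; only their relative order matters. The script only prints the commands unless PCS=pcs is set in the environment:<br>

```shell
# Dry-run by default: prints each pcs command instead of running it.
# Set PCS=pcs in the environment to apply the commands to a real cluster.
PCS="${PCS:-echo pcs}"

# Hypothetical names: app1/app2 are the active resources (or groups),
# node1/node2 are their home nodes, and "standby" is the spare node.
n_plus_one_location_constraints() {
    # Each resource prefers its home node (100) over the standby (50).
    $PCS constraint location app1 prefers node1=100
    $PCS constraint location app1 prefers standby=50
    $PCS constraint location app2 prefers node2=100
    $PCS constraint location app2 prefers standby=50
}

n_plus_one_location_constraints
```

In a symmetric cluster the remaining nodes keep the default score of 0, so they stay eligible if both the home node and the standby are down. If that is not wanted, a "pcs constraint location app1 avoids node2" style constraint can exclude them.<br>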

<span class=""><br>

> 2) Each node has some different configuration parameters.<br>

> 3) Whenever any active node goes down, the standby node comes up with the<br>

> same configuration that the active had.<br>

<br>

</span>How you solve this requirement depends on the specifics of your<br>

situation. Ideally, you can use OCF resource agents that take the<br>

configuration location as a parameter. You may have to write your own,<br>

if none is available for your services.<br>

<span class=""><br>

> 4) There is no one single process/service for which we need redundancy,<br>

> rather it is the entire system (multiple processes running together).<br>

<br>

</span>This is trivially implemented using either groups or ordering and<br>

colocation constraints.<br>

<br>

Order constraint = start service A before starting service B (and stop<br>

in reverse order)<br>

<br>

Colocation constraint = keep services A and B on the same node<br>

<br>

Group = shortcut to specify several services that need to start/stop in<br>

order and be kept together<br>

<br>

<a href="http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231363875392" rel="noreferrer" target="_blank">http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231363875392</a><br>

<br>

<a href="http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#group-resources" rel="noreferrer" target="_blank">http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#group-resources</a><br>
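<br>
To make the two options concrete, here is a rough pcs sketch. It assumes two hypothetical resources, svcA and svcB, that already exist, and a hypothetical group name app1. Pick one option, not both. The script only prints the commands unless PCS=pcs is set in the environment:<br>

```shell
# Dry-run by default: prints each pcs command instead of running it.
# Set PCS=pcs in the environment to apply the commands to a real cluster.
PCS="${PCS:-echo pcs}"

# Option 1: a group starts its members in the listed order, stops them in
# reverse order, and keeps them on the same node.
keep_together_as_group() {
    $PCS resource group add app1 svcA svcB
}

# Option 2: the same relationships written as individual constraints.
keep_together_with_constraints() {
    # Start svcA before svcB (stop happens in reverse order).
    $PCS constraint order svcA then svcB
    # Place svcB on whichever node runs svcA.
    $PCS constraint colocation add svcB with svcA INFINITY
}

keep_together_as_group
```

The group form is shorter. The explicit constraints are useful when the ordering and the placement need to be controlled separately.<br>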

<span class=""><br>

<br>

> 5) I would also want to be notified when any active&lt;-&gt;standby state<br>

> transition happens as I would want to take some steps at the application<br>

> level.<br>

<br>

</span>There are multiple approaches.<br>

<br>

If you don't mind compiling your own packages, the latest master branch<br>

(which will be part of the upcoming 1.1.14 release) has built-in<br>

notification capability. See:<br>

<a href="http://blog.clusterlabs.org/blog/2015/reliable-notifications/" rel="noreferrer" target="_blank">http://blog.clusterlabs.org/blog/2015/reliable-notifications/</a><br>

<br>

Otherwise, you can use SNMP or e-mail if your packages were compiled<br>

with those options, or you can use the ocf:pacemaker:ClusterMon resource<br>

agent:<br>

<a href="http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231308442928" rel="noreferrer" target="_blank">http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231308442928</a><br>

<span class=""><br>

> I went through the documents/blogs, but all of them had examples for the 1 active and 1<br>

> standby use case, and only for a standard service like httpd.<br>

<br>

</span>Pacemaker is incredibly versatile, and the use cases are far too varied<br>

to cover more than a small subset. Those simple examples show the basic<br>

building blocks, and can usually point you to the specific features you<br>

need to investigate further.<br>

<span class=""><br>

> One additional question: if I have multiple actives, does that mean a virtual IP<br>

> configuration cannot be used? Is it possible for the N actives to have<br>

> different IP addresses, with the standby taking over the IP<br>

> address of the failed node whenever it becomes active?<br>

<br>

</span>Yes, there are a few approaches here, too.<br>

<br>

The simplest is to assign a virtual IP to each active, and include it in<br>

your group of resources. The whole group will fail over to the standby<br>

node if the original goes down.<br>
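<br>
A possible pcs sketch of the per-active virtual IP, assuming existing groups with the hypothetical names app1 and app2 and placeholder addresses from the 192.0.2.0/24 documentation range. The netmask and monitor interval are example values. The script only prints the commands unless PCS=pcs is set in the environment:<br>

```shell
# Dry-run by default: prints each pcs command instead of running it.
# Set PCS=pcs in the environment to apply the commands to a real cluster.
PCS="${PCS:-echo pcs}"

# Create an IPaddr2 resource and add it to an existing group.
# Arguments: group name, name for the new IP resource, IP address.
add_vip_to_group() {
    group="$1"; vip_name="$2"; address="$3"
    $PCS resource create "$vip_name" ocf:heartbeat:IPaddr2 ip="$address" cidr_netmask=24 op monitor interval=30s
    # Appended to the end of the group, so the address comes up after
    # the other group members.
    $PCS resource group add "$group" "$vip_name"
}

# Hypothetical groups and placeholder addresses.
add_vip_to_group app1 vip1 192.0.2.11
add_vip_to_group app2 vip2 192.0.2.12
```

Because each address is a member of its group, it moves to the standby together with the rest of that group. If the services must bind to the address at startup, the IP resource would need to come earlier in the group instead.<br>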

<br>

If you want a single virtual IP that is used by all your actives, one<br>

alternative is to clone the ocf:heartbeat:IPaddr2 resource. When cloned,<br>

that resource agent will use iptables' CLUSTERIP functionality, which<br>

relies on multicast Ethernet addresses (not to be confused with<br>

multicast IP). Since multicast Ethernet has limitations, this is not<br>

often used in production.<br>

<br>

A more complicated method is to use a virtual IP in combination with a<br>

load-balancer such as haproxy. Pacemaker can manage haproxy and the real<br>

services, and haproxy manages distributing requests to the real services.<br>

<br>

> Thanks in advance.<br>

> Nikhil<br>

<br>

A last word of advice: Fencing (aka STONITH) is important for proper<br>

recovery from difficult failure conditions. Without it, it is possible<br>

to have data loss or corruption in a split-brain situation.<br>

<br>


</blockquote></div><br></div>