[ClusterLabs] Help required for N+1 redundancy setup

Ken Gaillot kgaillot at redhat.com
Tue Dec 22 15:59:59 UTC 2015


On 12/22/2015 12:17 AM, Nikhil Utane wrote:
> I have prepared a write-up explaining my requirements and current solution
> that I am proposing based on my understanding so far.
> Kindly let me know if what I am proposing is good or there is a better way
> to achieve the same.
> 
> https://drive.google.com/file/d/0B0zPvL-Tp-JSTEJpcUFTanhsNzQ/view?usp=sharing
> 
> Let me know if you face any issue in accessing the above link. Thanks.

This looks great. Very well thought-out.

One comment:

"8. In the event of any failover, the standby node will get notified
through an event and it will execute a script that will read the
configuration specific to the node that went down (again using
crm_attribute) and become active."

It may not be necessary to use the notifications for this. Pacemaker
will call your resource agent with the "start" action on the standby
node, after ensuring it is stopped on the previous node. Hopefully the
resource agent's start action has (or can have, with configuration
options) all the information you need.
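
For illustration only (the resource parameter and attribute names here
are invented for the example), a start action could pull its settings
straight from the CIB with crm_attribute, keyed by a resource parameter
that identifies which "personality" the node should take on:

    # inside the agent's start action; $OCF_RESKEY_instance is a
    # hypothetical resource parameter naming the configuration set
    a=$(crm_attribute --type crm_config \
        --name "cfg_${OCF_RESKEY_instance}_A" --query --quiet)
    b=$(crm_attribute --type crm_config \
        --name "cfg_${OCF_RESKEY_instance}_B" --query --quiet)
    # hand the values to the service in whatever form it expects
    printf 'A=%s\nB=%s\n' "$a" "$b" > /etc/myservice/myservice.conf

Because the parameter travels with the resource, whichever node starts
the resource picks up the right settings automatically.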

If you do end up needing notifications, be aware that the feature will
be disabled by default in the 1.1.14 release, because changes in syntax
are expected in further development. You can define a compile-time
constant to enable them.

> On Thu, Dec 3, 2015 at 11:34 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
> 
>> On 12/03/2015 05:23 AM, Nikhil Utane wrote:
>>> Ken,
>>>
>>> One more question: if I have to propagate configuration changes between
>>> the nodes, is cpg (closed process group) the right way?
>>> For example:
>>> Active Node1 has config A=1, B=2
>>> Active Node2 has config A=3, B=4
>>> Standby Node needs to have configuration for all the nodes such that
>>> whichever goes down, it comes up with those values.
>>> Here configuration is not static but can be updated at run-time.
>>
>> Being unfamiliar with the specifics of your case, I can't say what the
>> best approach is, but it sounds like you will need to write a custom OCF
>> resource agent to manage your service.
>>
>> A resource agent is similar to an init script:
>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#ap-ocf
>>
>> The RA will start the service with the appropriate configuration. It can
>> use per-resource options configured in pacemaker or external information
>> to do that.
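>>
>> As a very rough sketch (meta-data output, validation, and error handling
>> omitted; "myservice" is just a placeholder for your own commands), an
>> agent has roughly this shape:
>>
>>     #!/bin/sh
>>     # Minimal OCF-style resource agent skeleton
>>     : ${OCF_ROOT=/usr/lib/ocf}
>>     . ${OCF_ROOT}/lib/heartbeat/ocf-shellfuncs
>>
>>     case "$1" in
>>         start)
>>             # read per-node/per-instance configuration here, then launch
>>             myservice --daemon && exit $OCF_SUCCESS
>>             exit $OCF_ERR_GENERIC ;;
>>         stop)
>>             myservice --shutdown
>>             exit $OCF_SUCCESS ;;
>>         monitor)
>>             myservice --status && exit $OCF_SUCCESS
>>             exit $OCF_NOT_RUNNING ;;
>>         meta-data)
>>             # print the agent's XML metadata on stdout here
>>             exit $OCF_SUCCESS ;;
>>         *)
>>             exit $OCF_ERR_UNIMPLEMENTED ;;
>>     esac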
>>
>> How does your service get its configuration currently?
>>
>>> BTW, I'm a little confused between OpenAIS and Corosync. For my purpose I
>>> should be able to use either, right?
>>
>> Corosync started out as a subset of OpenAIS, optimized for use with
>> Pacemaker. Corosync 2 is now the preferred membership layer for
>> Pacemaker for most uses, though other layers are still supported.
>>
>>> Thanks.
>>>
>>> On Tue, Dec 1, 2015 at 9:04 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>>
>>>> On 12/01/2015 05:31 AM, Nikhil Utane wrote:
>>>>> Hi,
>>>>>
>>>>> I am evaluating whether it is feasible to use Pacemaker + Corosync to
>>>>> add support for clustering/redundancy into our product.
>>>>
>>>> Most definitely
>>>>
>>>>> Our objectives:
>>>>> 1) Support N+1 redundancy, i.e. N Active and (up to) 1 Standby.
>>>>
>>>> You can do this with location constraints and scores. See:
>>>>
>>>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_deciding_which_nodes_a_resource_can_run_on
>>>>
>>>> Basically, you give the standby node a lower score than the other nodes.
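>>>>
>>>> For example, with pcs (node and resource names are placeholders):
>>>>
>>>>     # prefer the regular node; allow the standby at a lower score
>>>>     pcs constraint location myservice-group prefers node1=100
>>>>     pcs constraint location myservice-group prefers standby1=10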
>>>>
>>>>> 2) Each node has some different configuration parameters.
>>>>> 3) Whenever any active node goes down, the standby node comes up with
>>>>> the same configuration that the active had.
>>>>
>>>> How you solve this requirement depends on the specifics of your
>>>> situation. Ideally, you can use OCF resource agents that take the
>>>> configuration location as a parameter. You may have to write your own,
>>>> if none is available for your services.
>>>>
>>>>> 4) There is no single process/service for which we need redundancy;
>>>>> rather, it is the entire system (multiple processes running together).
>>>>
>>>> This is trivially implemented using either groups or ordering and
>>>> colocation constraints.
>>>>
>>>> Order constraint = start service A before starting service B (and stop
>>>> in reverse order)
>>>>
>>>> Colocation constraint = keep services A and B on the same node
>>>>
>>>> Group = shortcut to specify several services that need to start/stop in
>>>> order and be kept together
>>>>
>>>>
>>>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231363875392
>>>>
>>>>
>>>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#group-resources
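>>>>
>>>> For example, with pcs (resource names are placeholders):
>>>>
>>>>     # explicit constraints...
>>>>     pcs constraint order start serviceA then serviceB
>>>>     pcs constraint colocation add serviceB with serviceA
>>>>     # ...or the group shortcut, which implies both of the above
>>>>     pcs resource group add mygroup serviceA serviceB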
>>>>
>>>>
>>>>> 5) I would also want to be notified when any active<->standby state
>>>>> transition happens as I would want to take some steps at the
>>>>> application level.
>>>>
>>>> There are multiple approaches.
>>>>
>>>> If you don't mind compiling your own packages, the latest master branch
>>>> (which will be part of the upcoming 1.1.14 release) has built-in
>>>> notification capability. See:
>>>> http://blog.clusterlabs.org/blog/2015/reliable-notifications/
>>>>
>>>> Otherwise, you can use SNMP or e-mail if your packages were compiled
>>>> with those options, or you can use the ocf:pacemaker:ClusterMon resource
>>>> agent:
>>>>
>>>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm231308442928
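>>>>
>>>> For example, something along these lines (the script path is a
>>>> placeholder; Pacemaker passes event details to the script in
>>>> CRM_notify_* environment variables):
>>>>
>>>>     pcs resource create notify ocf:pacemaker:ClusterMon \
>>>>         extra_options="-E /usr/local/bin/crm_notify.sh" --clone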
>>>>
>>>>> I went through the documents/blogs but all had examples for 1 active
>>>>> and 1 standby use-case, and that too for some standard service like httpd.
>>>>
>>>> Pacemaker is incredibly versatile, and the use cases are far too varied
>>>> to cover more than a small subset. Those simple examples show the basic
>>>> building blocks, and can usually point you to the specific features you
>>>> need to investigate further.
>>>>
>>>>> One additional question: if I have multiple actives, then Virtual IP
>>>>> configuration cannot be used? Is it possible such that N actives have
>>>>> different IP addresses, but whenever the standby becomes active it uses
>>>>> the IP address of the failed node?
>>>>
>>>> Yes, there are a few approaches here, too.
>>>>
>>>> The simplest is to assign a virtual IP to each active, and include it in
>>>> your group of resources. The whole group will fail over to the standby
>>>> node if the original goes down.
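>>>>
>>>> For example (addresses and names are placeholders):
>>>>
>>>>     pcs resource create node1-vip ocf:heartbeat:IPaddr2 \
>>>>         ip=192.168.1.101 cidr_netmask=24 op monitor interval=30s
>>>>     pcs resource group add node1-group node1-vip node1-service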
>>>>
>>>> If you want a single virtual IP that is used by all your actives, one
>>>> alternative is to clone the ocf:heartbeat:IPaddr2 resource. When cloned,
>>>> that resource agent will use iptables' CLUSTERIP functionality, which
>>>> relies on multicast Ethernet addresses (not to be confused with
>>>> multicast IP). Since multicast Ethernet has limitations, this is not
>>>> often used in production.
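>>>>
>>>> If you do want to try it, the clone would look roughly like this
>>>> (name, address, and clone count are placeholders):
>>>>
>>>>     pcs resource create sharedip ocf:heartbeat:IPaddr2 \
>>>>         ip=192.168.1.100 cidr_netmask=24
>>>>     pcs resource clone sharedip globally-unique=true clone-max=3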
>>>>
>>>> A more complicated method is to use a virtual IP in combination with a
>>>> load-balancer such as haproxy. Pacemaker can manage haproxy and the real
>>>> services, and haproxy manages distributing requests to the real services.
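>>>>
>>>> For example, haproxy itself can be managed as a systemd resource and
>>>> tied to the load-balancer's virtual IP (names are placeholders):
>>>>
>>>>     pcs resource create lb-vip ocf:heartbeat:IPaddr2 ip=192.168.1.200
>>>>     pcs resource create haproxy systemd:haproxy
>>>>     pcs constraint colocation add haproxy with lb-vip
>>>>     pcs constraint order start lb-vip then haproxy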
>>>>
>>>>> Thanking in advance.
>>>>> Nikhil
>>>>
>>>> A last word of advice: Fencing (aka STONITH) is important for proper
>>>> recovery from difficult failure conditions. Without it, it is possible
>>>> to have data loss or corruption in a split-brain situation.
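>>>>
>>>> For example, with IPMI-capable management boards, a fence device per
>>>> node might look like this (addresses and credentials are placeholders):
>>>>
>>>>     pcs stonith create fence-node1 fence_ipmilan \
>>>>         pcmk_host_list=node1 ipaddr=10.0.0.11 login=admin passwd=secret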
>>
>>
> 
