[Pacemaker] configuration variants for 2 node cluster

Digimer lists at alteeve.ca
Mon Jun 23 09:34:25 EDT 2014

On 23/06/14 09:11 AM, Kostiantyn Ponomarenko wrote:
> Hi guys,
> I want to gather all possible configuration variants for a 2-node cluster,
> because it has a lot of pitfalls and there is not a lot of information
> across the internet about it. I also have some questions about
> configurations and their specific problems.
> -----------------
> We can use the "two_node" and "wait_for_all" options from Corosync's
> votequorum, and set up fencing agents with a delay on one of them.
> Here is a workflow (diagram) of this configuration:
> 1. Node start.
> 2. Cluster start (Corosync and Pacemaker) at the boot time.
> 3. Wait for all nodes. All nodes joined?
>      No. Go to step 3.
>      Yes. Go to step 4.
> 4. Start resources.
> 5. Split brain situation (something with connection between nodes).
> 6. The fencing agent on one of the nodes reboots the other node (there
> is a configured delay on one of the fencing agents).
> 7. The rebooted node goes to step 1.
> There are two (or more?) important things in this configuration:
> 1. Rebooted node remains waiting for all nodes to be visible (connection
> should be restored).
> 2. Suppose connection problem still exists and the node which rebooted
> the other guy has to be rebooted also (for some reasons). After reboot
> he is also stuck on step 3 because of connection problem.
> -----------------
> Is it possible somehow to assign to the guy who won the reboot race
> (rebooted the other guy) a status like "primary" and allow him not to wait
> for all nodes after reboot, and then drop this status after the other
> node has joined this one?
> Right now that's the only configuration I know for a 2-node cluster.
> Other variants are very much appreciated =)
> VARIANT 2 (not implemented, just a suggestion):
> -----------------
> I've been thinking about using external SSD drive (or other external
> drive). So for example fencing agent can reserve SSD using SCSI command
> and after that reboot the other node.
> The main idea of this is the first node, as soon as a cluster starts on
> it, reserves SSD till the other node joins the cluster, after that SCSI
> reservation is removed.
> 1. Node start
> 2. Cluster start (Corosync and Pacemaker) at the boot time.
> 3. Reserve SSD. Did it manage to reserve?
>      No. Don't start resources (Wait for all).
>      Yes. Go to step 4.
> 4. Start resources.
> 5. Remove SCSI reservation when the other node has joined.
> 6. Split brain situation (something with connection between nodes).
> 7. Fencing agent tries to reserve SSD. Did it manage to reserve?
>      No. Maybe puts node in standby mode ...
>      Yes. Reboot the other node.
> 8. Optional: a single node can keep the SSD reservation while it is alone
> in the cluster or until it shuts down.
> I am really looking forward to finding the best solution (or a couple of
> them =)).
> Hope I am not the only person who is interested in this topic.
> Thank you,
> Kostya

Hi Kostya,

   I only build 2-node clusters, and I've not had problems with this 
going back to 2009 over dozens of clusters. The tricks I found are:

* Disable quorum (of course)
* Set up good fencing, and add a delay to the node you prefer (or 
pick one at random, if equal value) to avoid dual-fences
* Disable the cluster on start up, to prevent fence loops.

   That's it. With this, your 2-node cluster will be just fine.
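
   To illustrate why the delay avoids a dual-fence, here is a minimal 
Python sketch of the race. It is a toy model, not Pacemaker or fence 
agent code; the function name, node names and delay values are all made 
up for illustration. It assumes the delay is configured on the fence 
device that targets the preferred node, so the peer has to wait before 
shooting it.

```python
def fence_race(delays):
    """Toy model of a two-node fence race (hypothetical helper).

    `delays` maps each node name to the delay, in seconds, configured on
    the fence device that targets that node, i.e. how long its peer waits
    before shooting it. Returns the set of nodes that end up fenced.
    """
    (a, delay_a), (b, delay_b) = delays.items()
    if delay_a == delay_b:
        # Worst case: both fence calls fire at the same moment and land,
        # so both nodes go down (a dual-fence).
        return {a, b}
    # The node targeted with the shorter delay is shot first; the node
    # protected by the longer delay survives.
    return {a} if delay_a < delay_b else {b}


print(fence_race({"node1": 15, "node2": 0}))  # node1 is the preferred node
print(fence_race({"node1": 0, "node2": 0}))   # no delay configured
```

   With the delay in place only the non-preferred node is fenced; with 
equal delays the model shows both nodes going down.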

   As for your question: once a node is fenced successfully, the 
resource manager (pacemaker) will take over any services lost on the 
fenced node, if that is how you configured it. A node that either 
gracefully leaves or dies/is fenced should not interfere with the 
remaining node.

   The problem is when a node vanishes and fencing fails. Then, not 
knowing what the other node might be doing, the only safe option is to 
block, otherwise you risk a split-brain. This is why fencing is so 
important.
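
   The rule in the paragraph above can be sketched in a few lines of 
Python. This is only an illustration of the decision, not how Pacemaker 
implements it; `Action`, `on_peer_lost`, `try_fence` and `max_attempts` 
are hypothetical names, and the retry limit is an assumption made to 
keep the example finite.

```python
from enum import Enum


class Action(Enum):
    RECOVER = "recover services on the surviving node"
    BLOCK = "block until the peer's state is known"


def on_peer_lost(try_fence, max_attempts=3):
    """Decide what to do when the peer node vanishes.

    `try_fence` is a caller-supplied function (hypothetical) that returns
    True once the peer is confirmed to be fenced.
    """
    for _ in range(max_attempts):
        if try_fence():
            # The peer is confirmed off, so recovery is safe.
            return Action.RECOVER
    # The peer's state is unknown; recovering now risks a split-brain.
    return Action.BLOCK


print(on_peer_lost(lambda: True).value)
print(on_peer_lost(lambda: False).value)
```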


Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?
