[Pacemaker] Two node cluster and no hardware device for stonith.

Digimer lists at alteeve.ca
Wed Jan 21 11:18:05 EST 2015

On 21/01/15 08:13 AM, Andrea wrote:
> Hi All,
> I have a question about stonith
> In my scenario, I have to create a 2-node cluster, but I don't have any
> hardware device for stonith. No APC, no IPMI, etc.; none of the devices
> returned by "pcs stonith list".
> So, is there an option to do something?
> This is my scenario:
> - 2 nodes cluster
> serverHA1
> serverHA2
> - Software
> Centos 6.6
> pacemaker.x86_64  1.1.12-4.el6
> cman.x86_64
> corosync.x86_64   1.4.7-1.el6
> -NO hardware device for stonith!
> - Cluster creation ([ALL] operation done on all nodes, [ONE] operation done
> on only one node)
> [ALL] systemctl start pcsd.service
> [ALL] systemctl enable pcsd.service
> [ONE] pcs cluster auth serverHA1 serverHA2
> [ALL] echo "CMAN_QUORUM_TIMEOUT=0" >> /etc/sysconfig/cman
> [ONE] pcs cluster setup --name MyCluHA serverHA1 serverHA2
> [ONE] pcs property set stonith-enabled=false
> [ONE] pcs property set no-quorum-policy=ignore
> [ONE] pcs resource create ping ocf:pacemaker:ping dampen=5s multiplier=1000
> host_list= --clone
> In my test, when I simulate a network failure, split brain occurs, and when
> the network comes back, one node kills the other node
> -log on node 1:
> Jan 21 11:45:28 corosync [CMAN  ] memb: Sending KILL to node 2
> -log on node 2:
> Jan 21 11:45:28 corosync [CMAN  ] memb: got KILL for node 2
> Is there a way to restart pacemaker when the network comes back instead of
> killing it?
> Thanks
> Andrea

You really need a fence device; there isn't a way around it. By 
definition, when a node needs to be fenced, it is in an unknown state 
and cannot be expected to behave predictably.

If you're using real hardware, then you can use a switched PDU 
(network-connected power bar with individual outlet control) to do 
fencing. I use the APC AP7900 in all my clusters and it works perfectly. 
I know that some other brands work, too.
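
As a rough sketch of what PDU fencing could look like with pcs, assuming 
the fence_apc_snmp agent, a PDU at the made-up address 10.0.0.50, and 
serverHA1/serverHA2 plugged into outlets 1 and 2. The address, outlet 
numbers and resource names are placeholders, not taken from your setup. 
The script only prints the commands so they can be reviewed before being 
run by hand:

```shell
#!/bin/sh
# Hypothetical value; replace with the address of your own PDU.
PDU_ADDR="10.0.0.50"

# Print (not run) a pcs command that creates one fence device for a node.
# $1 = cluster node name, $2 = PDU outlet number feeding that node
pdu_fence_cmd() {
    echo "pcs stonith create fence_$1 fence_apc_snmp ipaddr=$PDU_ADDR port=$2 pcmk_host_list=$1"
}

pdu_fence_cmd serverHA1 1
pdu_fence_cmd serverHA2 2
# The original setup disabled stonith, so it has to be switched back on.
echo "pcs property set stonith-enabled=true"
```

The outlet-to-node mapping has to match the physical wiring; if it is 
wrong, a fence call powers off the wrong machine.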

If your machines are virtual machines, then you can do fencing by talking 
to the hypervisor. In this case, one node calls the host of the other 
node and asks it to terminate that node (fence_virsh and fence_xvm for 
KVM/Xen systems, fence_vmware for VMware, etc).
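
For the virtual machine case, a minimal sketch using fence_virsh might 
look like the following. It assumes KVM hosts reachable over ssh as root 
with a key file; the host names (kvmhost1, kvmhost2), the libvirt domain 
names (vm-ha1, vm-ha2) and the key path are all invented for illustration. 
Again, the commands are only printed, not executed:

```shell
#!/bin/sh
# Hypothetical ssh key used to log in to the hypervisor hosts.
SSH_KEY="/root/.ssh/id_rsa"

# Print (not run) a pcs command that fences one node through its hypervisor.
# $1 = cluster node name, $2 = hypervisor host, $3 = libvirt domain name
virsh_fence_cmd() {
    echo "pcs stonith create fence_$1 fence_virsh ipaddr=$2 login=root identity_file=$SSH_KEY port=$3 pcmk_host_list=$1"
}

virsh_fence_cmd serverHA1 kvmhost1 vm-ha1
virsh_fence_cmd serverHA2 kvmhost2 vm-ha2
# The original setup disabled stonith, so it has to be switched back on.
echo "pcs property set stonith-enabled=true"
```

Here "port" is the name of the guest as the hypervisor knows it, which is 
often not the same as the cluster node name.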

Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?
