[Pacemaker] Two-node cluster fencing

Tim Bordeman timbo_nospam at web.de
Tue Oct 22 09:52:05 EDT 2013


On 2013-10-21 10:40, Michael Schwartzkopff wrote:
> On Monday, 21 October 2013, 10:28:53, Timm Bordeman wrote:
>> Hi,
>>
>> I'm building a two-node cluster based on XenServer, Pacemaker and
>> DRBD. All needed resources are configured and correctly handled by
>> Pacemaker, but currently I'm struggling with stonith / fencing.
>>
>> Both physical servers are running XenServer and a couple of virtual
>> machines which are being mirrored. For example, on each server an
>> Apache-VM is running, and the two VMs share a data partition over
>> DRBD. I configured fencing over XEN, which reliably restarts any
>> faulty VM as long as both physical servers are working correctly.
>>
>> Unfortunately, fencing doesn't work when the server that hosts a
>> faulty virtual machine is powered off or not reachable over the
>> network. In this case Pacemaker does not promote the DRBD partition
>> on the second / passive virtual machine to primary. Other resources,
>> like the Apache server, won't be started. I know that this is the
>> expected behaviour of Pacemaker and DRBD, but I'm not sure what is
>> needed to make the failover reliable even in the case of a completely
>> broken physical server. Fencing by issuing a reboot of the broken
>> server is obviously not an option, since the server wouldn't come up
>> due to a hardware defect.
>>
>> I appreciate any help on this.
>>
>> Thanks,
>> Tim

Hi,

I'm sorry for the delay.

> Have you considered that quorum does not work in a two-node cluster
> (option no-quorum-policy="ignore")?
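For reference, the two-node quorum workaround referred to above is a cluster property; a minimal sketch in crmsh syntax (property names as in Pacemaker 1.1, the version current at the time):

```shell
# In a two-node cluster, losing one node always loses quorum, so quorum
# enforcement has to be disabled:
crm configure property no-quorum-policy=ignore
# Stonith must stay enabled so the surviving node still fences its peer
# before taking over resources:
crm configure property stonith-enabled=true
```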

Yes, I did (see [1]).

> The other possibility is that fencing does not reach the other server
> to run its commands successfully.

Exactly. The agent fence_xenapi tries to fence the virtual machine, but 
cannot connect to the physical host. It ends up with a "no route to 
host" error (see [6]).
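To separate the agent's reachability problem from Pacemaker itself, the fence agent can be run by hand. A sketch with placeholder host, credentials, and VM name (not from the setup above); option names vary between fence-agents versions, so check `fence_xenapi --help` first:

```shell
# Ask the XenAPI host for the status of one VM, outside of Pacemaker.
# --session-url, --username, --password, --plug and --action are the
# long-option spellings in recent fence-agents releases.
fence_xenapi --session-url https://xenserver1.example.com \
             --username root --password secret \
             --plug apache-vm1 --action status
# A "no route to host" failure here confirms the physical host is
# unreachable, independent of the cluster stack.
```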

> Please check the logs and give more detail on your setup. What do you
> want to achieve? Config? Logs?

Well, if a virtual machine cannot be fenced, I want the passive node 
(after some retries or a delay) to become the primary one. I know that 
this might lead to a split-brain under some circumstances, but I'm quite 
confident that such situations are rare and can be handled manually by 
an administrator.
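For what it's worth, the VM-level stonith resource and its timeout budget look roughly like this in crmsh syntax; hostnames and credentials are placeholders, and the parameter names should be checked against the agent's metadata (`stonith_admin --metadata -a fence_xenapi`), as they differ between fence-agents versions:

```shell
# Stonith resource using the XenAPI fence agent; parameters are
# illustrative, not taken from the configuration in [1].
crm configure primitive st-xen stonith:fence_xenapi \
    params session_url="https://xenserver1.example.com" \
           login="root" passwd="secret" \
    op monitor interval=60s
# Give the agent room to retry before the fencing action is declared
# failed cluster-wide:
crm configure property stonith-timeout=120s
```

Note that when the physical host itself is dead, a VM-level agent can never succeed; covering that case needs a second fencing level that targets the host's power (e.g. an IPMI/PDU agent), so the cluster can confirm the whole machine is off.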

Tim

[1] cib dump: http://pastebin.com/QWEJjJSZ
[2] corosync.conf: http://pastebin.com/zaQjDgPA
[3] r0.conf: http://pastebin.com/M6FnAfHu
[4] r1.conf: http://pastebin.com/SHd2Jdq7
[5] corosync.log (excerpt): http://pastebin.com/QHkUeNh1
[6] syslog (excerpt): http://pastebin.com/Zfd56mCE



