[ClusterLabs] Antw: Re: DRBD failover in Pacemaker

Fri Sep 9 02:56:00 EDT 2016

>>> Dimitri Maziuk <dmaziuk at bmrb.wisc.edu> wrote on 09.09.2016 at 02:17 in
message <72d90bbe-1eb8-f2d8-e7d4-43e0a19b6cbf at bmrb.wisc.edu>:
> On 09/08/2016 06:33 PM, Digimer wrote:
> 
>> With 'fencing resource-and-stonith;' and a {un,}fence-handler set, DRBD
>> will block when the peer is lost until the fence handler script returns
>> indicating the peer was fenced/stonithed. In this way, the secondary
>> WON'T promote to Primary while the peer is still Primary. It will only
>> promote AFTER confirmation that the old Primary is gone. Thus, no
>> split-brain.
> 
> In 7 or 8 years of running several DRBD pairs I had split brain about 5
> times and at least 2 of them were because I tugged on the crosslink
> cable while mucking around the back of the rack. Maybe if you run a
> zillion of stacked active-active resources on a 100-node cluster DRBD
> split brain becomes a real problem, from where I'm sitting stonith'ing
> DRBD nodes is a solution in search of a problem.

I think the problem is that people using DRBD don't have shared storage and thus cannot use SBD for fencing; like DRBD itself, they rely on network-based fencing. So if the network fails, both things fail: DRBD sync and fencing. I think a working solution should have highly available, independent channels for DRBD and fencing.
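
For reference, the setup Digimer describes would look roughly like the fragment below in a DRBD 8.4 resource file. This is only a sketch under assumptions: the resource name r0 is made up, the handler paths are where the DRBD packages commonly install the Pacemaker helper scripts and may differ on your distribution, and as far as I know DRBD 9 expects the fencing option in the net section rather than in disk.

```
# Hypothetical resource "r0"; only the fencing-related parts are shown.
resource r0 {
  disk {
    # Freeze I/O when the peer is lost and wait for the fence-peer
    # handler to report that the peer has been fenced.
    fencing resource-and-stonith;
  }
  handlers {
    # Called when the peer is lost (path is an assumption).
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Called on the sync target after resync finishes, to lift the
    # constraint again (path is an assumption).
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

Note that the handler still has to reach the cluster to do its job, so it does not by itself remove the dependency on a working network.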

Regards,
Ulrich

> 
> -- 
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu