[Pacemaker] How to prevent locked I/O using Pacemaker with Primary/Primary DRBD/OCFS2 (Ubuntu 10.10)

Mike Reid mbreid at thepei.com
Thu Apr 7 11:08:14 EDT 2011


Lars,

Interesting, I will definitely continue in that direction then. Perhaps I
misunderstood the requirements of STONITH. I understand it to be a form of
"remote reboot/shut down" of sorts, and being that the box was already "shut
down", I assumed at this stage in my testing that it could not be related to
STONITH since the box was confirmed to be down. Perhaps Pacemaker is just
awaiting that confirmation as you suggest, so thank you, I will see if that
is indeed the case. I've seen quite a few stonith operation options
available, is there any one of them that is better suited for a simple
two-node cluster (OCFS2)?
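
For reference, the DRBD side of the fencing setup discussed in this thread might look roughly like the fragment below. This is only a sketch under assumptions: the resource name "r0" is hypothetical, the handler paths are the locations where the DRBD 8.3 packages usually install the scripts, and it does nothing useful until a working STONITH device is actually configured in Pacemaker.

```
# Sketch only: "r0" is a placeholder resource name, and the handler
# paths are assumed to be the usual DRBD 8.3 install locations.
resource r0 {
  disk {
    # Freeze I/O on peer loss and call the fence-peer handler;
    # I/O resumes once the peer is confirmed fenced.
    fencing resource-and-stonith;
  }
  handlers {
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

With "resource-and-stonith", blocked I/O until fencing is confirmed is the intended behavior, which would match what Lars describes below.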


> Message: 1
> Date: Thu, 7 Apr 2011 02:50:09 +0200
> From: Lars Ellenberg <lars.ellenberg at linbit.com>
> To: pacemaker at oss.clusterlabs.org
> Subject: Re: [Pacemaker] How to prevent locked I/O using Pacemaker
> with Primary/Primary DRBD/OCFS2 (Ubuntu 10.10)
> Message-ID: <20110407005009.GF3726 at barkeeper1-xen.linbit>
> Content-Type: text/plain; charset=iso-8859-1
> 
> On Wed, Apr 06, 2011 at 10:26:24AM -0600, Reid, Mike wrote:
>> Lars,
>> 
>> Thank you for your comments. I did confirm I was running 8.3.8.1, and I have
>> even upgraded to 8.3.10 but am still experiencing the same I/O lock issue. I
>> definitely agree with you, DRBD is behaving exactly as instructed, being
>> properly fenced, etc.
>> 
>> I am quite new to DRBD (and OCFS2), learning a lot as I go. To your
>> question regarding copy/paste, yes, the configuration used was
>> assembled from a series of different tutorials, plus personal trial
>> and error related to this project. I have tried many variations of the
>> DRBD config (including resource-and-stonith)
> 
>> but have not actually set up a functioning STONITH yet,
> 
> And that's why your ocfs2 does not unblock.
> It waits for confirmation of a STONITH operation.
> 
>> hence the
>> "resource-only". The Linbit
>> docs have been an amazing resource.
>> 
>> Yes, I realize that a Secondary-node is not indicative of its
>> data/synch state. The options I am testing here were referenced from
>> this page:
>> 
>> http://www.drbd.org/users-guide/s-ocfs2-create-resource.html
>> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-automatic-split-brain-recovery-configuration
>> 
>> When you say "You do configure automatic data loss here", are you
>> suggesting that I am instructing the DRBD survivor to perform a full
>> re-synch to its peer?
> 
> Nothing to do with full sync. Should usually be a bitmap based resync.
> 
> But it may be a sync in an "unexpected" direction.
> 
>> If so, that would make sense since I believe
>> this behavior was something I experienced prior to getting fencing
>> fully established. In my hard-boot testing, I did once notice the
>> "victim" was completely resynching, which sounds related to
>> "after-sb-1pri discard-secondary".
>> 
>> DRBD aside, have you used OCFS2? I fail to see why, if DRBD is
>> fencing its peer, OCFS2 remains in a locked state, unable to run
>> standalone. To me, this issue does not seem related to DRBD or Pacemaker, but
>> rather to a lower-level requirement of OCFS2 (DLM?), etc.
>> 
>> To date, the ONLY way I can restore I/O to the remaining node is to bring the
>> other node back online, which unfortunately won't work in our Production
>> environment. On a separate ML, someone made a suggestion that "qdisk" might
>> be required to make this work, and while I have tried "qdisk", my high-level
>> research leads me to believe that is a legacy approach, not an option with
>> Pacemaker.  Is that correct?
> 
> 
> 
> -- 
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.