[Pacemaker] clvmd hangs on node1 if node2 is fenced

Tim Serong tserong at novell.com
Fri Aug 27 05:40:02 EDT 2010


On 8/27/2010 at 03:37 PM, Michael Smith <msmith at cbnco.com> wrote: 
> On Thu, 26 Aug 2010, Tim Serong wrote: 
>  
> > > for now I have stonith-enabled="false" in   
> > > my CIB. Is there a way to make clvmd/dlm respect it?  
> >  
> > No.  At least, I don't think so, and/or I hope not :) 
>  
> I think I'd consider it a bug: I've disabled stonith, so dlm shouldn't  
> wait forever for a fence operation that isn't going to happen. 
>  
> CLVM is just making the metadata cluster-aware, so the only way I can  
> imagine screwing things up without fencing would be if I ran something  
> like lvresize on two nodes at the same time, during a split brain. 

So I dug around a little:

  # dlm_controld.pcmk -h 
  Usage:
    dlm_controld [options]
  Options:
    ...
    -f <num>      Enable (1) or disable (0) fencing recovery dependency
                  Default is 1
    -q <num>      Enable (1) or disable (0) quorum recovery dependency
                  Default is 0

I reckon if you set the args parameter of your ocf:pacemaker:controld
resource to "-f 0 -q 0", you'll have DLM ignoring fencing.  At this
point (lest someone reading the archives later thinks I am advocating
this) it would be irresponsible of me not to mention this story about
Why You Need STONITH:

  http://advogato.org/person/lmb/diary/105.html

There is also an accompanying comic:

  http://ourobengr.com/stonith-story

If DLM is ignoring fencing, everything that uses DLM is also going to
ignore fencing, so if you've got (say) an OCFS2 filesystem on top of
your CLVM volume, your filesystem will potentially be toast in a
split-brain situation.
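
For the archives, the args setting mentioned above might look roughly
like this in the crm shell.  This is only a sketch and untested: the
resource IDs "dlm" and "dlm-clone" and the monitor interval are
placeholders I made up, so substitute whatever your existing controld
primitive and clone are called.  The only part that matters here is
the params line:

  # hypothetical resource names; adjust to match your own CIB
  primitive dlm ocf:pacemaker:controld \
      params args="-f 0 -q 0" \
      op monitor interval="60s"
  clone dlm-clone dlm \
      meta interleave="true"

Given the defaults shown in the help output, "-q 0" only restates the
default; "-f 0" is the bit that actually changes behaviour.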

Regards,

Tim


-- 
Tim Serong <tserong at novell.com>
Senior Clustering Engineer, OPS Engineering, Novell Inc.
