[ClusterLabs] Antw: Re: Antw: Any CLVM/DLM users around?

Patrick Whitney pwhitney at luminoso.com
Wed Oct 3 11:19:56 EDT 2018


Hi Ulrich,

It's not that it isn't working for me. It's that to make it work (or at
least appear to work), I've had to modify dlm.conf, and none of the
tutorials or walkthroughs I've read mention that this is necessary.  I was
curious what ramifications I would run into by setting 'enable_fencing=0'
in dlm.conf.

From what I can tell, my config is sane with regard to interleave,
colocation, and ordering.

SBD is a possibility we're considering, but we haven't fully embraced it
yet.

I'm still planning on testing 'enable_startup_fencing=0' instead of
'enable_fencing=0' in dlm.conf and will report back (in case anyone is
interested).
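
For reference, the two variants being compared would look roughly like the
sketch below. The file path is an assumption (the usual default location for
dlm.conf); the two option names are the ones discussed in this thread. Only
one of the two settings would be active at a time, and whether
'enable_startup_fencing=0' actually avoids the problem is still untested.

```
# /etc/dlm/dlm.conf  (path assumed; adjust for your distribution)

# Variant currently in use: dlm_controld does no fencing of its own.
enable_fencing=0

# Variant still to be tested, with the line above removed:
# enable_startup_fencing=0
```

In both cases fencing stays enabled in pcmk; only the DLM-side setting
changes.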

Best,
-Pat

On Tue, Oct 2, 2018 at 2:25 AM Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> Hi!
>
> I'm sorry that DLM/cLVM does not work for you. Did you double-check the
> configuration (meta interleave=true, colocation and ordering), especially
> the clones?
> Also as you have shared storage, why don't you use SBD for fencing?
>
> Regards,
> Ulrich
>
>
> >>> Patrick Whitney <pwhitney at luminoso.com> wrote on 01.10.2018 at
> 22:01 in message
> <CAE0zLk_Va6gtHz9tG3woEcua2RidaehaOQ8ieQdZ4MeOkcy0nQ at mail.gmail.com>:
> > Hi Ulrich,
> >
> > When I first encountered this issue, I posted this:
> >
> > https://lists.clusterlabs.org/pipermail/users/2018-September/015637.html
> >
> > ... I was using resource fencing in this example, but, as I've mentioned
> > before, the issue would come about, not when fencing occurred, but when
> > the fenced node was shut down (we were using resource fencing).
> >
> > During that discussion, you and others suggested that power fencing
> > was the only way DLM was going to cooperate, and using meatware was
> > proposed as one option.
> >
> > Unfortunately, I found out later that meatware was no longer available
> > (https://lists.clusterlabs.org/pipermail/users/2018-September/015715.html).
> > Luckily, our test environment is a KVM/libvirt environment, so I used
> > fence_virsh.  Again, I had the same problem... when the "bad" node was
> > fenced, dlm_controld would issue (what appears to be) a fence_all, and I
> > would receive messages that the dlm clone was down on all members, along
> > with a log message that the clvm lockspace was abandoned.
> >
> > It was only when I disabled fencing for dlm (enable_fencing=0 in
> > dlm.conf, but kept fencing enabled in pcmk) that things began to work as
> > expected.
> >
> > One suggestion earlier in this thread was to disable startup fencing in
> > the dlm configuration (enable_startup_fencing=0), which sounds like a
> > plausible solution after looking over the logs, but I haven't tested it
> > yet.
> >
> > The conclusion I'm coming to is:
> > 1. The reason DLM cannot handle resource fencing is because it keeps its
> > own "heartbeat/control" channel (for lack of a better term) via the
> > network, and pcmk cannot instruct DLM "Don't worry about that guy over
> > there" which means we must use power fencing, but;
> > 2. DLM does not like to see one of its members disappear; when that does
> > happen, DLM does "something" which causes the lockspace to disappear...
> > unless you disable fencing for DLM.
> >
> > I am now speculating that DLM restarts when communications fail, and
> > that disabling startup fencing for DLM (enable_startup_fencing=0) may be
> > the solution to my problem (while reverting my enable_fencing=0 DLM
> > config).
> >
> > Best,
> > -Pat
> >
> > On Mon, Oct 1, 2018 at 3:38 PM Ulrich Windl <
> > Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >
> >> Hi!
> >>
> >> It would be much more helpful if you could provide logs around the
> >> problem events. Personally I think you _must_ implement proper fencing.
> >> In addition, DLM seems to do its own fencing when there is a
> >> communication problem.
> >>
> >> Regards,
> >> Ulrich
> >>
> >>
> >> >>> Patrick Whitney <pwhitney at luminoso.com> wrote on 01.10.18 at 16:25 >>>
> >> Hi Everyone,
> >>
> >> I wanted to solicit input on my configuration.
> >>
> >> I have a two node (test) cluster running corosync/pacemaker with DLM and
> >> CLVM.
> >>
> >> I was running into an issue where, when one node failed, the remaining
> >> node would appear to do the right thing, at least from the pcmk
> >> perspective: it would create a new cluster (of one) and fence the other
> >> node. But then, rather surprisingly, DLM would see the other node
> >> offline and go offline itself, abandoning the lockspace.
> >>
> >> I changed my DLM settings to "enable_fencing=0", disabling DLM fencing,
> >> and our tests are now working as expected.
> >>
> >> I'm a little concerned I have masked an issue by doing this, as in all
> >> of the tutorials and docs I've read, there is no mention of having to
> >> configure DLM whatsoever.
> >>
> >> Is anyone else running a similar stack who can comment?
> >>
> >> Best,
> >> -Pat
> >> --
> >> Patrick Whitney
> >> DevOps Engineer -- Tools
> >>
> >> _______________________________________________
> >> Users mailing list: Users at clusterlabs.org
> >> https://lists.clusterlabs.org/mailman/listinfo/users
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >>
> >
> >
> > --
> > Patrick Whitney
> > DevOps Engineer -- Tools
>


-- 
Patrick Whitney
DevOps Engineer -- Tools