[ClusterLabs] Antw: Re: Antw: Any CLVM/DLM users around?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Oct 2 02:25:31 EDT 2018


Hi!

I'm sorry that DLM/cLVM does not work for you. Did you double-check the configuration (meta interleave=true, colocation and ordering), especially the clones?
Also, since you have shared storage, why don't you use SBD for fencing?

Regards,
Ulrich

 
>>> Patrick Whitney <pwhitney at luminoso.com> wrote on 01.10.2018 at 22:01 in
message
<CAE0zLk_Va6gtHz9tG3woEcua2RidaehaOQ8ieQdZ4MeOkcy0nQ at mail.gmail.com>:
> Hi Ulrich,
> 
> When I first encountered this issue, I posted this:
> 
> https://lists.clusterlabs.org/pipermail/users/2018-September/015637.html 
> 
> ... I was using resource fencing in this example, but, as I've mentioned
> before, the issue would come about not when fencing occurred, but when the
> fenced node was shut down.
> 
> During that discussion, you and others suggested that power fencing
> was the only way DLM was going to cooperate, and using meatware was
> proposed.
> 
> Unfortunately, I found out later that meatware is no longer available (
> https://lists.clusterlabs.org/pipermail/users/2018-September/015715.html).
> Luckily, our test environment is a KVM/libvirt environment,
> so I used fence_virsh.  Again, I had the same problem... when the "bad"
> node was fenced, dlm_controld would issue (what appears to be) a fence_all,
> and I would receive messages that the dlm clone was down on all
> members, along with a log message that the clvm lockspace was
> abandoned.
> 
> It was only when I disabled fencing for dlm (enable_fencing=0 in dlm.conf,
> while keeping fencing enabled in pcmk) that things began to work as expected.
> 
> An earlier suggestion in this thread was to try disabling only startup
> fencing in the dlm configuration (enable_startup_fencing=0), which sounds like
> a plausible solution after looking over the logs, but I haven't tested it
> yet.
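> 
> For reference, the two variants discussed above would look roughly like
> this in dlm.conf. This is only a sketch: the path /etc/dlm/dlm.conf is an
> assumption (it is the usual default location), and only the first variant
> has actually been tested here.
> 
> ```
> # /etc/dlm/dlm.conf (path assumed)
> 
> # Variant 1 (tested): dlm_controld does no fencing of its own;
> # fencing stays enabled in pcmk.
> enable_fencing=0
> 
> # Variant 2 (untested): leave DLM fencing on, but skip fencing at startup.
> # Use instead of variant 1, not together with it.
> #enable_fencing=1
> #enable_startup_fencing=0
> ```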
> 
> The conclusion I'm coming to is:
> 1. The reason DLM cannot handle resource fencing is that it keeps its
> own "heartbeat/control" channel (for lack of a better term) via the
> network, and pcmk cannot instruct DLM "Don't worry about that guy over
> there", which means we must use power fencing; but
> 2. DLM does not like to see one of its members disappear; when that does
> happen, DLM does "something" which causes the lockspace to disappear...
> unless you disable fencing for DLM.
> 
> I am now speculating that DLM restarts when the communications fail, and
> that disabling startup fencing for DLM (enable_startup_fencing=0) may be
> the solution to my problem (after reverting my enable_fencing=0 DLM
> config).
> 
> Best,
> -Pat
> 
> On Mon, Oct 1, 2018 at 3:38 PM Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> 
>> Hi!
>>
>> It would be much more helpful if you could provide logs from around the
>> problem events. Personally I think you _must_ implement proper fencing. In
>> addition, DLM seems to do its own fencing when there is a communication
>> problem.
>>
>> Regards,
>> Ulrich
>>
>>
>> >>> Patrick Whitney <pwhitney at luminoso.com> 01.10.18 16:25 >>>
>> Hi Everyone,
>>
>> I wanted to solicit input on my configuration.
>>
>> I have a two node (test) cluster running corosync/pacemaker with DLM and
>> CLVM.
>>
>> I was running into an issue where, when one node failed, the remaining node
>> would appear to do the right thing, from the pcmk perspective, that is.
>> It would create a new cluster (of one) and fence the other node, but
>> then, rather surprisingly, DLM would see the other node offline, and it
>> would go offline itself, abandoning the lockspace.
>>
>> I changed my DLM settings to "enable_fencing=0", disabling DLM fencing, and
>> our tests are now working as expected.
>>
>> I'm a little concerned that I have masked an issue by doing this, as none
>> of the tutorials and docs I've read mention having to configure DLM
>> at all.
>>
>> Is anyone else running a similar stack and can comment?
>>
>> Best,
>> -Pat
>> --
>> Patrick Whitney
>> DevOps Engineer -- Tools
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org 
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>
>> Project Home: http://www.clusterlabs.org 
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> Bugs: http://bugs.clusterlabs.org 
>>
> 
> 
> -- 
> Patrick Whitney
> DevOps Engineer -- Tools