[Pacemaker] pacemaker/dlm problems
Andrew Beekhof
andrew at beekhof.net
Sun Oct 2 21:41:38 EDT 2011
On Tue, Sep 27, 2011 at 6:24 PM, Vladislav Bogdanov
<bubble at hoster-ok.com> wrote:
> 27.09.2011 10:56, Andrew Beekhof wrote:
>> On Tue, Sep 27, 2011 at 5:07 PM, Vladislav Bogdanov
>> <bubble at hoster-ok.com> wrote:
>>> 27.09.2011 08:59, Andrew Beekhof wrote:
>>> [snip]
>>>>>>>>> I agree with Jiaju
>>>>>>>>> (https://lists.linux-foundation.org/pipermail/openais/2011-September/016713.html),
>>>>>>>>> that this could be solely a pacemaker problem, because it probably
>>>>>>>>> should originate fencing itself in such a situation, I think.
>>>>>>>>>
>>>>>>>>> So, using pacemaker/dlm with the openais stack is currently risky due
>>>>>>>>> to possible hangs of dlm_lockspaces.
>>>>>>>>
>>>>>>>> It shouldn't be, failing to connect to attrd is very unusual.
>>>>>>>
>>>>>>> By the way, one of the underlying problems, which actually made me
>>>>>>> notice all this, is that a pacemaker cluster does not fence its DC if
>>>>>>> it leaves the cluster for a very short time. That is what Jiaju said
>>>>>>> in his notes. And I can confirm that.
>>>>>>
>>>>>> That's highly surprising. Do the logs you sent display this behaviour?
>>>>>
>>>>> They do. The rest of the cluster begins the election, but then accepts
>>>>> the returned DC back (I write this from memory; I looked at the logs on
>>>>> Sep 5-6, so I may be mixing something up).
>>>>
>>>> Actually, this might be possible - if DC.old came back before DC.new
>>>> had a chance to get elected, run the PE and initiate fencing, then
>>>> there would be no need to fence.
>>>>
>>>
>>> (text below is for pacemaker on top of openais stack, not for cman)
>>>
>>> Except that the dlm lockspaces are in kern_stop state, so the whole
>>> dlm-related part is frozen :( - clvmd in my case, but I expect the same
>>> from gfs2 and ocfs2.
>>> And fencing requests originated by dlm_controld on a CPG NODEDOWN event
>>> (with my patch to dlm_controld and your patch for
>>> crm_terminate_member_common()) on a quorate partition are lost. DC.old
>>> doesn't accept CIB updates from other nodes, so those fencing requests
>>> are discarded.
>>
>> All the more reason to start using the stonith api directly.
>> I was playing around last night with the dlm_controld.pcmk code:
>> https://github.com/beekhof/dlm/commit/9f890a36f6844c2a0567aea0a0e29cc47b01b787
>
> Wow, I'll try it!
>
> Btw (offtopic), don't you think it could be interesting to have
> stack support in dlopen()ed modules there? From what I see in that code,
> it could be achieved fairly easily. One just needs to create a module API
> structure, enumerate the functions in each stack, add module loading to
> the dlm_controld core and change the calls to module functions.
I'm sure it's possible. Just up to David if he wants to support it.
>
>>
>>>
>>> I think the problem is that membership changes are handled in a
>>> non-transactional way (?).
>>
>> Sounds more like the dlm/etc is being dumb - if the host is back and
>> healthy, why would we want to shoot it?
>
> Ammmm..... No comments from me on this ;)
>
> But, anyways, something needs to be done at either side...
>
>>
>>> If pacemaker fully finished processing one membership change (elect a
>>> new DC on the quorate partition, and do not try to take over the DC role,
>>> or release it, on a non-quorate partition if a quorate one exists), that
>>> problem could go away.
>>
>> Non-quorate partitions still have a DC.
>> They're just not supposed to do anything (depending on the value of
>> no-quorum-policy).
>
> I actually meant "do not try to take over the DC role in a rejoined
> cluster (or release that role) if it was running on a non-quorate
> partition before the rejoin and a quorate one existed".
All existing DCs give up the role and a new one is elected when two
partitions join.
So I'm unsure what you're referring to here :-)
> Sorry for the confusion. Not very
> natural wording again, but it should be better.
>
> Maybe the DC from a non-quorate partition should just have lower priority
> to become DC when the cluster rejoins and a new election happens (does it?)?
There is no bias towards past DCs in the election.
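
A minimal sketch of the two points above, for illustration only. It is not
Pacemaker's election code: the Node class, merge_and_elect() and the
lowest-name tie-break are assumptions made up for this example. It only
shows that every existing DC drops the role when partitions join, and that
the choice of the new DC ignores who was DC (or quorate) before the join.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    is_dc: bool = False
    # State before the join; kept only to show that the election ignores it.
    was_quorate: bool = True


def merge_and_elect(partition_a, partition_b):
    """Join two partitions and elect exactly one DC for the merged cluster."""
    members = list(partition_a) + list(partition_b)

    # Every existing DC gives up the role when the partitions join.
    for node in members:
        node.is_dc = False

    # Hypothetical tie-break (an assumption, not the real criteria):
    # the lowest node name wins. Past DC status and past quorum are not
    # part of the key, so there is no bias for or against former DCs.
    winner = min(members, key=lambda n: n.name)
    winner.is_dc = True
    return winner


# Example: node1 was DC of the non-quorate side, node2 of the quorate side.
quorate = [Node("node2", is_dc=True), Node("node3")]
isolated = [Node("node1", is_dc=True, was_quorate=False)]
new_dc = merge_and_elect(quorate, isolated)
```

With this made-up tie-break, node1 becomes DC of the merged cluster even
though it came from the non-quorate side, which is the "no bias" behaviour
described above.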