[Pacemaker] [Partially SOLVED] pacemaker/dlm problems

Andrew Beekhof andrew at beekhof.net
Mon Jan 16 01:20:00 EST 2012


On Mon, Dec 19, 2011 at 11:11 PM, Vladislav Bogdanov
<bubble at hoster-ok.com> wrote:
> 19.12.2011 14:39, Vladislav Bogdanov wrote:
>> 09.12.2011 08:44, Andrew Beekhof wrote:
>>> On Fri, Dec 9, 2011 at 3:16 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>> 09.12.2011 03:11, Andrew Beekhof wrote:
>>>>> On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>>>> Hi Andrew,
>>>>>>
>>>>>> I investigated on my test cluster what actually happens with dlm and
>>>>>> fencing.
>>>>>>
>>>>>> I added more debug messages to dlm dump, and also did a re-kick of nodes
>>>>>> after some time.
>>>>>>
>>>>>> Results are that stonith history actually doesn't contain any
>>>>>> information until pacemaker decides to fence node itself.
>>>>>
>>>>> ...
>>>>>
>>>>>> From my PoV that means that the call to
>>>>>> crm_terminate_member_no_mainloop() does not actually schedule fencing
>>>>>> operation.
>>>>>
>>>>> You're going to have to remind me... what does your copy of
>>>>> crm_terminate_member_no_mainloop() look like?
>>>>> This is with the non-cman editions of the controlds too right?
>>>>
>>>> Just the latest version from GitHub. You changed some dlm_controld.pcmk
>>>> functionality so that it asks stonithd for fencing results instead of
>>>> using XML magic, but the call to crm_terminate_member_no_mainloop()
>>>> remains the same there. And yes, that version communicates with
>>>> stonithd directly too.
>>>>
>>>> So, the problem here is just with crm_terminate_member_no_mainloop(),
>>>> which for some reason skips the actual fencing request.
>>>
>>> There should be some logs, either indicating that it tried, or that it failed.
>>
>> Nothing about fencing.
>> Only messages about history requests:
>>
>> stonith-ng: [1905]: info: stonith_command: Processed st_fence_history
>> from cluster-dlm: rc=0
>>
>> I even moved all the fencing code to dlm_controld to have better control
>> over what it does (and to avoid rebuilding pacemaker just to play with
>> that code). dlm_tool dump prints the same line every second, and
>> stonith-ng prints history requests.
>>
>> A little odd, but I once saw a fencing request from cluster-dlm succeed,
>> though only right after the node had been fenced by pacemaker. As a
>> result, the node was switched off instead of rebooted.
>>
>> That raises one more question: is it correct to call st->cmds->fence()
>> with the third parameter set to "off"?
>> I think that "reboot" is more consistent with the rest of the fencing
>> subsystem.
>>
>> At the same time, stonith_admin -B succeeds.
>> The main difference I see is st_opt_sync_call in the latter case.
>> I will try experimenting with it.
>
> Yeeeesssss!!!
>
> Now I see following:
> Dec 19 11:53:34 vd01-a cluster-dlm: [2474]: info:
> pacemaker_terminate_member: Requesting that node 1090782474/vd01-b be fenced

So the important question... what did you change?

> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info:
> initiate_remote_stonith_op: Initiating remote operation reboot for
> vd01-b: 21425fc0-4311-40fa-9647-525c3f258471
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
> vd01-c now has id: 1107559690
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_query from vd01-c: rc=0
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
> vd01-d now has id: 1124336906
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_query from vd01-d: rc=0
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_query from vd01-a: rc=0
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
> Requesting that vd01-c perform op reboot vd01-b
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
> vd01-b now has id: 1090782474
> ...
> Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_fence_history from cluster-dlm: rc=0
> Dec 19 11:53:40 vd01-a crmd: [1910]: info: tengine_stonith_notify: Peer
> vd01-b was terminated (reboot) by vd01-c for vd01-a
> (ref=21425fc0-4311-40fa-9647-525c3f258471): OK
>
> But then I see a minor issue: the node is marked to be fenced again:
> Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: pe_fence_node: Node vd01-b
> will be fenced because it is un-expectedly down

Do you have logs for that?
tengine_stonith_notify() got called; that should have been enough to
get the node cleaned up in the CIB.

> ...
> Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: stage6: Scheduling Node
> vd01-b for STONITH
> ...
> Dec 19 11:53:40 vd01-a crmd: [1910]: info: te_fence_node: Executing
> reboot fencing operation (249) on vd01-b (timeout=60000)
> ...
> Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
> Requesting that vd01-c perform op reboot vd01-b
>
> And so on.
>
> I can't investigate this one in more depth, because I use fence_xvm in
> this testing cluster, and it has issues when more than one stonith
> resource runs on a node. Also, my RA (in the cluster where this testing
> cluster runs) undefines the VM after a failure, so fence_xvm does not see
> the fencing victim in qpid and is unable to fence it again.
>
> Maybe it is possible to check whether the node was just fenced and skip
> the redundant fencing?

If the callbacks are being used correctly, it shouldn't be required.



