[Pacemaker] pacemaker/dlm problems

Mon Sep 26 08:16:00 UTC 2011

On Mon, Sep 26, 2011 at 5:38 PM, Vladislav Bogdanov
<bubble at hoster-ok.com> wrote:
> Hi Andrew,
>
> 26.09.2011 10:10, Andrew Beekhof wrote:
>> On Tue, Sep 6, 2011 at 5:27 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>> Hi Andrew, hi all,
>>>
>>> I'm further investigating dlm lockspace hangs I described in
>>> https://www.redhat.com/archives/cluster-devel/2011-August/msg00133.html
>>> and in the thread starting from
>>> https://lists.linux-foundation.org/pipermail/openais/2011-September/016701.html
>>> .
>>>
>>> What I described there is a setup which involves pacemaker-1.1.6 with
>>> corosync-1.4.1 and dlm_controld.pcmk from cluster-3.0.17 (without cman).
>>> I use the openais stack for pacemaker.
>>>
>>> I found that it is possible to reproduce the dlm kern_stop state across
>>> the whole cluster with iptables on just one node. It is sufficient to
>>> block all (or just corosync-specific) incoming/outgoing UDP for several
>>> seconds (the exact time probably depends on corosync settings). In my
>>> case I reproduced the hang with a 3-second traffic block:
>>> iptables -I INPUT 1 -p udp -j REJECT; \
>>> iptables -I OUTPUT 1 -p udp -j REJECT; \
>>> sleep 3; \
>>> iptables -D INPUT 1; \
>>> iptables -D OUTPUT 1
>>>
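A slightly more defensive version of the reproduction above, as a rough sketch only. It removes the REJECT rules by specification rather than by position, and installs a trap so the node is not left cut off if the script is interrupted. DRY_RUN, BLOCK_SECONDS and the function names are illustrative and not part of the original report; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, BLOCK_SECONDS, run(), unblock_udp() and
# block_udp_briefly() are hypothetical names, not from the original report.
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands, change nothing
BLOCK_SECONDS="${BLOCK_SECONDS:-3}"  # how long UDP stays blocked

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Delete the rules by specification, so the right rules are removed even
# if something else was inserted at position 1 in the meantime.
unblock_udp() {
    run iptables -D INPUT -p udp -j REJECT
    run iptables -D OUTPUT -p udp -j REJECT
}

block_udp_briefly() {
    # Make sure the rules are removed if the script exits early.
    trap 'unblock_udp' EXIT
    trap 'exit 1' INT TERM
    run iptables -I INPUT 1 -p udp -j REJECT
    run iptables -I OUTPUT 1 -p udp -j REJECT
    run sleep "$BLOCK_SECONDS"
    unblock_udp
    trap - EXIT INT TERM
}

block_udp_briefly
```

Running it as root with DRY_RUN=0 on one node performs the actual block; BLOCK_SECONDS can be raised to match the corosync timeouts in use.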
>>> I tried to make dlm_controld schedule fencing on the CPG_REASON_NODEDOWN
>>> event (just to see whether it helps with the problems I described in the
>>> posts referenced above), but without much success. The following code
>>> does not work:
>>>
>>>    int fd = pcmk_cluster_fd;
>>>    int rc = crm_terminate_member_no_mainloop(nodeid, NULL, &fd);
>>>
>>> I get a "Could not kick node XXX from the cluster" message accompanied
>>> by "No connection to the cluster". That means that
>>> attrd_update_no_mainloop() fails.
>>>
>>> Andrew, could you please give some pointers on why it may fail? I'd then
>>> try to fix dlm_controld. I do not see any uses of that function other
>>> than in dlm_controld.pcmk.
>>
>> I can't think of anything except that attrd might not be running.  Is it?
>
> Will recheck.
>
>>
>> Regardless, for 1.1.6 the dlm would be better off making a call like:
>>
>>           rc = st->cmds->fence(st, st_opts, target, "reboot", 120);
>>
>> from fencing/admin.c
>>
>> That would talk directly to the fencing daemon, bypassing attrd, crmd
>> and PE - and thus be more reliable.
>>
>> This is what the cman plugin will be doing soon too.
>
> Great to know, I'll try that in the near future. Thank you very much for
> the pointer.

1.1.7 will actually make use of this API regardless of any *_controld
changes - I'm in the middle of updating the two library functions they
use (crm_terminate_member and crm_terminate_member_no_mainloop).
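
To check by hand that the fencing daemon can be reached without going through attrd, something like the following may help. It is only a sketch: it assumes the installed stonith_admin (the tool built from fencing/admin.c) accepts --reboot, as current Pacemaker releases do, and node2 is a placeholder target name. DRY_RUN and fence_node are illustrative names; with the default DRY_RUN=1 the command is only printed.

```shell
#!/bin/sh
# Sketch only: DRY_RUN and fence_node() are hypothetical helpers, and
# node2 is a placeholder node name.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, do not fence anything

fence_node() {
    target="$1"
    if [ -z "$target" ]; then
        echo "usage: fence_node <node-name>" >&2
        return 2
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "stonith_admin --reboot $target"
    else
        # Assumption: this stonith_admin build supports --reboot.
        stonith_admin --reboot "$target"
    fi
}

fence_node node2
```

If the real command succeeds while crm_terminate_member_no_mainloop() still fails, that would point at the attrd connection rather than at fencing itself.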

>
>>
>>>
>>> I agree with Jiaju
>>> (https://lists.linux-foundation.org/pipermail/openais/2011-September/016713.html),
>>> that this could be solely a pacemaker problem, because pacemaker should
>>> probably initiate fencing itself in such a situation.
>>>
>>> So, using pacemaker/dlm with the openais stack is currently risky due to
>>> possible hangs of dlm lockspaces.
>>
>> It shouldn't be, failing to connect to attrd is very unusual.
>
> By the way, one of the underlying problems, which actually made me notice
> all this, is that a pacemaker cluster does not fence its DC if it leaves
> the cluster for a very short time. That is what Jiaju said in his notes,
> and I can confirm that.

That's highly surprising.  Do the logs you sent display this behaviour?

>
>>
>>> Originally I got it due to heavy load
>>> on one of the cluster nodes (actually on a host which has that cluster
>>> node running as a virtual guest).
>>>
>>> Ok, I switched to cman to see if it helps. Fencing is configured in
>>> pacemaker, not in cluster.conf.
>>>
>>> Things became even worse ;( .
>>>
>>> Although it took 25 seconds instead of 3 to break the cluster (I
>>> understand that it is almost impossible to load a host that much, but
>>> anyway), I then got a real nightmare: two nodes of the 3-node cluster had
>>> cman stopped (and pacemaker too, because of the cman connection loss).
>>> They each asked to kick_node_from_cluster() for the other, and that
>>> succeeded. But fencing didn't happen (I still need to look into why, but
>>> this is cman specific).
>>> The remaining node had pacemaker hung. It didn't even notice the cluster
>>> infrastructure change: the down nodes were listed as online, one of them
>>> was the DC, and all resources were marked as started on all nodes (the
>>> down ones too). There were no log entries from pacemaker at all.
>>
>> Well I can't see any logs from anyone so it's hard for me to comment.
>
> Logs are sent privately.
>
>>
>>> So, from my PoV cman+pacemaker is not currently suitable for HA tasks either.
>>>
>>> That means that both possible alternatives are currently unusable if one
>>> needs a self-repairing pacemaker cluster with dlm support ;( That is
>>> really regrettable.
>>>
>>> I can provide all needed information and really hope that it is possible
>>> to fix both issues:
>>> * dlm blockage with openais and
>>> * pacemaker lock with cman and no fencing from within dlm_controld
>>>
>>> I think both issues are really high priority, because it is definitely
>>> not acceptable when problems with load on one cluster node (or with link
>>> to that node) lead to a total cluster lock or even crash.
>>>
>>> I also offer any possible assistance from my side (e.g. patch trials
>>> etc.) to get that all fixed. I can run either openais or cman and can
>>> quickly switch between those stacks.
>>>
>>> Sorry for not being brief,
>>>
>>> Best regards,
>>> Vladislav
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>
>>