[Pacemaker] High CIB load on DC election

Andrew Beekhof andrew at beekhof.net
Mon Sep 29 03:12:59 EDT 2014


On 22 Sep 2014, at 11:22 pm, Cédric Dufour - Idiap Research Institute <cedric.dufour at idiap.ch> wrote:

> Hello again,
> 
> My PM 1.1.12 cluster is quite large: 22 nodes, ~300 resources.
> 
> When gracefully shutting down the current DC (i.e. move resources elsewhere, node standby, pacemaker stop, corosync stop), the CIB load increases - on the slowest nodes to close to 100% - until the new DC gets elected.
> What explains this phenomenon?

Not a lot, since the DC doesn't co-ordinate cib writes in 1.1.12 and is generally far less CPU hungry.
If you had said the crmd load goes crazy, that would be different.

Logs?

> (What could I do to limit/circumvent it?)
> 
> In parallel, when this happens, on those nodes that display the "throttle_mode: High CIB load detected" message, my "ping" (network connectivity) RA times out without obvious explanation (the RA timeout is conservative enough, compared to the ping timeout/attempts, that it should never kick in). Looking at the code of ".../resource.d/pacemaker/ping", I suspect - though I may be wrong - that the culprit is "attrd_updater".
> Hypothesis: "attrd_updater" doesn't return immediately, as it is supposed to, because of the high CIB load.
> Does this hypothesis make sense?

Certainly plausible.  I'd like to see logs before I guess further though.

> (PS: it is very difficult for me to reproduce/debug this issue, since it shows up on my production cluster and I cannot experiment there without risking havoc with my services)
> 
> Thank you very much for your response(s)
> 
> Best,
> 
> Cédric
> 
> -- 
> Cédric Dufour @ Idiap Research Institute
