[ClusterLabs] Syncing data and reducing CPU utilization of cib process

Nikhil Utane nikhil.subscribed at gmail.com
Mon Apr 3 12:14:21 CEST 2017


Here's the snapshot. As seen below, the messages are being logged more often
than once a second.
I checked that the cib.xml file was not updated (the file's timestamp did not
change).
Then I took a tcpdump capture and did not see any messages other than
keep-alives.
Is the cib process looping incorrectly?
I can share strace output if required.

Apr 03 14:48:28 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13559 (ratio 31:1ms
Apr 03 14:48:29 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13536 (ratio 31:1ms
Apr 03 14:48:29 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13551 (ratio 31:1ms
Apr 03 14:48:30 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13552 (ratio 31:1ms
Apr 03 14:48:31 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13537 (ratio 31:1ms
Apr 03 14:48:32 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13534 (ratio 31:1ms
Apr 03 14:48:32 [6372] 0005B932ED72        cib:     info:
crm_compress_string:  Compressed 427943 bytes into 13546 (ratio 31:1ms
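
To put a number on that frequency, here is a small Python sketch that reads
the timestamps from the snapshot above and reports the mean gap between
messages. The regular expression and the function names are my own, and they
assume the "Mon DD HH:MM:SS" prefix seen in these lines; nothing here comes
from Pacemaker itself.

```python
import re
from collections import Counter

# Prefix lines copied from the snapshot above, one per crm_compress_string message.
SNAPSHOT = [
    "Apr 03 14:48:28 [6372] 0005B932ED72        cib:     info:",
    "Apr 03 14:48:29 [6372] 0005B932ED72        cib:     info:",
    "Apr 03 14:48:29 [6372] 0005B932ED72        cib:     info:",
    "Apr 03 14:48:30 [6372] 0005B932ED72        cib:     info:",
    "Apr 03 14:48:31 [6372] 0005B932ED72        cib:     info:",
    "Apr 03 14:48:32 [6372] 0005B932ED72        cib:     info:",
    "Apr 03 14:48:32 [6372] 0005B932ED72        cib:     info:",
]

# Assumed layout: "Mon DD HH:MM:SS ..." at the start of each line.
STAMP = re.compile(r"^[A-Z][a-z]{2} \d{2} (\d{2}):(\d{2}):(\d{2}) ")


def seconds_of_day(line):
    """Convert the HH:MM:SS part of a log line to seconds since midnight."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def mean_gap(lines):
    """Average seconds between consecutive messages (lines from one day only)."""
    stamps = [seconds_of_day(line) for line in lines]
    if len(stamps) < 2:
        raise ValueError("need at least two log lines")
    return (stamps[-1] - stamps[0]) / (len(stamps) - 1)


def busiest_second(lines):
    """Largest number of messages that share a single timestamp."""
    counts = Counter(seconds_of_day(line) for line in lines)
    return max(counts.values())


if __name__ == "__main__":
    print(f"mean gap: {mean_gap(SNAPSHOT):.2f}s")
    print(f"max messages in one second: {busiest_second(SNAPSHOT)}")
```

For the seven lines above this gives a mean gap of about 0.67 seconds, with
at most two messages sharing one timestamp.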

-Regards
Nikhil

On Mon, Apr 3, 2017 at 11:38 AM, Nikhil Utane <nikhil.subscribed at gmail.com>
wrote:

> Ken,
>
> The CIB file is not being updated that often.
> I took a packet capture and don't see the node sending any message to
> other nodes (other than keep-alives).
> What then explains these messages coming every second?
>
> -Regards
> Nikhil
>
> On Sat, Apr 1, 2017 at 1:37 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>
>> On 03/31/2017 06:44 AM, Nikhil Utane wrote:
>> > We are seeing this log in pacemaker.log continuously.
>> >
>> > Mar 31 17:13:01 [6372] 0005B932ED72        cib:     info:
>> > crm_compress_string:  Compressed 436756 bytes into 14635 (ratio 29:1) in
>> > 284ms
>> >
>> > This looks to be the reason for high CPU. What does this log indicate?
>>
>> If a cluster message is larger than 128KB, pacemaker will compress it
>> (using BZ2) before transmitting it across the network to the other
>> nodes. This can hit the CPU significantly. Having a large resource
>> definition makes such messages more common.
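
A rough way to get a feel for that cost outside the cluster is to compress a
similarly sized blob with Python's bz2 module and time it. This is only a
sketch: the payload is synthetic, the element and attribute names are
invented, the constant name is made up, and Python's bz2 binding is not the
code path Pacemaker uses, so the timing will differ from the figure in the
log.

```python
import bz2
import time

# Threshold taken from the explanation above; the constant name is hypothetical.
COMPRESS_THRESHOLD = 128 * 1024


def build_payload(n_attrs=300, value_len=1300):
    """Build a synthetic XML-like blob of roughly 400 KB (names are invented)."""
    value = "v" * value_len
    rows = [
        f'<nvpair id="attr-{i}" name="param_{i}" value="{value}"/>'
        for i in range(n_attrs)
    ]
    return ("<attributes>" + "".join(rows) + "</attributes>").encode("utf-8")


def maybe_compress(message):
    """Return (wire_bytes, was_compressed); compress only above the threshold."""
    if len(message) <= COMPRESS_THRESHOLD:
        return message, False
    return bz2.compress(message), True


if __name__ == "__main__":
    payload = build_payload()
    start = time.perf_counter()
    wire, compressed = maybe_compress(payload)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{len(payload)} bytes -> {len(wire)} bytes "
          f"(compressed={compressed}) in {elapsed_ms:.1f}ms")
```

Running it with a smaller n_attrs shows the point at which the message drops
under the threshold and is sent as-is.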
>>
>> There are many ways to sync a configuration file between nodes. If the
>> configuration rarely changes, a simple rsync cron could do it.
>> Specialized tools like lsyncd are more responsive while still having a
>> minimal footprint. DRBD or shared storage would be more powerful and
>> real-time. If it's a custom app, you could even modify it to use
>> something like etcd or a NoSQL db.
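
Whichever tool copies the file, the application side could look something
like the following Python sketch: the active node writes its parameters to
one file atomically, and the standby turns that file into environment
variables on takeover. The function names and the JSON format are
assumptions for illustration, not anything Pacemaker or the tools above
prescribe.

```python
import json
import os
import tempfile


def write_config(params, path):
    """Write params as JSON so a sync tool never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(params, handle, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        # Atomic rename: readers see either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_env(path, base_env=None):
    """Merge the saved parameters into an environment dict for the service."""
    with open(path) as handle:
        params = json.load(handle)
    env = dict(os.environ if base_env is None else base_env)
    env.update({str(key): str(value) for key, value in params.items()})
    return env


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    config_path = os.path.join(workdir, "app-config.json")  # hypothetical name
    write_config({"LISTEN_PORT": 8080, "MODE": "active"}, config_path)
    print(load_env(config_path, base_env={}))
```

With this layout only the file path would need to be stored as a resource
attribute, and the file itself travels by rsync, lsyncd, DRBD or similar.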
>>
>> >
>> > -Regards
>> > Nikhil
>> >
>> >
>> > On Fri, Mar 31, 2017 at 12:08 PM, Nikhil Utane
>> > <nikhil.subscribed at gmail.com> wrote:
>> >
>> >     Hi,
>> >
>> >     In our current design (which we plan to improve upon) we are using
>> >     the CIB file to synchronize information across active and standby
>> >     nodes.
>> >     Basically we want the standby node to take the configuration that
>> >     was used by the active node so we are adding those as resource
>> >     attributes. This ensures that when the standby node takes over, it
>> >     can read all the configuration which will be passed to it as
>> >     environment variables.
>> >     Initially we thought the list of configuration parameters would be
>> >     short; we did some prototyping and saw that there wasn't much of
>> >     an issue. But the list has now grown to close to 300
>> >     attributes. (I know this is abusing the feature, and we are
>> >     looking to do it the right way.)
>> >
>> >     So I have two questions:
>> >     1) What is the best way to synchronize this kind of information
>> >     across nodes in the cluster? DRBD? Anything else that is simpler?
>> >     For example, instead of syncing 300 attributes I could just sync
>> >     the path to a file.
>> >
>> >     2) In the current design, is there anything that I can do to reduce
>> >     the CPU utilization of the cib process? Currently it regularly takes
>> >     30-50% of the CPU.
>> >     Is there any quick fix that would bring it down? For example,
>> >     configuring how often it synchronizes?
>> >
>> >     -Thanks
>> >     Nikhil
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
>