[ClusterLabs] Corosync CPU load slowly increasing if one node present

Stefan Kohlhauser stefan.lists at gmx.net
Thu Apr 27 10:13:33 CEST 2017


Hello everyone!

I am using Pacemaker (1.1.12), Corosync (2.3.0) and libqb (0.16.0) in 2-node clusters (virtualized in VMware infrastructure, OS: RHEL 6.7).
I noticed that if only one node is present, the CPU usage of Corosync (as seen with top) increases slowly but steadily over days; in my setup by about 1% per day. The node is basically idle: some Pacemaker-managed resources are running, but no clients contact them.
I upgraded a stand-alone test node to Corosync (2.4.2) and libqb (1.0.1), which at least made the memory leak go away, but the CPU usage still increases on that node.
When I add a second node to the cluster, the CPU load drops back to a normal (low) level.
I have not yet seen the CPU load increase when two nodes are present in a cluster.
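
The growth rate described above could be quantified with a sketch like the following, rather than read off top by eye. It assumes a Linux /proc filesystem; the function names are illustrative and not part of Corosync or Pacemaker, and the least-squares fit is just one simple way to estimate a trend.

```python
import os


def read_cpu_seconds(pid):
    """Return the total CPU seconds (user + system) a process has consumed."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so only the part after the last ")" is split into fields.
    fields = data.rsplit(")", 1)[1].split()
    # utime and stime are fields 14 and 15 in proc(5); after dropping
    # pid and comm they sit at indexes 11 and 12.
    utime, stime = int(fields[11]), int(fields[12])
    return (utime + stime) / os.sysconf("SC_CLK_TCK")


def cpu_percent_series(samples):
    """Convert (wall_time_s, cpu_seconds) samples into (wall_time_s, percent)."""
    points = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        points.append((t1, 100.0 * (c1 - c0) / (t1 - t0)))
    return points


def slope_per_day(points):
    """Least-squares slope of CPU percent over time, in points per day."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_p = sum(p for _, p in points) / n
    num = sum((t - mean_t) * (p - mean_p) for t, p in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return num / den * 86400.0
```

One possible use: record (time.time(), read_cpu_seconds(pid)) for the corosync PID every few minutes, pass the list to cpu_percent_series(), and feed the result to slope_per_day() to compare the single-node and two-node cases.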

Even if running Pacemaker/Corosync as a massively overkill Monit replacement is questionable, the observed CPU load is not what I would expect.

What could be the reason for this CPU load increase? Is there a rationale behind this?
Is this a configuration issue or something in the binaries?

BR, Stefan

My corosync.conf:

# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
        user:root
        group:root
}

totem {
        version: 2

        # Security configuration
        secauth: on
        threads: 0

        # Timeout for token
        token: 1000
        token_retransmits_before_loss_const: 4

        # Number of messages that may be sent by one processor on receipt of the token
        max_messages: 20

        # How long to wait for join messages in the membership protocol (ms)
        join: 50
        consensus: 1200

        # Turn off the virtual synchrony filter
        vsftype: none

        # Stagger sending the node join messages by 1..send_join ms
        send_join: 50

        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes

        # Interface configuration
        rrp_mode: passive
        interface {
                ringnumber: 0
                bindnetaddr: 10.20.30.0
                mcastaddr: 226.95.30.100
                mcastport: 5510
        }
        interface {
                ringnumber: 1
                bindnetaddr: 10.20.31.0
                mcastaddr: 226.95.31.100
                mcastport: 5510
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: no
        to_syslog: yes
        syslog_facility: local3
        debug: off
}

amf {
        mode: disabled
}

quorum {
        provider: corosync_votequorum
        expected_votes: 1
}


