[ClusterLabs] [Question] The memory which crmd uses increases at every re-check.

Vladislav Bogdanov bubble at hoster-ok.com
Fri Mar 27 08:07:47 UTC 2015


27.03.2015 09:09, renayama19661014 at ybb.ne.jp wrote:
> Hi All,
>
> This memory increase seems to stop at some point in time.
> I investigated it, and it looks like growth caused by libqb's IPC
> communication (mmap).
> Is my understanding correct?

I tested Pacemaker's memory consumption a year or so ago and got
exactly the same results. Pacemaker uses libqb's shared-memory
buffers for inter-process communication, and each such buffer is
allocated step by step until a predefined limit (5 MB at that time,
I think) is reached. After that, the processes do not grow any
further in an otherwise static cluster. On the one hand this makes
memory-leak checks a somewhat complicated task; on the other, it
gives a performance boost. In any case, on Linux you can parse the
'smaps' pseudo-file in /proc/<pid> to get a detailed map of the
memory allocation. The region where application memory leaks usually
show up is marked [heap].
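For example, a quick way to sum the resident heap pages of a process is
something like the sketch below (Linux-only; it uses the shell's own PID
purely as an illustration, so substitute crmd's PID on a real node):

```shell
# Sum the resident [heap] pages of a process from /proc/<pid>/smaps.
# $$ (the shell's own PID) is used only as a stand-in; point pid at
# crmd's PID, e.g. pid=$(pidof crmd), to inspect the cluster daemon.
pid=$$
awk '
  /^[0-9a-f]+-[0-9a-f]+ /  { inheap = ($NF == "[heap]") }   # mapping header line
  inheap && /^Rss:/        { total += $2 }                  # Rss is reported in kB
  END                      { printf "heap RSS: %d kB\n", total }
' "/proc/$pid/smaps"
```

If the growth shows up in the shared/anonymous mmap regions rather than
in [heap], it is likely the libqb IPC buffers described above rather
than an application-level leak.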

Best,
Vladislav

>
> Best Regards,
> Hideo Yamauchi.
>
>
>
> ----- Original Message -----
>> From: "renayama19661014 at ybb.ne.jp" <renayama19661014 at ybb.ne.jp>
>> To: ClusterLabs-ML <users at clusterlabs.org>
>> Cc:
>> Date: 2015/3/23, Mon 10:48
>> Subject: [ClusterLabs] [Question] The memory which crmd uses increases at every re-check.
>>
>> Hi All,
>>
>> We ran Pacemaker for several days and observed a steady increase in the
>> memory used by the crmd process.
>>
>> The configuration is the simple one shown below, with a single node.
>>
>> ---------------------------------
>> [root@snmp1 ~]# cat /etc/redhat-release
>> Red Hat Enterprise Linux Server release 6.5 (Santiago)
>>
>> [root@snmp1 ~]# crm_mon -1 -Af
>> Last updated: Mon Mar 23 08:50:10 2015
>> Last change: Fri Mar 20 13:19:46 2015
>> Stack: corosync
>> Current DC: snmp1 (3232238180) - partition WITHOUT quorum
>> Version: 1.1.12-e32080b
>> 1 Nodes configured
>> 7 Resources configured
>>
>>
>> Online: [ snmp1 ]
>>
>>   Resource Group: grpNFSclient
>>       prmVIPcheck        (ocf::heartbeat:Dummy): Started snmp1
>>       prmIpNFSclient     (ocf::heartbeat:Dummy): Started snmp1
>>       prmFsNFSclient     (ocf::heartbeat:Dummy): Started snmp1
>>       prmInitRpcidmapd   (ocf::heartbeat:Dummy): Started snmp1
>>   Clone Set: clnDiskd [prmDiskd]
>>       Started: [ snmp1 ]
>>   Clone Set: clnPing [prmPing]
>>       Started: [ snmp1 ]
>>   Clone Set: clnRpcbind [prmRpcbind]
>>       Started: [ snmp1 ]
>>
>> Node Attributes:
>> * Node snmp1:
>>      + default_ping_set                  : 100
>>      + diskcheck_status_internal         : normal
>>      + ringnumber_0                      : 192.168.10.100 is UP
>>      + ringnumber_1                      : 192.168.20.100 is UP
>>
>> Migration summary:
>> * Node snmp1:
>> ---------------------------------
>>
>> However, we shortened the recheck interval:
>>
>> ---------------------------------
>> (snip)
>> property no-quorum-policy="ignore" \
>>          stonith-enabled="false" \
>>          cluster-recheck-interval="5s"
>> (snip)
>> ---------------------------------
>>
>> The memory used by crmd (its RSS size) increases at every recheck.
>>
>> ---------------------------------
>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>> Fri Mar 20 13:20:03 JST 2015
>> 496      24010  0.1  0.3 152944  8068 ?        Ss   13:19   0:00
>> /usr/libexec/pacemaker/crmd
>>
>> --------------
>> Fri Mar 20 13:21:56 JST 2015
>> 496      24010  0.0  0.4 152944  8744 ?        Ss   13:19   0:00
>> /usr/libexec/pacemaker/crmd
>> --------------
>> Fri Mar 20 13:32:15 JST 2015
>> 496      24010  0.0  0.5 152944 10712 ?        Ss   13:19   0:00
>> /usr/libexec/pacemaker/crmd
>> --------------
>> Fri Mar 20 14:44:57 JST 2015
>> 496      24010  0.0  0.7 152944 14256 ?        Ss   13:19   0:04
>> /usr/libexec/pacemaker/crmd
>> --------------
>> Fri Mar 20 15:19:30 JST 2015
>> 496      24010  0.0  0.7 152944 14564 ?        Ss   13:19   0:06
>> /usr/libexec/pacemaker/crmd
>> --------------
>> Mon Mar 23 08:47:52 JST 2015
>> 496      24010  0.0  0.9 152944 19100 ?        Ss   Mar20   3:25
>> /usr/libexec/pacemaker/crmd
>> [root@snmp1 ~]# date; free
>> Mon Mar 23 09:01:47 JST 2015
>>               total       used       free     shared    buffers     cached
>> Mem:       2029900    1255892     774008          0     225956     825204
>> -/+ buffers/cache:     204732    1825168
>> Swap:      1048568          0    1048568
>> --------------
>> Mon Mar 23 10:32:51 JST 2015
>> 496      24010  0.0  0.9 152944 19104 ?        Ss   Mar20   3:52
>> /usr/libexec/pacemaker/crmd
>> [root@snmp1 ~]# date; free
>> Mon Mar 23 10:34:09 JST 2015
>>               total       used       free     shared    buffers     cached
>> Mem:       2029900    1264108     765792          0     225996     833128
>> -/+ buffers/cache:     204984    1824916
>> Swap:      1048568          0    1048568
>> ---------------------------------
>>
>> The memory continues to increase after this.
>>   * The interval between increases lengthens to roughly 1-2 hours.
>>
>>
>> This memory increase also seems to occur in other processes.
>> Is this memory increase a problem?
>> Which processing causes this memory increase?
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>




