[Pacemaker] pacemaker processes RSS growth

Andrew Beekhof andrew at beekhof.net
Wed Sep 12 03:35:06 EDT 2012


On Tue, Sep 11, 2012 at 6:14 PM, Vladislav Bogdanov
<bubble at hoster-ok.com> wrote:
> 07.09.2012 09:25, Vladislav Bogdanov wrote:
>> 06.09.2012 12:58, Vladislav Bogdanov wrote:
>> ...
>>> lrmd seems not to clean up gio channels properly:
>>
>> I prefer to call g_io_channel_unref() right after g_io_add_watch_full()
>> instead of doing so when deleting descriptor (g_source_remove() is
>> enough there). So channel will be automatically freed after watch fd is
>> removed from poll with g_source_remove(). If I need to replace watch
>> flags, I call g_io_channel_ref()/g_io_channel_unref() around
>> g_source_remove()/g_io_add_watch_full() pair. This would require some
>> care and testing though. I have spent two days on this when writing my
>> first non-blocking server based on glib and gio. This code now works
>> like a charm with approach outlined above.
>
> Does it make some sense?

Yes, I just needed to read it a few times before it sunk in :)
The interactions are pretty complex, before I commit anything I'd like
to be able to verify the refcounts are correct... how do you find out
the refcount for these glib objects?

>
> g_io_add_watch_full() refs channel, so after channel is created and its
> fd is put to mainloop with g_io_add_watch_full, channel has 2 in
> refcount. Thus, it is not freed after g_source_remove() is called once.
>
>>
>>>
>>> ==1734== 8,946 (8,520 direct, 426 indirect) bytes in 71 blocks are
>>> definitely lost in loss record 147 of 152
>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>> ==1734==    by 0x5067870: services_os_action_execute (services_linux.c:456)
>>> ==1734==    by 0x403AA6: lrmd_rsc_dispatch (lrmd.c:696)
>>> ==1734==    by 0x4E513C2: crm_trigger_dispatch (mainloop.c:105)
>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x402427: main (main.c:302)
>>> ==1734==
>>> ==1734== 8,946 (8,520 direct, 426 indirect) bytes in 71 blocks are
>>> definitely lost in loss record 148 of 152
>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>> ==1734==    by 0x50678AE: services_os_action_execute (services_linux.c:465)
>>> ==1734==    by 0x403AA6: lrmd_rsc_dispatch (lrmd.c:696)
>>> ==1734==    by 0x4E513C2: crm_trigger_dispatch (mainloop.c:105)
>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x402427: main (main.c:302)
>>> ==1734==
>>> ==1734== 65,394 (62,280 direct, 3,114 indirect) bytes in 519 blocks are
>>> definitely lost in loss record 151 of 152
>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>> ==1734==    by 0x5067870: services_os_action_execute (services_linux.c:456)
>>> ==1734==    by 0x50676B4: recurring_action_timer (services_linux.c:212)
>>> ==1734==    by 0x719161A: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x402427: main (main.c:302)
>>> ==1734==
>>> ==1734== 65,394 (62,280 direct, 3,114 indirect) bytes in 519 blocks are
>>> definitely lost in loss record 152 of 152
>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>> ==1734==    by 0x50678AE: services_os_action_execute (services_linux.c:465)
>>> ==1734==    by 0x50676B4: recurring_action_timer (services_linux.c:212)
>>> ==1734==    by 0x719161A: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>> /lib64/libglib-2.0.so.0.2200.5)
>>> ==1734==    by 0x402427: main (main.c:302)
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




More information about the Pacemaker mailing list