[Pacemaker] pacemaker processes RSS growth

Vladislav Bogdanov bubble at hoster-ok.com
Wed Sep 12 10:30:48 EDT 2012


12.09.2012 10:35, Andrew Beekhof wrote:
> On Tue, Sep 11, 2012 at 6:14 PM, Vladislav Bogdanov
> <bubble at hoster-ok.com> wrote:
>> 07.09.2012 09:25, Vladislav Bogdanov wrote:
>>> 06.09.2012 12:58, Vladislav Bogdanov wrote:
>>> ...
>>>> lrmd seems not to clean up gio channels properly:
>>>
>>> I prefer to call g_io_channel_unref() right after g_io_add_watch_full(),
>>> instead of doing so when deleting the descriptor (g_source_remove() is
>>> enough there). That way the channel is automatically freed once its
>>> watch fd is removed from poll with g_source_remove(). If I need to
>>> change the watch flags, I call g_io_channel_ref()/g_io_channel_unref()
>>> around the g_source_remove()/g_io_add_watch_full() pair. This would
>>> require some care and testing, though. I spent two days on this when
>>> writing my first non-blocking server based on glib and gio, and that
>>> code now works like a charm with the approach outlined above.
>>
>> Does it make some sense?
> 
> Yes, I just needed to read it a few times before it sunk in :)
> The interactions are pretty complex, before I commit anything I'd like
> to be able to verify the refcounts are correct... how do you find out
> the refcount for these glib objects?

Frankly speaking, just by trial and error while reading through the glib
code. The glib structures contain a ref_count field, and it is possible
to reach it with some hacks.
The main point with GIOChannel is that both g_io_channel_unix_new() and
g_io_add_watch_full() take a reference; IIRC this is documented in the
glib sources.

You can send me patches if you prefer some independent testing before
committing. Hopefully I'll be able to test them quickly.

> 
>>
>> g_io_add_watch_full() refs the channel, so after the channel is created
>> and its fd is added to the mainloop with g_io_add_watch_full(), the
>> channel's refcount is 2. Thus it is not freed after g_source_remove()
>> is called once.
>>
>>>
>>>>
>>>> ==1734== 8,946 (8,520 direct, 426 indirect) bytes in 71 blocks are
>>>> definitely lost in loss record 147 of 152
>>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>>> ==1734==    by 0x5067870: services_os_action_execute (services_linux.c:456)
>>>> ==1734==    by 0x403AA6: lrmd_rsc_dispatch (lrmd.c:696)
>>>> ==1734==    by 0x4E513C2: crm_trigger_dispatch (mainloop.c:105)
>>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x402427: main (main.c:302)
>>>> ==1734==
>>>> ==1734== 8,946 (8,520 direct, 426 indirect) bytes in 71 blocks are
>>>> definitely lost in loss record 148 of 152
>>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>>> ==1734==    by 0x50678AE: services_os_action_execute (services_linux.c:465)
>>>> ==1734==    by 0x403AA6: lrmd_rsc_dispatch (lrmd.c:696)
>>>> ==1734==    by 0x4E513C2: crm_trigger_dispatch (mainloop.c:105)
>>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x402427: main (main.c:302)
>>>> ==1734==
>>>> ==1734== 65,394 (62,280 direct, 3,114 indirect) bytes in 519 blocks are
>>>> definitely lost in loss record 151 of 152
>>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>>> ==1734==    by 0x5067870: services_os_action_execute (services_linux.c:456)
>>>> ==1734==    by 0x50676B4: recurring_action_timer (services_linux.c:212)
>>>> ==1734==    by 0x719161A: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x402427: main (main.c:302)
>>>> ==1734==
>>>> ==1734== 65,394 (62,280 direct, 3,114 indirect) bytes in 519 blocks are
>>>> definitely lost in loss record 152 of 152
>>>> ==1734==    at 0x4C26FDE: malloc (vg_replace_malloc.c:236)
>>>> ==1734==    by 0x71997D2: g_malloc (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x71C67F4: g_io_channel_unix_new (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x4E52470: mainloop_add_fd (mainloop.c:660)
>>>> ==1734==    by 0x50678AE: services_os_action_execute (services_linux.c:465)
>>>> ==1734==    by 0x50676B4: recurring_action_timer (services_linux.c:212)
>>>> ==1734==    by 0x719161A: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7190F0D: g_main_context_dispatch (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x7194D54: g_main_loop_run (in
>>>> /lib64/libglib-2.0.so.0.2200.5)
>>>> ==1734==    by 0x402427: main (main.c:302)
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>
>>




