[Pacemaker] lrmadmin -C blocks on subsequent invocations

Dave Williams dave at opensourcesolutions.co.uk
Tue Nov 23 17:36:07 EST 2010


On 21:59, Mon 22 Nov 10, Dave Williams wrote:
> backtrace from gdb shows lrmd to be in a lock_wait
> #0  0x00007f7e5f8ba6b4 in __lll_lock_wait () from /lib/libpthread.so.0
> #1  0x00007f7e5f8b5849 in _L_lock_953 () from /lib/libpthread.so.0
> #2  0x00007f7e5f8b566b in pthread_mutex_lock () from /lib/libpthread.so.0
> #3  0x00007f7e601b0806 in g_main_context_find_source_by_id () from /lib/libglib-2.0.so.0
> #4  0x00007f7e601b08fe in g_source_remove () from /lib/libglib-2.0.so.0
> #5  0x00007f7e61568ba1 in G_main_del_IPC_Channel (chp=0x11deed0) at GSource.c:495
> #6  0x00000000004065a1 in on_remove_client (user_data=0x11df8e0) at lrmd.c:1526
> #7  0x00007f7e615694ca in G_CH_destroy_int (source=0x11deed0) at GSource.c:675
> #8  0x00007f7e601adc11 in ?? () from /lib/libglib-2.0.so.0
> #9  0x00007f7e601ae428 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
> #10 0x00007f7e601b22a8 in ?? () from /lib/libglib-2.0.so.0
> #11 0x00007f7e601b27b5 in g_main_loop_run () from /lib/libglib-2.0.so.0
> #12 0x0000000000405d32 in init_start () at lrmd.c:1267
> #13 0x0000000000404f7a in main (argc=1, argv=0x7fff91e24478) at lrmd.c:835
> 

OK - having spent an evening reading the source code, my understanding
is that when the lrmadmin client disconnects from lrmd's cmd socket
(having got what it needs), lrmd is left to tidy up by deleting the
client's event source from the GLib GMainContext loop. In the backtrace,
on_remove_client() calls G_main_del_IPC_Channel(), which calls
g_source_remove(), and that call then hangs deep inside GLib on a mutex
lock.
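
For illustration only: one generic way a single thread can end up parked
in pthread_mutex_lock(), as in frames #0 to #2 of the backtrace, is by
trying to take a non-recursive mutex that it already holds. Whether that
is what happens inside GLib here is not established. The sketch below
assumes nothing about lrmd or GLib internals; second_lock_attempt() is a
hypothetical helper, and it probes with pthread_mutex_trylock() so the
demonstration itself cannot hang.

```c
#define _XOPEN_SOURCE 700  /* expose PTHREAD_MUTEX_NORMAL / _RECURSIVE */
#include <errno.h>
#include <pthread.h>

/*
 * second_lock_attempt() is a hypothetical helper for illustration; it is
 * not part of lrmd, clplumbing or GLib.
 *
 * It locks a mutex of the given type, then probes a second acquisition
 * from the same thread with pthread_mutex_trylock(), so the probe itself
 * never blocks. It returns the trylock result: EBUSY means a blocking
 * pthread_mutex_lock() at that point would have waited on a lock the
 * caller already holds. Returns -1 if the mutex could not be set up.
 */
int second_lock_attempt(int mutex_type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, mutex_type) != 0 ||
        pthread_mutex_init(&mutex, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);          /* first acquisition */
    rc = pthread_mutex_trylock(&mutex);  /* second, non-blocking probe */
    if (rc == 0)
        pthread_mutex_unlock(&mutex);    /* recursive mutex: undo the probe */
    pthread_mutex_unlock(&mutex);        /* release the first acquisition */
    pthread_mutex_destroy(&mutex);
    return rc;
}
```

Called as second_lock_attempt(PTHREAD_MUTEX_NORMAL), the helper returns
EBUSY; with PTHREAD_MUTEX_RECURSIVE it returns 0, because a recursive
mutex allows its owner to lock it again.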

On the surface the overall sequence makes sense, but the hang doesn't and
clearly shouldn't happen. I am at a loss as to whether it is a GLib
issue (unlikely, I would have thought?) or an lrmd bug.

lrmd should NEVER hang! Can anyone help?

Are there any other mailing lists I can try?

Thanks
Dave