[Pacemaker] lrmd hanging

coredump jose.junior at gmail.com
Wed Nov 30 13:16:40 UTC 2011


Last night I was supposed to get a cluster running. Everything had
worked fine in a virtual environment using the same software, and in my
experience I only had to install pacemaker and corosync (from the
Ubuntu 10.04 PPA) and get it rolling. What actually happened was: I
could use crm configure to set cluster properties such as resource
stickiness and quorum, and to disable stonith. But when I tried to add
primitives, crm just hung there, neither returning an error nor
completing.
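
For readers trying to reproduce this, here is a minimal sketch of the kind
of crm commands described above. The values, the function name
apply_basic_config, and the "vip" primitive with its IP address are
hypothetical examples, not copied from the actual session.

```shell
#!/bin/sh
# Sketch only: values and the "vip" primitive are hypothetical examples.
apply_basic_config() {
    # Cluster-wide properties (the kind of step that worked).
    crm configure property stonith-enabled=false
    crm configure property no-quorum-policy=ignore
    # Default stickiness applied to all resources.
    crm configure rsc_defaults resource-stickiness=100
    # Adding a primitive (the kind of step reported to hang).
    crm configure primitive vip ocf:heartbeat:IPaddr2 \
        params ip=192.0.2.10 cidr_netmask=24 \
        op monitor interval=30s
}
```

Calling apply_basic_config on a test node runs the four commands in order,
so a hang on the last one would match the behaviour described here.
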
I noticed these two entries in the log every time crm tries to
configure something for the first time:

Nov 30 05:33:26 server lrmd: [18102]: debug: on_msg_register:client
lrmadmin [18159] registered
Nov 30 05:33:26 server lrmd: [18102]: debug: on_receive_cmd: the IPC
to client [pid:18159] disconnected.
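
Since the log shows an lrmadmin client registering and disconnecting, one
way to narrow this down might be to query lrmd directly, outside of crm.
The sketch below assumes that lrmadmin's -C option (list resource agent
classes) is available in the installed cluster-glue version and that the
coreutils timeout command is present. The function name probe_lrmd and the
LRMADMIN override variable are hypothetical.

```shell
#!/bin/sh
# Ask lrmd for its resource agent classes, giving up after a time limit.
# Usage: probe_lrmd [SECONDS]   (default 10)
probe_lrmd() {
    secs=${1:-10}
    rc=0
    # LRMADMIN can be set to point at a different binary (assumption: the
    # default "lrmadmin" is on PATH and accepts -C).
    timeout "$secs" ${LRMADMIN:-lrmadmin} -C >/dev/null 2>&1 || rc=$?
    case $rc in
        0)   echo "lrmd answered" ;;
        124) echo "no answer within ${secs}s (possible hang)" ;;
        *)   echo "lrmadmin failed with exit status $rc" ;;
    esac
    return $rc
}
```

If probe_lrmd reports no answer while crm is hanging, that would suggest the
problem lies in lrmd itself rather than in the crm shell.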

Also, when I stop corosync it sends a TERM signal to lrmd, but lrmd
doesn't exit, even after several minutes; I have to kill -9 it. I tried
to strace lrmd, but it is stuck on a futex, which doesn't really help
much:

Process 32764 attached - interrupt to quit
futex(0xe070d8, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>
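
A single blocked futex call says little on its own, so it may help to look
at every thread of the process. The sketch below reads the per-thread
entries under /proc (Linux only). The function name thread_states is
hypothetical, and the wchan file may be empty or "0" depending on kernel
version and privileges.

```shell
#!/bin/sh
# List every thread of a process with its scheduler state and the kernel
# function it is waiting in (wchan), one thread per line.
# Usage: thread_states PID
thread_states() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        # Skip the unexpanded glob if the process does not exist.
        [ -d "$task" ] || continue
        tid=${task##*/}
        # The "State:" line in status looks like "State:  S (sleeping)".
        state=$(awk '/^State:/ {print $2; exit}' "$task/status" 2>/dev/null)
        # wchan names the kernel function the thread is blocked in, if any.
        wchan=$(cat "$task/wchan" 2>/dev/null)
        printf '%s %s %s\n' "$tid" "${state:-?}" "${wchan:-?}"
    done
}
```

For the process above this would be run as thread_states 32764. If gdb and
debug symbols are installed, gdb -p 32764 -batch -ex 'thread apply all bt'
should additionally show a user-space backtrace for each thread.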

Does anyone have any idea what would make lrmd just hang?

[]s
core



