[Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

Florian Haas florian at hastexo.com
Wed Mar 14 12:33:43 EDT 2012


On Wed, Mar 14, 2012 at 4:58 PM, Phillip Frost
<phil at macprofessionals.com> wrote:
>> Can you confirm that you're running the ~bpo60+2 (note trailing "2")
>> build, that you're actually running an lrmd binary from that version
>> (meaning: that you properly killed your lrmd prior to installing that
>> package), _and_ that "lrmadmin -C" does *not* list "upstart"?
>
> Let's discard all of my previous conclusions. Apparently I was confused.
>
> Now, I'm sure I'm running +2 on all three nodes. And, I restarted pacemaker and corosync on all the nodes. I'm basing my knowledge of what versions I'm running on apt-cache policy, output copied below.

"dpkg -l <package>" would also tell you what versions you have
installed, in a more concise fashion.
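
A minimal sketch of that check, in case it helps. The package names
(pacemaker, corosync, cluster-glue) are my assumption about what is
relevant on your nodes, and installed_versions is just a hypothetical
helper name; adjust both as needed.

```shell
# Reduce "dpkg -l" output to "name version" pairs for packages that are
# fully installed (status "ii" in the first column).
installed_versions() {
    awk '$1 == "ii" { print $2, $3 }'
}

# Package names below are assumptions; substitute the ones you care about.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l pacemaker corosync cluster-glue 2>/dev/null | installed_versions
fi
```

Running this on each node and comparing the output should show quickly
whether all of them carry the same build.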

> I can confirm that lrmadmin -C does not list upstart (also below). Nor does it leak sockets, as reported by "lsof -f | grep lrm_callback_sock".

Yep, no surprise here.

> However, sometimes pacemakerd will not stop cleanly.

OK. Whether or not this is related to your original problem is a
completely open question, jftr.

> I thought it might happen when stopping pacemaker on the current DC, but after successfully reproducing this failure twice, I couldn't do it again. Pacemakerd seems to exit, but fails to notify the other nodes of its shutdown. Syslog is flooded with "Retransmit List" messages (log attached). These persist until I stop corosync. Asked immediately after stopping pacemaker and corosync on one node, "crm status" on the other nodes will report that node as still online. After a while, the stopped node switches to offline; I assume some timeout is expiring and they are assuming it crashed.

You didn't give much other information, so I'm asking this on a hunch:
does your pacemaker service configuration stanza for corosync (either
in /etc/corosync/corosync.conf or in
/etc/corosync/service.d/pacemaker) say "ver: 0" or "ver: 1"?
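
A quick way to look, sketched under the assumption that the stanza uses
the usual "ver: N" line inside a service { } block. The function name
show_pacemaker_ver is hypothetical; the two paths are the ones mentioned
above.

```shell
# Print every "ver:" line found in the given files, prefixed with the
# file name. Files that do not exist are skipped silently.
show_pacemaker_ver() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        grep -H -E '^[[:space:]]*ver:' "$f" || true   # no match is not an error
    done
}

show_pacemaker_ver /etc/corosync/corosync.conf /etc/corosync/service.d/pacemaker
```

With "ver: 0", corosync launches the Pacemaker daemons itself as a
plugin; with "ver: 1", pacemakerd has to be started and stopped as a
separate service.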

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



