[Pacemaker] Primitive stuck after resource agent failure?

Jody McIntyre jodym at trustcentric.com
Fri Feb 18 10:59:52 EST 2011


[Sorry for the partial message I sent earlier.  Here's the full one.]

I am attempting to write my own resource agent to support PostgreSQL WAL
shipping.  A bug in the resource agent script left my PostgreSQL primitive
stuck in a FAILED state.  I have since fixed the bug, but I can't figure out
how to get the primitive working again.

I tried moving it to another node:
root@trustcentric2:~# crm resource move PostgreSQL trustcentric1

This does not give an error, but the primitive is still on trustcentric2:

root@trustcentric1:~# crm_mon -1
============
Last updated: Fri Feb 18 07:49:13 2011
Stack: Heartbeat
Current DC: trustcentric2 (28ebee49-31c7-419e-a29a-c939c3a241bd) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ trustcentric1 trustcentric2 ]

 ClusterIP      (ocf::heartbeat:IPaddr2) Started [ trustcentric1 trustcentric2 ]
 PostgreSQL     (ocf::trustcentric:postgresql): Started trustcentric2 (unmanaged) FAILED

Failed actions:
    PostgreSQL_start_0 (node=trustcentric2, call=10, rc=-2, status=Timed Out): unknown exec error
    PostgreSQL_stop_0 (node=trustcentric2, call=11, rc=1, status=complete): unknown error

How do I get PostgreSQL running again?  I have attached an XML dump.

Thanks,
Jody
-------------- next part --------------
A non-text attachment was scrubbed...
Name: primitive_stuck.xml
Type: text/xml
Size: 10322 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20110218/31b7d797/attachment-0003.xml>

