[Pacemaker] crmd restart due to internal error - pacemaker 1.1.8

pavan tc pavan.tc at gmail.com
Thu May 9 23:44:53 EDT 2013


On Fri, May 10, 2013 at 6:21 AM, Andrew Beekhof <andrew at beekhof.net> wrote:

>
> On 08/05/2013, at 9:16 PM, pavan tc <pavan.tc at gmail.com> wrote:
>
>
Hi Andrew,

Thanks much for looking into this. I have some queries inline.


> > Hi,
> >
> > I have a two-node cluster with STONITH disabled.
>
> That's not a good idea.
>

Ok. I'll try and configure stonith.

> > I am still running with the pcmk plugin as opposed to the recommended
> > CMAN plugin.
>
> On rhel6?
>

Yes.


>
> >
> > With 1.1.8, I see some messages (appended to this mail) once in a while.
> > I do not understand some keywords here - There is a "Leave" action. I am
> > not sure what that is.
>
> It means the cluster is not going to change the state of the resource.
>

Why did the cluster execute the "Leave" action at this point? Is there some
other error that triggers this? Or is it a benign message?


> > And, there is a CIB update failure that leads to a RECOVER action. There
> > is a message that says the RECOVER action is not supported. Finally this
> > leads to a stop and start of my resource.
>
> Well, and also Pacemaker's crmd process.
> My guess... the node is overloaded which is causing the cib queries to
> time out.
>
>
Is there a CIB query timeout value that I can set? Earlier, I was getting
TOTEM timeouts, so I set the token to a larger value (5 seconds) in
corosync.conf, and things were much better.
But now I have started hitting this problem.
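
For reference, a 5-second token timeout would look roughly like the
following in the totem section of corosync.conf. This is only a sketch:
the token value is given in milliseconds, and the other totem settings
that a real file needs are left out.

    totem {
        # token timeout in milliseconds (5 seconds)
        token: 5000
    }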

Thanks,
Pavan

> > I can copy the "crm configure show" output, but nothing special there.
> >
> > Thanks much.
> > Pavan
> >
> > PS: The resource vha-bcd94724-3ec0-4a8d-8951-9d27be3a6acb is stale. The
> > underlying device that represents this resource has been removed. However,
> > the resource is still part of the CIB. All errors related to that resource
> > can be ignored. But can this cause a node to be stopped/fenced?
>
> Not if fencing is disabled.
>
>