[ClusterLabs] error: The cib process (17858) exited: Key has expired (127)

Rens Houben rhouben at systemec.nl
Fri Mar 24 16:06:24 UTC 2017


I set PCMK_debug=cib and retried.

The new log file is up at http://proteus.systemec.nl/~shadur/pacemaker/pacemaker_2.log.txt . Unfortunately, while that *is* more information, I'm not seeing anything in it that looks like it could be the cause. It shouldn't even be reading any config files yet, because there shouldn't be any *to* read...

As for the misleading error message, it gets weirder: I grabbed a copy of the source code via apt-get source, and the phrase 'key has expired' does not occur anywhere in any file, according to

    find ./ -type f -exec grep -il 'key has expired' {} \;

so I have absolutely NO idea where it's coming from...
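
One plausible origin for a phrase that is absent from the Pacemaker source tree is the C library's errno table: on Linux, errno value 127 is EKEYEXPIRED, and strerror() renders it as "Key has expired". The following is a minimal Python sketch of that lookup, assuming a Linux/glibc system; the helper name describe_as_errno is made up for illustration and is not part of Pacemaker.

```python
import errno
import os


def describe_as_errno(status: int) -> str:
    """Return the C library's message for `status` treated as an errno value."""
    # os.strerror() wraps the platform's strerror(); the text comes from libc,
    # not from the program that reports it.
    return os.strerror(status)


if __name__ == "__main__":
    status = 127  # the number shown in the pacemakerd log line
    # errno.errorcode maps numbers to symbolic names on this platform.
    symbolic = errno.errorcode.get(status, "unknown")
    print(f"{status} -> {symbolic}: {describe_as_errno(status)}")
```

On Linux with glibc this should report EKEYEXPIRED and "Key has expired"; other platforms number their errno values differently, so the output may vary.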

--
Rens Houben
Systemec Internet Services

SYSTEMEC BV


Marinus Dammeweg 25, 5928 PW Venlo
Postbus 3290, 5902 RG Venlo
Industrienummer: 6817
Nederland

T: 077-3967572 (Support)
K.V.K. nummer: 12027782 (Venlo)




________________________________________
From: Ken Gaillot <kgaillot at redhat.com>
Sent: Friday, 24 March 2017 16:49
To: users at clusterlabs.org
Subject: Re: [ClusterLabs] error: The cib process (17858) exited: Key has expired (127)

On 03/24/2017 08:06 AM, Rens Houben wrote:
> I recently upgraded a two-node cluster (named 'castor' and 'pollux'
> because I should not be allowed to think up computer names before I've
> had my morning caffeine) from Debian wheezy to Jessie after the
> backports for corosync and pacemaker finally made it in. However, one of
> the two servers failed to start correctly for no really obvious reason.
>
> Given that it had been years since I last set them up, and I had forgotten
> pretty much everything about it in the interim, I decided to purge
> corosync and pacemaker on both systems and run with clean installs instead.
>
> This worked on pollux, but not on castor. Even after going back,
> re-purging, removing everything legacy in /var/lib/heartbeat and
> emptying both directories, castor still refuses to bring up pacemaker.
>
>
> I put the full log of a start attempt up at
> http://proteus.systemec.nl/~shadur/pacemaker/pacemaker.log.txt, but
> this is the excerpt that I /think/ is causing the failure:
>
> Mar 24 13:59:05 [25495] castor pacemakerd:    error: pcmk_child_exit:The cib process (25502) exited: Key has expired (127)
> Mar 24 13:59:05 [25495] castor pacemakerd:   notice: pcmk_process_exit:Respawning failed child process: cib
>
> I don't see any entries from cib in the log that suggest anything's
> going wrong, though, and I'm running out of ideas on where to look next.

The "Key has expired" message is misleading. (Pacemaker really needs an
overhaul of the exit codes it can return, so these messages can be
reliable, but there are always more important things to take care of ...)

Pacemaker is getting 127 as the exit status of cib, and interpreting
that as a standard system error number, but it probably isn't one. I
don't actually see any way that the cib can return 127, so I'm not sure
what that might indicate.
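
For context on where a 127 can come from without the program ever returning it: POSIX shells use exit status 127 for "command not found", and glibc's dynamic loader is, as far as I know, also reported to exit with 127 when a required shared library cannot be loaded. The sketch below only demonstrates the shell convention; it is not a diagnosis of this cluster, and the helper name run_and_report and the command name used are made up for illustration.

```python
import subprocess


def run_and_report(argv):
    """Run argv and return its exit status without raising on failure."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical command name, chosen so that it does not exist on PATH.
    missing = run_and_report(["sh", "-c", "no-such-command-for-this-demo"])
    # A status the child chose for itself, for comparison.
    chosen = run_and_report(["sh", "-c", "exit 3"])
    print(f"missing command -> {missing}, explicit exit -> {chosen}")
```

The point of the comparison is that the first status is produced by the shell on the child's behalf, so a parent that sees 127 cannot assume the child's own code returned it.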

In any case, the cib is mysteriously dying whenever it tries to start,
apparently without logging why or dumping core. (Do you have cores
disabled at the OS level?)
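
A quick way to check the core-dump question is to look at the RLIMIT_CORE resource limit. This is a minimal sketch using Python's standard resource module; note that it reports the limit of the process running the script, which is an assumption-laden stand-in: the limit that matters is the one the pacemaker daemons inherit, which on Linux can be read from /proc/<pid>/limits. The function name core_limit_status is made up for illustration.

```python
import resource


def core_limit_status():
    """Return (enabled, soft, hard) for this process's core-file size limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # A soft limit of 0 means the kernel will not write a core file.
    return soft != 0, soft, hard


def format_limit(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    enabled, soft, hard = core_limit_status()
    print(f"core dumps enabled: {enabled}")
    print(f"soft limit: {format_limit(soft)}, hard limit: {format_limit(hard)}")
```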

> Does anyone have any suggestions as to how to coax more information out
> of the processes and into the log files so I'll have a clue to work with?

Try it again with PCMK_debug=cib in /etc/default/pacemaker. That should
give more log messages.
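
To confirm the setting actually landed in the file, something like the following sketch can help. It assumes /etc/default/pacemaker is a plain KEY=value environment file with # comments; the parser read_env_settings is a simplified illustration written for this note, not Pacemaker's own reader, and it does not handle shell quoting rules beyond stripping surrounding quotes.

```python
import os


def read_env_settings(text):
    """Parse KEY=value lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            # Strip one layer of surrounding quotes, if any.
            settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


if __name__ == "__main__":
    path = "/etc/default/pacemaker"  # location named in the advice above
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            settings = read_env_settings(handle.read())
        print("PCMK_debug =", settings.get("PCMK_debug", "(not set)"))
    else:
        print(f"{path} not found")
```

A commented-out line such as "# PCMK_debug=cib" is reported as not set, which is a common reason for the extra log messages failing to appear.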

>
> Regards,
>
> --
> Rens Houben
> Systemec Internet Services
>

_______________________________________________
Users mailing list: Users at clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org