[ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

Klaus Wenninger kwenning at redhat.com
Thu Apr 18 11:43:14 EDT 2024


On Thu, Apr 18, 2024 at 5:07 PM NOLIBOS Christophe via Users <
users at clusterlabs.org> wrote:

> Classified as: {OPEN}
>
> I'm using RedHat 8.8 (4.18.0-477.21.1.el8_8.x86_64).
> When I kill Corosync, no new corosync process is started and pacemaker
> stays in a failed state.
> The only solution is to restart the pacemaker service.
>
> [~]$ pcs status
> Error: unable to get cib
> [~]$
>
> [~]$ systemctl status pacemaker
> ● pacemaker.service - Pacemaker High Availability Cluster Manager
>    Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled;
> vendor preset: disabled)
>    Active: active (running) since Thu 2024-04-18 13:16:04 UTC; 1h 43min ago
>      Docs: man:pacemakerd
>            https://clusterlabs.org/pacemaker/doc/
>  Main PID: 1324923 (pacemakerd)
>     Tasks: 91
>    Memory: 132.1M
>    CGroup: /system.slice/pacemaker.service
> ...
> Apr 18 14:59:02 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:03 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:04 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:05 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:06 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:07 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:08 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:09 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:10 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> Apr 18 14:59:11 - pacemakerd[1324923]:  crit: Could not connect to
> Corosync CFG: CS_ERR_LIBRARY
> [~]$
>
>
Well, if corosync isn't there then this is to be expected, and pacemaker
won't recover corosync.
Can you check what systemd thinks about corosync (status/journal)?
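
For example, a minimal sketch of those checks, assuming the systemd unit
is named corosync (the helper name check_unit is hypothetical, and the
time window is only a guess at what is relevant):

```shell
# Hypothetical helper: show what systemd knows about a unit.
# Assumption: the unit is called "corosync"; pass another name to override.
check_unit() {
    unit="${1:-corosync}"

    # Bail out quietly on hosts that do not use systemd.
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not found; is this host running systemd?"
        return 0
    fi

    # Current state of the unit (active, failed, inactive, ...).
    systemctl status "$unit" --no-pager || true

    # Restart policy and the result of the last run.
    systemctl show "$unit" -p Restart -p Result -p ExecMainStatus || true

    # Journal entries for the unit since midnight.
    journalctl -u "$unit" --since today --no-pager || true
}

check_unit corosync
```

If Restart is reported as "no", systemd will not bring corosync back by
itself after it has been killed.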

Klaus

>
> {OPEN}
>
> -----Original Message-----
> From: Ken Gaillot <kgaillot at redhat.com>
> Sent: Thursday, April 18, 2024 16:40
> To: Cluster Labs - All topics related to open-source clustering welcomed <
> users at clusterlabs.org>
> Cc: NOLIBOS Christophe <christophe.nolibos at thalesgroup.com>
> Subject: Re: [ClusterLabs] "pacemakerd: recover properly from Corosync
> crash" fix
>
> What OS are you using? Does it use systemd?
>
> What happens when you kill Corosync?
>
> On Thu, 2024-04-18 at 13:13 +0000, NOLIBOS Christophe via Users wrote:
> > Classified as: {OPEN}
> >
> > Dear All,
> >
> > I have a question about the "pacemakerd: recover properly from
> > Corosync crash" fix implemented in version 2.1.2.
> > I have observed the issue when testing pacemaker version 2.0.5, just
> > by killing the ‘corosync’ process: Corosync was not recovered.
> >
> > I am now using pacemaker version 2.1.5-8.
> > Doing the same test, I have the same result: Corosync is still not
> > recovered.
> >
> > Please confirm whether the "pacemakerd: recover properly from Corosync
> > crash" fix implemented in version 2.1.2 covers this scenario.
> > If it does, did I miss something in the configuration of my cluster?
> >
> > Best regards,
> >
> > Christophe.
> >
> >
> >
> > {OPEN}
> > _______________________________________________
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > ClusterLabs home: https://www.clusterlabs.org/
> --
> Ken Gaillot <kgaillot at redhat.com>