[ClusterLabs] nova-compute_monitor_10000 on 'node-xxx ' not running

luckydog xf luckydogxf at gmail.com
Sun Jun 27 23:05:16 EDT 2021


Currently, Pacemaker supports the "failure-timeout" resource
meta-attribute, which will automatically clear a resource's failure history
once it has no new failures in that much time.
Yes, I found the explanation above in the Red Hat docs.
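
For the archive, here is a minimal sketch of how the option could be set on the resource from this thread, assuming crm_resource is available on a cluster node. The set_failure_timeout helper, the DRY_RUN switch and the 600s value are my own illustrative assumptions, not settings taken from this cluster.

```shell
# Sketch: set the failure-timeout meta-attribute on a resource so that
# old failures expire on their own instead of needing a manual cleanup.
# The helper name and DRY_RUN switch are hypothetical conveniences.
set_failure_timeout() {
    rsc="$1"       # resource id, e.g. nova-compute
    timeout="$2"   # duration, e.g. 600s (example value only)

    # With DRY_RUN set to a non-empty value, the command is only printed.
    ${DRY_RUN:+echo} crm_resource --resource "$rsc" --meta \
        --set-parameter failure-timeout --parameter-value "$timeout"
}
```

Running `set_failure_timeout nova-compute 600s` on a cluster node would apply the setting; with DRY_RUN=1 it only prints the command. With crmsh, `crm resource meta nova-compute set failure-timeout 600s` should be equivalent. As far as I know, on Pacemaker 1.1.x an expired failure is only noticed when the cluster next re-evaluates its state, so the clearing can lag by up to cluster-recheck-interval.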
Thanks.

On Fri, Jun 25, 2021 at 10:07 PM <kgaillot at redhat.com> wrote:

> On Fri, 2021-06-25 at 14:41 +0800, luckydog xf wrote:
> > 1. Deleted the recorded failures:
> > crm_failcount -V -D -r nova-compute -N remote-db8-ca-3a-69-50-34 -n monitor -I 10000
> >
> > 2. Cleaned up the resource status:
> > crm resource cleanup nova-compute remote-db8-ca-3a-69-50-34 force
> >
> > Problem resolved.
> >
> > But I don't know why these failure records were still there while
> > the resource was running.
>
> The failure displays are a history. The most recent failure is shown
> until the administrator can view and investigate, then run cleanup
> manually.
>
> There is also a failure-timeout resource option to have failures get
> cleaned up automatically after a certain amount of time with no
> failures.
>
> > On Wed, Jun 23, 2021 at 5:13 PM luckydog xf <luckydogxf at gmail.com>
> > wrote:
> > > Hello, guys,
> > >
> > > I built an OpenStack cluster with Pacemaker, and all nova-compute
> > > nodes are running. Yet `crm_mon -1r` reports a failure for one
> > > nova-compute service:
> > > ---
> > > Failed Actions:
> > > * nova-compute_monitor_10000 on remote-db8-ca-3a-69-50-34 'not
> > > running' (7): call=719373, status=complete, exitreason='none',
> > >     last-rc-change='Mon Mar  1 20:27:35 2021', queued=0ms, exec=0ms
> > >
> > > ---
> > > It's a false alarm: nova-compute is running on that node and was
> > > started by pacemaker-remote.
> > >
> > > # /var/log/pacemaker.log
> > > attrd[4085]:   notice: Update error (unknown peer uuid, retry will
> > > be attempted once uuid is discovered).
> > >
> > > So what's the root cause? My Pacemaker version is 1.1.16.
> >
> > _______________________________________________
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > ClusterLabs home: https://www.clusterlabs.org/
> --
> Ken Gaillot <kgaillot at redhat.com>