[ClusterLabs] nova-compute_monitor_10000 on 'node-xxx' not running

Fri Jun 25 02:41:32 EDT 2021

1. Deleted the recorded failures:
crm_failcount -V -D -r nova-compute -N remote-db8-ca-3a-69-50-34 -n monitor -I 10000

2. Cleaned up the resource status:
crm resource cleanup nova-compute remote-db8-ca-3a-69-50-34 force
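
The arguments in step 1 map directly onto the "Failed Actions" entry that crm_mon prints: the entry name is <resource>_<operation>_<interval>, followed by the node. Below is a minimal sketch of that mapping. The helper name failed_action_to_cmd is hypothetical, it assumes the operation name contains no underscore, and it only prints the crm_failcount command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker): convert one "Failed Actions"
# entry from `crm_mon -1r` into the matching crm_failcount command line.
# It prints the command only; nothing is executed against the cluster.
failed_action_to_cmd() {
    # Expected shape: "* <resource>_<operation>_<interval> on <node> ..."
    # Assumption: the operation name is lowercase letters only (e.g. monitor).
    parsed=$(printf '%s\n' "$1" | sed -nE \
        's/^\*?[[:space:]]*(.+)_([a-z]+)_([0-9]+) on ([^[:space:]]+).*/\1 \4 \2 \3/p')
    if [ -z "$parsed" ]; then
        echo "unrecognized entry: $1" >&2
        return 1
    fi
    # Split into: resource, node, operation, interval (none contain spaces).
    set -- $parsed
    printf 'crm_failcount -V -D -r %s -N %s -n %s -I %s\n' "$1" "$2" "$3" "$4"
}

failed_action_to_cmd "* nova-compute_monitor_10000 on remote-db8-ca-3a-69-50-34 'not running'"
```

For the entry quoted below, this prints the same command as in step 1.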

Problem resolved.

But I still don't know why these failure records remained after the
resource was running again.

On Wed, Jun 23, 2021 at 5:13 PM luckydog xf <luckydogxf at gmail.com> wrote:

> Hello, guys,
>
> I built an OpenStack cluster with Pacemaker, and all nova-compute nodes are
> running. Yet `crm_mon -1r` shows a failure for one nova-compute service:
> ---
> Failed Actions:
> * nova-compute_monitor_10000 on remote-db8-ca-3a-69-50-34 'not running'
> (7): call=719373, status=complete, exitreason='none',
>     last-rc-change='Mon Mar  1 20:27:35 2021', queued=0ms, exec=0ms
>
> ---
> It's a false alarm: nova-compute is running on that node, and it was
> started by pacemaker-remote.
>
> # /var/log/pacemaker.log
> attrd[4085]:   notice: Update error (unknown peer uuid, retry will be
> attempted once uuid is discovered).
>
> So what's the root cause? My Pacemaker version is 1.1.16.
>