[ClusterLabs] Strange behaviour of group resource

Tue Jul 30 10:33:23 EDT 2019

On Tue, 2019-07-30 at 16:26 +0530, Dileep V Nair wrote:
> Thanks Ken for the response. I see the errors below. Not sure why it
> says target: 7 vs. rc: 0. Does that mean that Pacemaker expects the
> resource to be stopped and, since it is running, is taking an
> action?
> 
> Jul 30 10:08:59 dntstdb2s0703 cib[90848]: warning: A-Sync reply to crmd failed: No message of desired type
> 
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 16 (fs-sapdata4_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0): Error
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: notice: Transition 1445 aborted by operation fs-sapdata4_monitor_0 'modify' on dntstdb2s0703: Event failed

These actually aren't errors, and they're expected after a clean-up. I
recently merged a change to make the message more accurate. As of the
next release, it will look like:

notice: Transition 1445 action 5 (fs-sapdata4_monitor_0 on dntstdb2s0703): expected 'not running' but got 'ok'

Cleaning up a resource involves clearing its history. That makes the
cluster expect that it is stopped. The cluster then runs probes to find
out the actual status, and if the probe finds it running, the above
situation happens.

So, that's not causing the restarts. An actual failure that could cause
restarts would have a similar message, but the rc would be something
other than 0 or 7.
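
To sift a log for the cases that matter, something like the following
rough sketch might help. It is not part of Pacemaker; the function name
classify and the verdict strings are made up for illustration. It
assumes the older crmd message wording quoted in this thread, with each
log entry on a single line (as in the log file itself, not as wrapped by
a mail client), and it treats any operation ending in _monitor_0 as a
probe. Return codes 0 and 7 are the standard OCF codes for "ok"
(running) and "not running".

```python
import re

# Matches the older crmd wording quoted in this thread, for example:
#   Action 16 (fs-sapdata4_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0)
ACTION_RE = re.compile(
    r"Action \d+ \((?P<op>\S+)\) on (?P<node>\S+) failed "
    r"\(target: (?P<target>\d+) vs\. rc: (?P<rc>\d+)\)"
)

OCF_OK = 0           # resource is running
OCF_NOT_RUNNING = 7  # resource is cleanly stopped


def classify(line):
    """Return None for unrelated lines, else (operation, node, verdict).

    Hypothetical helper: a probe that returns 0 or 7 only reports the
    resource's current state, so it is not counted as a failure here.
    """
    m = ACTION_RE.search(line)
    if m is None:
        return None
    op, node = m["op"], m["node"]
    rc = int(m["rc"])
    is_probe = op.endswith("_monitor_0")  # assumption: probes use interval 0
    if is_probe and rc in (OCF_OK, OCF_NOT_RUNNING):
        verdict = "probe result, not a failure"
    else:
        verdict = "possible real failure"
    return op, node, verdict


if __name__ == "__main__":
    import sys

    # Usage: python classify_probes.py < /path/to/logfile
    for log_line in sys.stdin:
        result = classify(log_line)
        if result is not None:
            print(*result, sep="\t")
```

Any line it reports as "possible real failure" is where I would start
looking for the cause of the restarts.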

> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 16 (fs-sapdata4_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0): Error
> Jul 30 10:09:04 dntstdb2s0703 stonith-ng[90849]: notice: On loss of CCM Quorum: Ignore
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: notice: Result of probe operation for fs-saptmp3 on dntstdb2s0703: 0 (ok)
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 19 (fs-saptmp3_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0): Error
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 19 (fs-saptmp3_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0): Error
> 
> Thanks & Regards
> 
> Dileep Nair
> Squad Lead - SAP Base 
> Togaf Certified Enterprise Architect
> IBM Services for Managed Applications
> +91 98450 22258 Mobile
> dilenair at in.ibm.com
> 
> IBM Services
> 
> 
> 
> From: Ken Gaillot <kgaillot at redhat.com>
> To: Cluster Labs - All topics related to open-source clustering
> welcomed <users at clusterlabs.org>
> Date: 07/30/2019 12:47 AM
> Subject: [EXTERNAL] Re: [ClusterLabs] Strange behaviour of group
> resource
> Sent by: "Users" <users-bounces at clusterlabs.org>
> 
> 
> 
> On Thu, 2019-07-25 at 20:51 +0530, Dileep V Nair wrote:
> > Hi,
> > 
> > I have around 10 filesystems in a group. When I do a crm resource
> > refresh, the filesystems are unmounted and remounted, starting from
> > the fourth resource in the group. Any idea what could be going on?
> > Is it expected?
> 
> No, it sounds like some of the reprobes are failing. The logs may have
> more info. Each filesystem will have a probe like RSCNAME_monitor_0 on
> each node.
> 
-- 
Ken Gaillot <kgaillot at redhat.com>