[ClusterLabs] Querying failed resource operations from the CIB

Ken Gaillot kgaillot at redhat.com
Mon Aug 12 20:24:08 EDT 2019


On Mon, 2019-08-12 at 11:15 +0200, Ulrich Windl wrote:
> Hi!
> 
> Back in December 2011 I wrote a script to retrieve all failed
> resource operations, using the output of "cibadmin -Q -o lrm_resources"
> as its data source. It queried lrm_rsc_op for op-status != 0.
> In a newer release this no longer seems to work.
> 
> I see operation IDs ending with "_last_0", "_monitor_60000", and
> "_last_failure_0", but even in the "_last_failure_0" entry the
> op-status is "0" (rc-code="7").
> Is this some bug, or is it a feature? That is: When will op-status be
> != 0?

rc-code is the result of the action itself (i.e. the resource agent),
whereas op-status is the result of pacemaker's attempt to execute the
agent.

If pacemaker was able to successfully initiate the resource agent and
get a reply back, then op-status will be 0, regardless of the rc-code
reported by the agent.

op-status will be nonzero when pacemaker could not get a result from the
agent: for example, the agent is not installed on the node, the agent
timed out, the connection to the local executor or Pacemaker Remote was
lost, or the action was requested while the node was shutting down.

There's also a special op-status (193) that indicates an action is
pending (i.e. it has been initiated and we're waiting for it to
complete). This is only seen when record-pending is true.
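
To illustrate the distinction, here is a minimal Python sketch that scans
lrm_rsc_op entries and reports any whose rc-code or op-status is nonzero.
The sample XML is a hypothetical, trimmed-down fragment (real entries
carry many more attributes), and the function name failed_ops is made up
for this example. Treating every nonzero rc-code as a failure is a
simplification: a probe of a stopped resource, for instance, is expected
to return 7.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample of "cibadmin -Q -o lrm_resources" output.
SAMPLE_CIB = """
<lrm_resources>
  <lrm_resource id="prm_nfs_server">
    <lrm_rsc_op id="prm_nfs_server_last_0" operation="start"
                rc-code="0" op-status="0"/>
    <lrm_rsc_op id="prm_nfs_server_monitor_60000" operation="monitor"
                rc-code="0" op-status="0"/>
    <lrm_rsc_op id="prm_nfs_server_last_failure_0" operation="monitor"
                rc-code="7" op-status="0"/>
  </lrm_resource>
</lrm_resources>
"""


def failed_ops(cib_xml):
    """Return (id, rc-code, op-status) for entries where either is nonzero.

    rc-code is the agent's own result; op-status is the result of
    pacemaker's attempt to execute the agent.
    """
    root = ET.fromstring(cib_xml.strip())
    failures = []
    for op in root.iter("lrm_rsc_op"):
        rc = int(op.get("rc-code", "0"))
        status = int(op.get("op-status", "0"))
        if rc != 0 or status != 0:
            failures.append((op.get("id"), rc, status))
    return failures


if __name__ == "__main__":
    for op_id, rc, status in failed_ops(SAMPLE_CIB):
        print(f"{op_id}: rc-code={rc} op-status={status}")
```

A script that checks only op-status would miss the "_last_failure_0"
entry in this sample, because the agent ran and replied normally.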

> crm_mon still reports a resource failure like this:
> Failed Resource Actions:
> * prm_nfs_server_monitor_60000 on h11 'not running' (7): call=738,
> status=complete, exitreason='',
>     last-rc-change='Mon Aug 12 04:52:23 2019', queued=0ms, exec=0ms
> 
> (it seems the nfs server monitor does this under load in SLES12 SP4,
> and I wonder where to look for the reason)
> BTW: "lrm_resources" is not documented, and the structure seems to
> change. Can I restrict the output to LRM data?

One possibility is to run crm_mon with --as-xml and parse the failed
actions from that output. The schema is distributed as crm_mon.rng.
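
A rough Python sketch of that approach follows. The element name
"failure" and the attributes in the sample document are assumptions
modeled on the crm_mon text output quoted above, so check them against
the crm_mon.rng shipped with your version. The function names and the
sample version string are made up for the example.

```python
import subprocess
import xml.etree.ElementTree as ET

# Hypothetical sample; element and attribute names are assumptions that
# should be verified against the installed crm_mon.rng.
SAMPLE_CRM_MON = """<crm_mon version="2.0.1">
  <failures>
    <failure op_key="prm_nfs_server_monitor_60000" node="h11"
             exitstatus="not running" exitcode="7" call="738"
             status="complete"/>
  </failures>
</crm_mon>"""


def parse_failures(xml_text):
    """Return one dict of attributes per <failure> element found."""
    root = ET.fromstring(xml_text)
    return [dict(failure.attrib) for failure in root.iter("failure")]


def query_failures():
    """Run crm_mon with --as-xml on a cluster node and parse its output."""
    result = subprocess.run(
        ["crm_mon", "--as-xml"],
        check=True, capture_output=True, text=True,
    )
    return parse_failures(result.stdout)


if __name__ == "__main__":
    # Uses the embedded sample so the sketch runs without a cluster.
    for failure in parse_failures(SAMPLE_CRM_MON):
        print(failure)
```

Returning each element's attributes as a plain dict keeps the parser
independent of exactly which attributes a given release emits.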

> Regards,
> Ulrich
-- 
Ken Gaillot <kgaillot at redhat.com>


