[ClusterLabs] Querying failed resource operations from the CIB
Ulrich.Windl at rz.uni-regensburg.de
Mon Aug 12 05:15:17 EDT 2019
Back in December 2011 I wrote a script to retrieve all failed resource operations, using the output of "cibadmin -Q -o lrm_resources" as the data source. It queried lrm_rsc_op for op-status != 0.
In a newer release this does not seem to work anymore.
I see resource IDs ending with "_last_0", "_monitor_60000", and "_last_failure_0", but even in the "_last_failure_0" the op-status is "0" (rc-code="7").
Is this some bug, or is it a feature? That is: When will op-status be != 0?
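
To make the observation concrete, here is a minimal sketch of the two selection criteria. It is not the original script. The XML is a hand-written sample modelled on the IDs and values quoted above, not real cluster output, and the function name select_ops is made up for illustration. In practice the XML would come from the cibadmin command mentioned earlier. Selecting by the "_last_failure_" part of the ID is only one possible workaround, not a documented interface.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real cluster output. Only the IDs, op-status="0"
# and rc-code="7" are taken from the message; everything else is assumed.
SAMPLE = """\
<lrm_resources>
  <lrm_resource id="prm_nfs_server">
    <lrm_rsc_op id="prm_nfs_server_last_0" operation="start" rc-code="0" op-status="0"/>
    <lrm_rsc_op id="prm_nfs_server_monitor_60000" operation="monitor" rc-code="0" op-status="0"/>
    <lrm_rsc_op id="prm_nfs_server_last_failure_0" operation="monitor" rc-code="7" op-status="0"/>
  </lrm_resource>
</lrm_resources>
"""


def select_ops(xml_text, predicate):
    """Return the IDs of all lrm_rsc_op elements for which predicate(op) is true."""
    root = ET.fromstring(xml_text)
    return [op.get("id") for op in root.iter("lrm_rsc_op") if predicate(op)]


# Old criterion: op-status != 0. Finds nothing in the sample above.
by_status = select_ops(SAMPLE, lambda op: op.get("op-status") != "0")

# Possible alternative (an assumption): pick the "_last_failure_" entries by ID.
by_last_failure = select_ops(
    SAMPLE, lambda op: "_last_failure_" in op.get("id", "")
)

print("op-status != 0:", by_status)
print("last_failure entries:", by_last_failure)
```

With this sample the op-status filter returns nothing, while the ID-based filter returns the one "_last_failure_0" entry, which matches what I see on the real cluster.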
crm_mon still reports a resource failure like this:
Failed Resource Actions:
* prm_nfs_server_monitor_60000 on h11 'not running' (7): call=738, status=complete, exitreason='',
last-rc-change='Mon Aug 12 04:52:23 2019', queued=0ms, exec=0ms
(it seems the nfs server monitor does this under load in SLES12 SP4, and I wonder where to look for the reason)
BTW: "lrm_resources" is not documented, and the structure seems to change. Can I restrict the output to LRM data?