[ClusterLabs] Re: Q: monitor and probe result codes and consequences
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Thu May 12 11:23:42 UTC 2016
>>> Ulrich Windl wrote on 12.05.2016 at 09:56 in message <5734373F.4DE : 161 : 60728>:
> Hi!
>
> I have a question regarding an RA written by myself and pacemaker
> 1.1.12-f47ea56 (SLES11 SP4):
>
> During "probe" all resources' "monitor" actions are executed (regardless of
> any ordering constraints). Therefore my RA considers a parameter as invalid
> ("file does not exist") (the file will be provided once some supplying
> resource is up) and returns rc=2.
> OK, this may not be optimal, but pacemaker makes it worse: It does not
> repeat the probe once the resource would start, but keeps the state,
> preventing a resource start:
>
> primitive_monitor_0 on h05 'invalid parameter' (2): call=73,
> status=complete, exit-reason='none', last-rc-change='Wed May 11 17:03:39
> 2016', queued=0ms, exec=82ms
[...]
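For illustration, the probe handling discussed in the quoted text could be sketched roughly as below. This is only a sketch under assumptions, not the actual RA: the parameter name "datafile" and the functions is_probe and res_monitor are hypothetical, and treating a monitor action with OCF_RESKEY_CRM_meta_interval equal to 0 (or unset) as a probe is an assumption modelled on the usual OCF convention. The idea is that a missing file during a probe is reported as "not running" (rc=7) instead of "invalid parameter" (rc=2), so the probe result does not block a later start.

```shell
#!/bin/sh
# OCF return codes used in this sketch (standard OCF values).
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# Hypothetical helper: treat a monitor action with interval 0 as a probe.
# Assumption: an unset interval is also taken to mean 0.
is_probe() {
    [ "$1" = "monitor" ] &&
        [ "${OCF_RESKEY_CRM_meta_interval:-0}" = "0" ]
}

# Hypothetical monitor function; $1 is the action name passed to the RA.
# "datafile" stands in for whatever parameter the real RA validates.
res_monitor() {
    if [ ! -e "${OCF_RESKEY_datafile:-}" ]; then
        if is_probe "$1"; then
            # The supplying resource may simply not be up yet, so
            # report a clean "not running" instead of an error.
            return $OCF_NOT_RUNNING
        fi
        # Outside a probe, a missing file is still reported as before.
        return $OCF_ERR_ARGS
    fi
    # The real status check of the resource would go here.
    return $OCF_SUCCESS
}
```

Called as "res_monitor monitor" with the interval at 0 and the file missing, this returns 7; with a non-zero interval it returns 2, as the RA in the quoted text did.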
After fixing the problem in the RA (hopefully), I found out two more things:
1) "crm resource reprobe res node" (crm_resource -C -H node) does not actually probe:
crm_resource[14865]: error: unpack_rsc_op: Preventing res_inst from re-starting on h01: operation monitor failed 'invalid parameter' (2)
attrd[4828]: notice: attrd_perform_update: Sent delete 19: node=h01, attr=fail-count-res_inst, id=<n/a>, set=(null), section=status
I think this is a bug! I don't know whether these lines are related to the problem:
crmd[4830]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
2) While the probe was not actually started, the error condition was refreshed:
res_inst_monitor_0 on h01 'invalid parameter' (2): call=73, status=complete, exit-reason='none', last-rc-change='Thu May 12 11:25:21 2016', queued=0ms, exec=41ms
I think this is another bug!
3) ;-) If crm's "resource reprobe" is actually a resource cleanup, why can't I specify the resource to reprobe? (It would only need to add "-r resource" to the crm_resource command line.)
---
# crm resource help reprobe
Probe for resources not started by the CRM
Probe for resources not started by the CRM.
Usage:
reprobe [<node>]
---
4) When I manually entered "crm_resource -C -r res_inst -N h01", it succeeded! However, conflicting messages were logged:
crm_resource[7542]: error: unpack_rsc_op: Preventing res_inst from re-starting on h01: operation monitor failed 'invalid parameter' (2)
crmd[4830]: notice: process_lrm_event: Operation res_inst_monitor_0: not running (node=h01, call=187, rc=7, cib-update=137, confirmed=true)
(resource starts)
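To read the two messages above side by side, it can help to map the numeric return codes to their OCF names. The helper ocf_rc_name below is hypothetical (it is not part of pacemaker or crmsh); the numeric values are the standard OCF ones. rc=2 is what the log prints as 'invalid parameter', and rc=7 is what it prints as 'not running'.

```shell
#!/bin/sh
# Hypothetical helper: print the OCF constant name for a numeric
# return code as it appears in the log lines (rc=N).
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;          # logged as 'invalid parameter'
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;       # logged as 'not running'
        *) echo "unknown" ;;
    esac
}
```

For example, "ocf_rc_name 2" prints OCF_ERR_ARGS (the stale probe result that crm_resource complained about), while "ocf_rc_name 7" prints OCF_NOT_RUNNING (the fresh probe result after the cleanup).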
Regards,
Ulrich