[ClusterLabs] Antw: Q: monitor and probe result codes and consequences

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu May 12 07:23:42 EDT 2016


>>> Ulrich Windl wrote on 12.05.2016 at 09:56 in message <5734373F.4DE : 161 :
60728>:
> Hi!
> 
> I have a question regarding an RA written by myself and pacemaker 
> 1.1.12-f47ea56 (SLES11 SP4):
> 
> During "probe" all resources' "monitor" actions are executed (regardless of 
> any ordering constraints). Therefore my RA considers a parameter as invalid 
> ("file does not exist") (the file will be provided once some supplying 
> resource is up) and returns rc=2.
> OK, this may not be optimal, but pacemaker makes it worse: It does not 
> repeat the probe once the resource would start, but keeps the state, 
> preventing a resource start:
> 
>  primitive_monitor_0 on h05 'invalid parameter' (2): call=73, 
> status=complete, exit-reason='none', last-rc-change='Wed May 11 17:03:39 
> 2016', queued=0ms, exec=82ms
[...]
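
For illustration, a probe-aware monitor could look roughly like the sketch below. It is not the actual RA: the parameter name "datafile" and the function names are made up for the example, and a real agent would source ocf-shellfuncs (which provides the return code variables and ocf_is_probe) instead of defining them itself. The idea is that a missing file during a probe is reported as "not running" (rc=7) rather than "invalid parameter" (rc=2), so the probe result does not block a later start.

```shell
#!/bin/sh
# Sketch only. A real RA would source ocf-shellfuncs instead of
# defining these return codes itself.
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# Hypothetical parameter "datafile" (an assumption, not from the original RA).
: "${OCF_RESKEY_datafile:=/nonexistent/datafile}"

# A probe is a monitor action with interval 0; this mirrors the check
# that ocf_is_probe performs in ocf-shellfuncs.
is_probe() {
    [ "${__OCF_ACTION:-}" = "monitor" ] &&
        [ "${OCF_RESKEY_CRM_meta_interval:-}" = "0" ]
}

res_monitor() {
    if [ ! -e "$OCF_RESKEY_datafile" ]; then
        if is_probe; then
            # The supplying resource may simply not be up yet, so the
            # resource cannot be running here: report "not running".
            return $OCF_NOT_RUNNING
        fi
        # Outside a probe, keep the original behaviour (rc=2).
        return $OCF_ERR_ARGS
    fi
    # Placeholder for the real health check of the resource.
    return $OCF_SUCCESS
}
```

Whether a recurring monitor should still return rc=2 for a missing file is a design choice; the sketch only changes the probe case.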

After fixing the problem in the RA (hopefully), I found out two more things:

1) "crm resource reprobe res node" (crm_resource -C -H node) does not actually probe:
crm_resource[14865]:    error: unpack_rsc_op: Preventing res_inst from re-starting on h01: operation monitor failed 'invalid parameter' (2)
attrd[4828]:   notice: attrd_perform_update: Sent delete 19: node=h01, attr=fail-count-res_inst, id=<n/a>, set=(null), section=status

I think this is a bug! I don't know whether this line is related to the problem:
crmd[4830]:   notice: do_lrm_invoke: Not creating resource for a delete event: (null)

2) Although the probe was not actually run, the error condition was refreshed:
res_inst_monitor_0 on h01 'invalid parameter' (2): call=73, status=complete, exit-reason='none', last-rc-change='Thu May 12 11:25:21 2016', queued=0ms, exec=41ms

I think this is another bug!

3) ;-) If crm's "resource reprobe" is actually a resource cleanup, I wonder why I can't specify the resource to reprobe. (It would only need to add "-r resource" to the command line of crm_resource.)
---
# crm resource help reprobe
Probe for resources not started by the CRM

Probe for resources not started by the CRM.

Usage:

reprobe [<node>]
---

4) When I manually entered "crm_resource -C -r res_inst -N h01", it succeeded! However, the messages that were logged conflict with each other:
crm_resource[7542]:    error: unpack_rsc_op: Preventing res_inst from re-starting on h01: operation monitor failed 'invalid parameter' (2)
crmd[4830]:   notice: process_lrm_event: Operation res_inst_monitor_0: not running (node=h01, call=187, rc=7, cib-update=137, confirmed=true)
(resource starts)

Regards,
Ulrich