[ClusterLabs] Q: monitor and probe result codes and consequences
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Thu May 12 07:56:47 UTC 2016
Hi!
I have a question regarding a resource agent (RA) I wrote myself, running under pacemaker 1.1.12-f47ea56 (SLES11 SP4):
During a "probe", the "monitor" action of every resource is executed, regardless of any ordering constraints. As a result, my RA considers a parameter invalid ("file does not exist") and returns rc=2, because the file is only provided once some supplying resource is up.
OK, this may not be optimal, but pacemaker makes it worse: it does not repeat the probe when the resource is about to start, but keeps the recorded state, which prevents the resource from starting:
primitive_monitor_0 on h05 'invalid parameter' (2): call=73, status=complete, exit-reason='none', last-rc-change='Wed May 11 17:03:39 2016', queued=0ms, exec=82ms
You might say that monitor may only return "success" or "not running", but I feel the RA should be able to detect that the resource cannot run at all in the present state.
Shouldn't pacemaker reprobe resources before it tries to start them?
My RA had previously passed all the ocf-tester checks, so this situation is hard to test for (unless you have a test cluster you can restart at any time).
(After a manual resource cleanup, the resource started as usual.)
My monitor uses the following logic:
---
  monitor|status)
    if validate; then
      set_variables
      check_resource || exit $OCF_NOT_RUNNING
      status=$OCF_SUCCESS
    else # cannot check status with invalid parameters
      status=$?
    fi
    exit $status
    ;;
---
Should I mess with ocf_is_probe?
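To make that question concrete, here is a minimal sketch of what a probe-aware monitor could look like. It is not the actual RA: `is_probe` is a hypothetical stand-in for `ocf_is_probe` from ocf-shellfuncs (assumed here to mean "monitor action with interval 0"), the parameter name `OCF_RESKEY_file` and the bodies of `validate` and `check_resource` are invented for illustration, and the return codes are defined inline only so that the snippet is self-contained.

```sh
#!/bin/sh
# Standard OCF return codes; a real RA gets these from ocf-shellfuncs.
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# Hypothetical stand-in for ocf_is_probe (assumption: a probe is a
# "monitor" action whose interval is 0).
is_probe() {
    [ "${__OCF_ACTION:-}" = "monitor" ] &&
        [ "${OCF_RESKEY_CRM_meta_interval:-}" = "0" ]
}

# Hypothetical validation: the "file" parameter must name an existing file.
validate() {
    [ -e "${OCF_RESKEY_file:-}" ] || return $OCF_ERR_ARGS
    return 0
}

# Hypothetical placeholder for the real status check of the resource.
check_resource() {
    [ -e "${OCF_RESKEY_file}.running" ]
}

monitor() {
    validate
    rc=$?
    if [ "$rc" -ne 0 ]; then
        # During a probe the supplying resource may not be up yet, so a
        # missing file is reported as "not running" instead of a hard error.
        if is_probe; then
            return $OCF_NOT_RUNNING
        fi
        return "$rc"
    fi
    check_resource || return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}
```

With this variant, a probe on a node where the file is missing would report "not running" (7) rather than "invalid parameter" (2), while a regular recurring monitor would still report the invalid parameter.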
Regards,
Ulrich