[ClusterLabs] Antw: OCF Return codes OCF_NOT_RUNNING

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Jul 11 07:44:47 EDT 2018


>>> Ian Underhill <ianpunderhill at gmail.com> wrote on 11.07.2018 at 13:27 in
message
<CAGu+cYgg4fFjDwXNK9JAB0gLWvVMBn2aaA4+dL6SAV7nmm2R=w at mail.gmail.com>:
> I'm trying to understand the behaviour of Pacemaker when a resource monitor
> returns OCF_NOT_RUNNING instead of OCF_ERR_GENERIC, and whether Pacemaker
> really cares.
> 
> The documentation states that a return code OCF_NOT_RUNNING from a monitor
> will not result in a stop being called on that resource, as it believes the
> node is still clean.
> 
> https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-ocf-return-codes.html
> 
> This makes sense; however, in practice that is not what happens (unless I'm
> doing something wrong :) )
> 
> When my resource returns OCF_NOT_RUNNING for a monitor call (after a start
> has been performed) a stop is called.

Well, it depends. If your start was successful, Pacemaker believes the resource is running. If the monitor then says it's stopped, Pacemaker seems to try a "clean stop" by calling the stop method (possibly before trying to start it again). Am I right?
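
To make the difference between the two return codes concrete, here is a minimal sketch of a monitor action in Python. The numeric values are the standard OCF exit codes; everything else (the PID-file approach, the names monitor and pid_alive, and the choice to report a stale PID file as OCF_ERR_GENERIC) is an assumption for illustration, not how any particular RA does it. It assumes a POSIX system.

```python
import os

# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def monitor(pidfile):
    """Hypothetical monitor action for a daemon that writes a PID file."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return OCF_NOT_RUNNING  # no PID file: treated as cleanly stopped
    except ValueError:
        return OCF_ERR_GENERIC  # unreadable PID file: state is unknown
    if pid_alive(pid):
        return OCF_SUCCESS
    # Assumption: a stale PID file means the daemon died uncleanly.
    return OCF_ERR_GENERIC
```

Either non-zero result from a recurring monitor tells the cluster that the resource is not in the state it expected after a successful start.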

> 
> If I have a resource threshold set >1, I get a start->monitor->stop cycle
> until the threshold is consumed.

Then either your start is broken, or your monitor is broken. Try to validate your RA using ocf-tester before using it.
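
One common way a start ends up "broken" in this sense is that it returns success before the service is really up, so the first monitor reports OCF_NOT_RUNNING. A minimal sketch of a start that waits for the monitor to agree, assuming hypothetical launch and monitor callables and an internal timeout chosen only for the example (a real RA may simply loop and rely on the operation timeout configured in the cluster):

```python
import time

# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def start(launch, monitor, timeout=20.0, interval=0.5):
    """Hypothetical start action: launch, then poll monitor until it succeeds.

    launch  -- callable that kicks off the service (assumed, not a real API)
    monitor -- callable returning an OCF exit code
    """
    if monitor() == OCF_SUCCESS:
        return OCF_SUCCESS  # already running, nothing to do
    launch()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if monitor() == OCF_SUCCESS:
            return OCF_SUCCESS  # only now is it safe to report success
        time.sleep(interval)
    return OCF_ERR_GENERIC  # service never came up within the timeout
```

With a start like this, a monitor that fails right after start points at the monitor (or at the service dying), not at a race between the two.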

Regards,
Ulrich

> 
> /Ian.






