[Pacemaker] monitor on disabled nodes

Lars Marowsky-Bree lmb at suse.com
Thu Sep 19 03:17:43 EDT 2013


On 2013-09-18T12:20:08, Radoslaw Garbacz <radoslaw.garbacz at xtremedatainc.com> wrote:

> Sorry for not being specific.
> 
> The agent is meant to run only on a specific node (the head), and by
> constraints is disabled on all other nodes.
> 
> 'pcs constraint' reports:
> Location Constraints:
>   Resource: dbx_nfs_head
>     Enabled on: ip-10-138-14-225
>     Disabled on: ip-10-151-14-34 ip-10-238-146-54

Ah, I wasn't aware that pcs had introduced the enabled/disabled
terminology for location constraints. That may be misleading, because
location constraints don't actually "disable" an agent from running
somewhere; they only ban the resource from being hosted there.

That means the cluster will still probe for it (the "monitor" interval=0
call you see) to make sure that, if it is found active, it can stop it
to bring the system into compliance with the configuration.
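
To make the probe case concrete, here is a minimal sketch of how an
agent could recognize it. It is written in Python only for brevity and
is not taken from any existing agent; is_probe() is a hypothetical
helper, and I am assuming the operation interval arrives in the
OCF_RESKEY_CRM_meta_interval environment variable, with an unset or
unparsable value treated as "not a probe":

```python
import os


def is_probe(action, env=None):
    """Return True if this invocation is a probe (a monitor with interval 0)."""
    env = os.environ if env is None else env
    # Assumption: the operation interval (in milliseconds) is exported here.
    interval = env.get("OCF_RESKEY_CRM_meta_interval")
    if action != "monitor" or interval is None:
        return False
    try:
        return int(interval) == 0
    except ValueError:
        # Assumption: an unparsable interval is treated as "not a probe".
        return False
```

Knowing that a call is a probe does not mean the check can be skipped;
it only tells the agent that a cleanly stopped or absent service is an
expected result on that node.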

> 'pcs status' reports:
> Failed actions:
>     dbx_nfs_head_monitor_0 (node=ip-10-238-146-54, call=1127, rc=6,
> status=complete): not configured
>     dbx_nfs_head_monitor_0 (node=ip-10-151-14-34, call=1127, rc=6,
> status=complete): not configured

Returning "not configured" is probably wrong here.

It should check if the service is active and then either return
OCF_SUCCESS if it's healthy or OCF_ERR_GENERIC if it is in a failed
state; or OCF_NOT_RUNNING if the agent can verify that it is indeed
cleanly stopped.

If it's not found (and thus can't be running), for a probe
OCF_ERR_INSTALLED is more appropriate - that means an issue with the
local node (such as missing binaries).

OCF_ERR_CONFIGURED means "the cluster definition of this service is
wrong" and implies the service can't be started anywhere.
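
As a rough sketch of the mapping described above (again Python only
for brevity, not any real agent's code): the numeric values are the
standard OCF exit codes, while monitor_rc() and its "healthy",
"failed" and "stopped" state names are hypothetical and stand in for
whatever checks the agent actually performs.

```python
# Standard OCF exit codes relevant to a monitor/probe call.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_INSTALLED = 5
OCF_ERR_CONFIGURED = 6  # cluster-wide misconfiguration; not for "absent on this node"
OCF_NOT_RUNNING = 7


def monitor_rc(installed, state):
    """Pick a monitor exit code from what the agent observed on this node.

    installed: whether the service's software is present locally.
    state: hypothetical label, one of "healthy", "failed", "stopped".
    """
    if not installed:
        # Local problem (e.g. binaries missing), so the service cannot be running here.
        return OCF_ERR_INSTALLED
    if state == "healthy":
        return OCF_SUCCESS
    if state == "stopped":
        # Only when the agent can verify the service is cleanly stopped.
        return OCF_NOT_RUNNING
    # Active but broken, or anything else the agent cannot classify.
    return OCF_ERR_GENERIC
```

Note that OCF_ERR_CONFIGURED (6, the rc shown in your 'pcs status'
output) never comes out of this function; it is reserved for a broken
resource definition.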

Please refer to the OCF Resource Agent Developer's Guide:
http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html

> keep it away from all other nodes, but as far as I understand, the
> pacemaker needs to check if it is running, so I would like to recognize
> this situation and skip the check on all nodes except the head.

No. Just skipping it is most likely wrong as well; you just need to
return the correct state.


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




