[ClusterLabs] service flap as nodes join and leave

Christopher Harvey cwh at eml.cc
Mon Apr 18 23:33:11 EDT 2016


On Thu, Apr 14, 2016, at 11:12 AM, Ken Gaillot wrote:
> On 04/14/2016 09:33 AM, Christopher Harvey wrote:
> > MsgBB-Active is a dummy resource that simply returns OCF_SUCCESS on
> > every operation and logs to a file.
> 
> That's a common mistake, and will confuse the cluster. The cluster
> checks the status of resources both where they're supposed to be running
> and where they're not. If status always returns success, the cluster
> won't try to start it where it should, and will continuously try to
> stop it elsewhere, because it thinks it's already running everywhere.
> 
> It's essential that an RA distinguish between running
> (OCF_SUCCESS/OCF_RUNNING_MASTER), cleanly not running (OCF_NOT_RUNNING),
> and unknown/failed (OCF_ERR_*/OCF_FAILED_MASTER).

Solved. Thanks!
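
For the archive, here is a minimal sketch of the three-way distinction
described above. It is not the actual MsgBB-Active agent: the state file
path and the msgbb_* function names are made up for illustration, and the
return codes are defined by hand only so the snippet runs on its own (a
real agent would get them from ocf-shellfuncs). The idea is that monitor
reports success only where start has actually been called.

```shell
#!/bin/sh
# Sketch only. A real agent would source ocf-shellfuncs for these codes
# instead of defining them by hand.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical state file location (an assumption for this sketch).
STATE_FILE="${STATE_FILE:-${TMPDIR:-/tmp}/MsgBB-Active.state}"

msgbb_monitor() {
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS       # started on this node
    fi
    if [ -e "$STATE_FILE" ]; then
        return $OCF_ERR_GENERIC   # path exists but is not a regular file
    fi
    return $OCF_NOT_RUNNING       # cleanly stopped on this node
}

msgbb_start() {
    # Mark the resource as running by creating the state file.
    touch "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

msgbb_stop() {
    # Stopping an already-stopped resource still counts as success.
    rm -f "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

# Dispatch on the requested action; any other argument is ignored here.
case "${1:-}" in
    start)   msgbb_start ;;
    stop)    msgbb_stop ;;
    monitor) msgbb_monitor ;;
esac
```

With this shape, a probe on a node where the resource was never started
returns OCF_NOT_RUNNING instead of OCF_SUCCESS, so the cluster no longer
believes the resource is active everywhere.
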

> See pacemaker's Dummy agent as an example/template:
> 
> https://github.com/ClusterLabs/pacemaker/blob/master/extra/resources/Dummy
> 
> It touches a temporary file to know whether it is "running" or not.
> 
> ocf-shellfuncs has a ha_pseudo_resource() function that does the same
> thing. See the ocf:heartbeat:Delay agent for example usage.
