[Pacemaker] Time-based resource stickiness not working cleanly

Wed Jun 27 16:06:39 EDT 2012

I created a simple IPaddr2 resource for this testing.

1. I can confirm that this resource (with just a location preference constraint and nothing else) does not get restarted when a node reboots. So the problem must be something with the clone/group resources. I am not sure how to debug that right now.
2. I have a time-based rule which prevents any resource in the cluster from going back to the preferred node between 7am and 3pm, Monday through Friday.
	I can confirm that in my setup this does not work reliably.
		The resource fails over to the non-preferred node, but stays there even when the current time is outside of the core business hours.
		If I manually stop and start the resource, it respects the location preference and goes to the preferred node.
		I have the cluster-recheck-interval set to 1min.
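
To make the intended window concrete, here is a minimal Python sketch of the check the rule is meant to express. It only illustrates the behaviour I expect and is not how Pacemaker evaluates rules. The function names are hypothetical, and treating 3pm as an exclusive end of the window is an assumption on my part.

```python
from datetime import datetime

# Window taken from the rule described above: 7am to 3pm, Monday through Friday.
CORE_START_HOUR = 7   # inclusive
CORE_END_HOUR = 15    # exclusive (assumption: 15:00 itself is outside the window)


def in_core_hours(now: datetime) -> bool:
    """Return True if `now` is Monday-Friday and between 07:00 and 14:59."""
    is_weekday = now.weekday() < 5  # datetime.weekday(): Monday is 0, Friday is 4
    return is_weekday and CORE_START_HOUR <= now.hour < CORE_END_HOUR


def failback_allowed(now: datetime) -> bool:
    """Hypothetical helper: failback to the preferred node is only
    expected outside the core business hours."""
    return not in_core_hours(now)
```

By this check, the time of this message (Wednesday, 16:06) is outside the window, so `failback_allowed(datetime(2012, 6, 27, 16, 6, 39))` returns `True` and I would expect the resource to have moved back by the next recheck.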

Thanks,
Prakash

On Jun 27, 2012, at 3:01 PM, Velayutham, Prakash wrote:

> I am all for testing, but it looks like our database person wants this completed now. I will test this in our dev environment soon.
> 
> Thanks,
> Prakash
> 
> On Jun 27, 2012, at 2:44 PM, Phil Frost wrote:
> 
>> On 06/27/2012 02:33 PM, Velayutham, Prakash wrote:
>>> and the cluster works fine, except that when the fenced (STONITHed) node comes back up and joins the cluster, all resources (including the one that is running in its preferred location) get restarted.
>>> 
>>> This is annoying and I am trying to find out why this is. I started another thread for that exact issue this morning and I have shared the relevant portions of my CIB there. Please let me know if you see anything there that could cause this.
>> 
>> Your best approach is probably to simplify the configuration as much as you can. Eliminate any resources or constraints that aren't necessary to demonstrate the problem, and use ocf:pacemaker:Dummy and ocf:pacemaker:Stateful resources instead of the real ones. In simplifying the configuration, you may discover a mistake. If not, you will have something that doesn't take so much time to consider, so more people will be willing to help you.
>> 
>> Also, play with crm_simulate.
>> 
>> Also see my previous response about a similar problem I had, and a suspected bug that may be affecting you. If you can independently confirm the bug, after working out a minimal test case as above, it's more likely someone will fix it.