[ClusterLabs] Master-Slave resource Restarted after configuration change

Wed Jun 29 16:00:33 EDT 2016

On 06/29/2016 01:35 PM, Ilia Sokolinski wrote:
> 
>>
>> I'm not sure there's a way to do this.
>>
>> If a (non-reloadable) parameter changes, the entire clone does need a
>> restart, so the cluster will want all instances to be stopped, before
>> proceeding to start them all again.
>>
>> Your desired behavior couldn't be the default, because not all services
>> would be able to function correctly with a running master using
>> different configuration options than running slaves. In fact, I think it
>> would be rare; consider a typical option for a TCP port -- changing the
>> port in only the slaves would break communication with the master and
>> potentially lead to data inconsistency.
>>
>> Can you give an example of an option that could be handled this way
>> without causing problems?
>>
>> Reload could be a way around this, but not in the way you suggest. If
>> your service really does need to restart after the option change, then
>> reload is not appropriate. However, if you can approach the problem on
>> the application side, and make it able to accept the change without
>> restarting, then you could implement it as a reload in the agent.
>>
> 
> Ken,
> 
> I see what you are saying.
> The parameter we are changing is the docker image version, so it is not possible to Reload it without a restart.
> 
> A couple of questions:
> What is a reloadable vs. a non-reloadable parameter? Is it the same as unique="0" vs. unique="1"?
> We currently set unique="0".

Yes, the cluster considers any parameter with unique=0 as reloadable, if
the resource agent supports the reload action.
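
As a rough illustration of that rule (a sketch, not Pacemaker's actual
code), the following Python parses a made-up, heavily trimmed OCF-style
metadata document and lists the parameters that would count as
reloadable. The agent XML, the parameter names "image" and "port", and
the function name reloadable_params are all hypothetical, and the sketch
assumes unique is written literally as "0" or "1".

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down OCF-style metadata for an imaginary agent.
METADATA = """
<resource-agent name="example">
  <parameters>
    <parameter name="image" unique="0"/>
    <parameter name="port" unique="1"/>
  </parameters>
  <actions>
    <action name="start"/>
    <action name="stop"/>
    <action name="reload"/>
  </actions>
</resource-agent>
"""


def reloadable_params(metadata_xml):
    """Return names of parameters treated as reloadable under the rule above."""
    root = ET.fromstring(metadata_xml.strip())
    actions = {action.get("name") for action in root.iter("action")}
    if "reload" not in actions:
        # Without a reload action in the agent, no parameter is reloadable.
        return []
    return [
        param.get("name")
        for param in root.iter("parameter")
        if param.get("unique") == "0"
    ]


print(reloadable_params(METADATA))
```

With the sample metadata, only "image" qualifies; removing the reload
action from the actions list makes the result empty.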

> When doing repeated experiments, I see that sometimes both Master and Slave are reloaded, but sometimes one of them is restarted.
> 
> Why is that?

Good question. I would expect all or no instances of the same clone to
be reloaded.

An otherwise reloadable change may get a restart if there is also a
non-reloadable parameter changing at the same time. Also, if the
reloadable resource is ordered after another resource that is being
restarted, it will get a restart.
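
The conditions above can be written down as a small decision function.
This is a simplified model for illustration only, not the logic in
Pacemaker's source; the function name planned_action and its arguments
are hypothetical.

```python
def planned_action(changed, reloadable, agent_supports_reload,
                   ordered_after_restarting=False):
    """Return "none", "reload", or "restart" for one resource.

    changed: set of parameter names whose values changed
    reloadable: set of parameter names considered reloadable (unique=0)
    agent_supports_reload: whether the agent advertises a reload action
    ordered_after_restarting: whether this resource is ordered after
        another resource that is itself being restarted
    """
    if ordered_after_restarting:
        # Ordering after a restarting resource forces a restart.
        return "restart"
    if not changed:
        return "none"
    if not agent_supports_reload:
        return "restart"
    if all(name in reloadable for name in changed):
        return "reload"
    # At least one non-reloadable parameter changed along with the rest.
    return "restart"


reloadable = {"image"}
print(planned_action({"image"}, reloadable, True))
print(planned_action({"image", "port"}, reloadable, True))
print(planned_action({"image"}, reloadable, True, ordered_after_restarting=True))
```

The three calls cover a purely reloadable change, a mixed change, and a
reloadable change on a resource ordered after one that is restarting.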

As an aside, I'm not happy with the current implementation of reload.
Using "unique" to determine reloadability was not a good choice; it
should be a separate attribute. More importantly, there's a fundamental
misunderstanding between pacemaker's use of reload and how most resource
agent writers interpret it -- pacemaker calls it when a resource
parameter in the pacemaker configuration changes, but most RAs use it
for a service's native reload of its own configuration file. Those two
use cases need to be separated.

> I looked at the source code, allocate.c:check_action_definition(), and it seems that there is a meta parameter
> called "isolation" which affects the reload vs. restart decision.
> 
> I can’t find any documentation about this “isolation” meta parameter.
> Do you know what it is intended for?

That is a great feature that, unfortunately, completely lacks
documentation and testing. It's a way to run cluster-managed services
inside a Docker container. Documentation/testing are on the to-do list,
but it's a long list ...

> Thanks a lot
> 
> Ilia