[ClusterLabs] Master-Slave resource Restarted after configuration change
kgaillot at redhat.com
Wed Jun 29 16:00:33 EDT 2016
On 06/29/2016 01:35 PM, Ilia Sokolinski wrote:
>> I'm not sure there's a way to do this.
>> If a (non-reloadable) parameter changes, the entire clone does need a
>> restart, so the cluster will want all instances to be stopped, before
>> proceeding to start them all again.
>> Your desired behavior couldn't be the default, because not all services
>> would be able to function correctly with a running master using
>> different configuration options than running slaves. In fact, I think it
>> would be rare; consider a typical option for a TCP port -- changing the
>> port in only the slaves would break communication with the master and
>> potentially lead to data inconsistency.
>> Can you give an example of an option that could be handled this way
>> without causing problems?
>> Reload could be a way around this, but not in the way you suggest. If
>> your service really does need to restart after the option change, then
>> reload is not appropriate. However, if you can approach the problem on
>> the application side, and make it able to accept the change without
>> restarting, then you could implement it as a reload in the agent.
> I see what you are saying.
> The parameter we are changing is the docker image version, so it is not possible to Reload it without a restart.
> Couple of questions:
> What is a reloadable vs. a non-reloadable parameter? Is it the same as unique="0" vs. unique="1"?
> We currently set unique="0".
Yes, the cluster considers any parameter with unique=0 as reloadable, if
the resource agent supports the reload action.
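To make that rule concrete, here is a rough sketch (not pacemaker's actual code) that reads an agent's meta-data and lists the parameters the cluster would treat as reloadable. The agent name, parameter names, and the helper reloadable_parameters() are made up for illustration, and treating a missing unique attribute as 0 is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical agent meta-data, trimmed to the parts that matter here.
METADATA = """
<resource-agent name="example-agent">
  <parameters>
    <parameter name="image" unique="0"/>
    <parameter name="instance_name" unique="1"/>
  </parameters>
  <actions>
    <action name="start" timeout="20"/>
    <action name="stop" timeout="20"/>
    <action name="reload" timeout="20"/>
  </actions>
</resource-agent>
"""

def reloadable_parameters(metadata_xml):
    """Return the names of parameters with unique=0, but only if the
    agent advertises a reload action; otherwise return an empty set."""
    root = ET.fromstring(metadata_xml.strip())
    actions = {a.get("name") for a in root.iter("action")}
    if "reload" not in actions:
        return set()
    # Assumption: a missing "unique" attribute counts as unique=0.
    return {p.get("name") for p in root.iter("parameter")
            if p.get("unique", "0") == "0"}

print(reloadable_parameters(METADATA))
```

With this sample meta-data only "image" qualifies; if the reload action is dropped from the actions list, nothing does.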
> When doing repeated experiments, I see that sometimes both Master and Slave are Reload-ed, but sometimes one of them is Restart-ed.
> Why is that?
Good question. I would expect all or no instances of the same clone to
be reloaded.
An otherwise reloadable change may get a restart if there is also a
non-reloadable parameter changing at the same time. Also, if the
reloadable resource is ordered after another resource that is being
restarted, it will get a restart.
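Those two conditions can be written down as a small decision function. This is only a simplified model of the behavior described above, not the logic in pacemaker's source; the function name and its arguments are hypothetical.

```python
def reload_or_restart(changed, reloadable, ordered_after_restarting=False):
    """Decide what happens to one resource instance.

    changed: names of parameters whose values changed
    reloadable: names of parameters the cluster treats as reloadable
    ordered_after_restarting: True if this resource is ordered after
        another resource that is itself being restarted
    """
    # Being ordered after a restarting resource forces a restart.
    if ordered_after_restarting:
        return "restart"
    if not changed:
        return "none"
    # A reload is only possible if every changed parameter is reloadable.
    if set(changed) <= set(reloadable):
        return "reload"
    return "restart"

print(reload_or_restart({"image"}, {"image"}))
print(reload_or_restart({"image", "port"}, {"image"}))
```

The first call models a change to a reloadable parameter alone; the second adds a non-reloadable one, which turns the whole change into a restart.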
As an aside, I'm not happy with the current implementation of reload.
Using "unique" to determine reloadability was not a good choice; it
should be a separate attribute. More importantly, there's a fundamental
misunderstanding between pacemaker's use of reload and how most resource
agent writers interpret it -- pacemaker calls it when a resource
parameter in the pacemaker configuration changes, but most RAs use it
for a service's native reload of its own configuration file. Those two
use cases need to be separated.
> I looked at the source code, allocate.c:check_action_definition(), and it seems that there is a meta parameter
> called “isolation” which affects the Reload vs. Restart decision.
> I can’t find any documentation about this “isolation” meta parameter.
> Do you know what it is intended for?
That is a great feature that, unfortunately, completely lacks
documentation and testing. It's a way to run cluster-managed services
inside a Docker container. Documentation/testing are on the to-do list,
but it's a long list ...
> Thanks a lot