[ClusterLabs] Antw: Re: Best-practices for changing networks settings in a cluster?

Roger Z. zzhou at suse.com
Tue Nov 6 05:49:12 EST 2018



On Nov 6, 2018, at 2:59 PM, Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote:

>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 06.11.2018 at 00:12 in message
> <1541459570.5061.11.camel at redhat.com>:
>>> On Mon, 2018-11-05 at 16:14 -0600, Ryan Thomas wrote:
>>> I have a two node cluster.  I restart the network after making
>>> changes to the network settings.  But, as soon as I restart the
>>> network I see that corosync/pacemaker are killed - causing resources
>>> to fail over to the other node.  It looks like this is due to
>>> https://github.com/corosync/corosync/issues/348, which points out that
>>> corosync cannot handle downing a network interface with ifdown.  I'd
>>> like to avoid this, but I'd still like to be able to change the
>>> network settings. What is the best-practice for changing network
>>> settings on a cluster?  
>>> 
>>> The best workaround I can think of is to kill pacemaker on each
>>> node, make the network changes, and then restart pacemaker. 
>>> However, this seems pretty ugly and error-prone.  Is there a way to
>>> "pause" pacemaker for the whole cluster?  
>>> 
>>> Thanks in advance for your advice.
>>> Ryan
>> 
>> Hi,
>> 
>> Yes, maintenance mode is exactly for this purpose. You can set the
>> maintenance-mode cluster property to true, stop pacemaker and corosync,
>> update the network, start corosync and pacemaker, then set
>> maintenance-mode back to false.
> 
> Hi!
> 
> Does this still hold when running DLM, cLVM and/or OCFS2? In my experience the nodes were still fenced...
> 


As long as the maintenance event is planned carefully, users can:

- enable maintenance-mode,
- disable stonith as well, and
- revert both settings once the work is done.

The advantage is that the application stack stays up and does not get rebooted, which can be very time-consuming.

That said, users need to understand the limitations during that time:
1) The cluster has no Pacemaker protection for HA.
2) DLM locks will hang for the applications that use them, e.g. clvm and ocfs2.
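
For illustration only, the sequence discussed in this thread (maintenance-mode on, stonith off, stop pacemaker and corosync, change the network, start them again, revert the settings) could be sketched roughly as below. This is a minimal dry-run sketch, not a tested procedure. It assumes that the services are managed by systemd under the unit names pacemaker and corosync, that stonith was enabled beforehand, and that crm_attribute is used to set the cluster properties. The function names and the /usr/local/sbin/apply-network-change path are hypothetical placeholders for whatever site-specific step changes the network.

```python
import shlex
import subprocess


def set_property(name, value):
    # crm_attribute writes a cluster-wide property into the crm_config section.
    return ["crm_attribute", "--type", "crm_config",
            "--name", name, "--update", value]


def build_plan(network_change):
    """Return the ordered commands as argv lists.

    network_change is the site-specific command (hypothetical here)
    that actually reconfigures the network on this node.
    """
    return [
        set_property("maintenance-mode", "true"),
        set_property("stonith-enabled", "false"),
        # Assumption: systemd units named pacemaker and corosync.
        ["systemctl", "stop", "pacemaker"],
        ["systemctl", "stop", "corosync"],
        list(network_change),
        ["systemctl", "start", "corosync"],
        ["systemctl", "start", "pacemaker"],
        # Revert only after the node has rejoined the cluster;
        # this sketch does not check for that.
        set_property("stonith-enabled", "true"),
        set_property("maintenance-mode", "false"),
    ]


def run_plan(plan, dry_run=True):
    """Return the commands as shell-quoted strings; execute them if not dry_run."""
    lines = []
    for argv in plan:
        lines.append(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    # Hypothetical placeholder for the real network change.
    for line in run_plan(build_plan(["/usr/local/sbin/apply-network-change"])):
        print(line)
```

By default nothing is executed and the commands are only printed, so the order can be reviewed first. The two limitations above still apply while the plan runs: no Pacemaker protection, and DLM users such as clvm and ocfs2 may hang.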

Cheers,
Roger


> Regards,
> Ulrich
> 
>> -- 
>> Ken Gaillot <kgaillot at redhat.com>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org 
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>> 
>> Project Home: http://www.clusterlabs.org 
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> Bugs: http://bugs.clusterlabs.org 



