[ClusterLabs] Re: changing on-fail action default
Nikola Ciprich
nikola.ciprich at linuxbox.cz
Sun Oct 22 08:44:57 EDT 2017
OK, thanks to all of you guys!
this did it (via crm):
op_defaults op_defaults-options: \
on-fail=block
cheers!
nik
On Thu, Oct 19, 2017 at 09:21:10AM -0500, Ken Gaillot wrote:
> On Thu, 2017-10-19 at 17:08 +0900, Christian Balzer wrote:
> > On Thu, 19 Oct 2017 09:57:31 +0200 Ulrich Windl wrote:
> >
> > > > > > Nikola Ciprich <nikola.ciprich at linuxbox.cz> wrote on
> > > > > > 19.10.2017 at 09:46 in
> > >
> > > message <20171019074630.GA23856 at pcnci.linuxbox.cz>:
> > > > Hi fellow pacemaker users,
> > >
> > > Hi!
> > >
> > > >
> > > > I'd like to ask if it is possible to change the default on-fail
> > > > action. I don't want it to be "fence" but "block", even for
> > > > clusters with fencing.
>
> Yes, there is a rsc_defaults section to set defaults for resource
> attributes, and an op_defaults section to set defaults for resource
> operation attributes (including on-fail). The command you use to set
> them depends on what tool you're using (e.g., with pcs, see "pcs
> resource op defaults").
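As an illustration of the above (syntax may vary by pcs/crmsh version;
these commands are a sketch, not from the original thread):

    # cluster-wide operation default, with pcs:
    pcs resource op defaults on-fail=block

    # or the equivalent with crmsh:
    crm configure op_defaults on-fail=block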
>
> > >
> > > This would mean any cluster with a problem would require manual
> > > intervention!
> > >
> > > >
> > > > but I don't want to have to change it for each resource...
> > > >
> > > > is it possible to set a global default?
> > >
> > > See above. Still, I understand what you are asking for. What's
> > > missing in pacemaker is a "time to fix the mess" interval (I
> > > vaguely remember HP-UX ServiceGuard had such a thing): if the
> > > cluster detects a problem that would cause a node fencing, it
> > > waits to see whether things change within some seconds or minutes,
> > > and then (if things are still bad) the node is fenced. However, if
> > > the reason for fencing is no longer there, no fencing will be
> > > done...
>
> You can set delays on fence actions. Some fence agents have delay
> parameters themselves, or you can set it at the pacemaker level with
> pcmk_delay_max (for a random delay) or (with the forthcoming 1.1.18)
> pcmk_delay_base (for a fixed delay). So, you could even set
> pcmk_delay_base=60m to wait an hour before executing fencing.
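For illustration, a sketch of setting such a fixed delay on an existing
fence device (the device name "fence-node1" is a placeholder, and
pcmk_delay_base needs 1.1.18 or later):

    # give the fence device a fixed 60-minute delay before it fires
    pcs stonith update fence-node1 pcmk_delay_base=60m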
>
> > > As far as I understand pacemaker, a fencing request cannot be
> > > revoked once issued (it's in the queue of actions).
>
> Correct, so even with the delay, fencing would eventually be done, but
> it would give you time to investigate and prepare for the shutdown.
>
> There has been some discussion recently about allowing fencing to be
> cancelled under certain situations. The easiest to implement would be
> to be able to cancel fencing if it's in the delay period (so, no
> commands have been sent yet to any devices). The idea that was
> discussed was to cancel any delayed operations when a fence device is
> disabled in the configuration.
>
> > Yeah, that strictly sequential operation can be a major PITA,
> > especially if the reason for whatever action has long since gone.
> >
> > Christian
>
> Even with any of the above suggestions, there will always have to be a
> strictness about fencing before recovery. If the cluster can't
> communicate with the node, fencing is the only way to be sure it's
> unable to cause conflicts.
>
> But, it's fine for "fencing" to be manual, i.e. having an admin
> manually investigate, reboot the machine, and use
> stonith_admin --confirm to say that fencing has been done.
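For example (the node name is a placeholder), after manually verifying
that the node is really down:

    # tell the cluster that node1 has already been fenced by hand
    stonith_admin --confirm node1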
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava
tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis at linuxbox.cz
-------------------------------------