[ClusterLabs] Antw: Re: questions about startup fencing

Jan Pokorný jpokorny at redhat.com
Tue Dec 5 10:09:39 UTC 2017


On 05/12/17 10:01 +0100, Tomas Jelinek wrote:
> The first attempt to fix the issue was to put nodes into standby mode with
> --lifetime=reboot:
> https://github.com/ClusterLabs/pcs/commit/ea6f37983191776fd46d90f22dc1432e0bfc0b91
> 
> This didn't work for several reasons. One of them was that back then there was no
> reliable way to set standby mode with --lifetime=reboot for more than one
> node in a single step. (This may have been fixed in the meantime.) There
> were however other serious reasons for not putting the nodes into standby as
> was explained by Andrew:
> - it [putting the nodes into standby first] means shutdown takes longer (no
> node stops until all the resources stop)
> - it makes shutdown more complex (== more fragile), eg...
> - it results in pcs waiting forever for resources to stop
>   - if a stop fails and the cluster is configured to start at boot, then the
>     node will get fenced and happily run resources when it returns
>     (because all the nodes are up so we still have quorum)

Isn't a one-off stop of the cluster, without also disabling the cluster
software from starting at boot, rather self-contradictory?

And besides, isn't this resurrection scenario also possible with the
current parallel (hence subject to race conditions) stop in such a case
anyway?

> - only potentially benefits resources that have no (or very few) dependants
> and can stop quicker than it takes pcs to get through its "initiate parallel
> shutdown" loop (which should be rather fast since there is no ssh connection
> setup overheads)
> 
> So we ended up with just stopping pacemaker in parallel:
> https://github.com/ClusterLabs/pcs/commit/1ab2dd1b13839df7e5e9809cde25ac1dbae42c3d
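
For illustration only, the "initiate parallel shutdown" idea quoted above
could be sketched roughly as follows. This is not the pcs implementation:
stop_all_in_parallel and the stop_node callable are hypothetical names,
with stop_node standing in for whatever actually asks a single node to
stop pacemaker. The sketch only shows that every stop is initiated
without waiting for the others, and that a failed stop on one node is
reported rather than blocking the rest.

```python
from concurrent.futures import ThreadPoolExecutor


def stop_all_in_parallel(nodes, stop_node):
    """Call stop_node(node) for every node concurrently.

    stop_node is a hypothetical callable (an assumption, not a pcs API)
    that requests a pacemaker stop on one node and raises on failure.
    Returns a dict mapping each node to None on success, or to the
    exception raised while stopping it.
    """
    results = {}
    if not nodes:
        return results
    # One worker per node, so no stop request waits behind another.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = {node: pool.submit(stop_node, node) for node in nodes}
        for node, future in futures.items():
            # exception() blocks until that node's call has finished.
            results[node] = future.exception()
    return results
```

With a stop_node that fails for one node, the other nodes still get
their stop request, which is where the race mentioned above comes from:
nothing orders the individual stops relative to each other.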

-- 
Jan (Poki)