[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] delaying start of a resource

Gabriele Bulfon gbulfon at sonicle.com
Thu Dec 17 11:55:26 EST 2020


Would a change of network class on one node be ok?
 
 
Sonicle S.r.l. : http://www.sonicle.com
Music: http://www.gabrielebulfon.com
eXoplanets : https://gabrielebulfon.bandcamp.com/album/exoplanets
 




----------------------------------------------------------------------------------

From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
To: users at clusterlabs.org 
Date: 17 December 2020 12:26:29 CET
Subject: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] delaying start of a resource


>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 17.12.2020 at 09:14 in
message <2080536991.1106.1608192888030 at www>:
> I see, but then I have two issues:
> 
> 1. It is a dual-node server and the HA interface is internal, so I have no 
> way to unplug it; that's why I tried turning it down.

You could block traffic using iptables or a "blackhole" route for example.
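
As a rough sketch of that suggestion, assuming a Linux node with iptables and iproute2 (other platforms use a different packet filter and route syntax). The peer address 10.0.0.2 and the DRY_RUN switch are made up for illustration; by default the functions only print the commands they would run.

```shell
#!/bin/sh
# Sketch: simulate loss of the HA link without taking the interface down.
# Assumptions: Linux with iptables and iproute2; PEER is a hypothetical
# address of the other node on the HA network.
PEER="${PEER:-10.0.0.2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise (needs root).
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Variant 1: drop all packets from and to the peer.
block_with_iptables() {
    run iptables -A INPUT -s "$PEER" -j DROP
    run iptables -A OUTPUT -d "$PEER" -j DROP
}

# Remove the rules added by variant 1.
unblock_iptables() {
    run iptables -D INPUT -s "$PEER" -j DROP
    run iptables -D OUTPUT -d "$PEER" -j DROP
}

# Variant 2: discard outgoing traffic to the peer with a blackhole route.
block_with_blackhole() {
    run ip route add blackhole "$PEER/32"
}

# Remove the route added by variant 2.
unblock_blackhole() {
    run ip route del blackhole "$PEER/32"
}
```

Sourced into a shell, `block_with_iptables` prints the two rules; with DRY_RUN=0 and root privileges it applies them, and `unblock_iptables` removes them again. Note that a blackhole route only discards packets sent to the peer, while packets from the peer still arrive.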

> 
> 2. Even if I could test it by unplugging it, there is still the 
> possibility that someone turns the interface down, causing a bad situation 
> for the zpool... so I would like to understand why xstha2 decided to bring 
> up the IP and zpool when the stonith of xstha1 was not yet done...

What should HA software do when an admin turns down the interface?
I'm afraid there is no HA software that protects against administrator errors.
It's important to understand that HA software helps against hardware or software failures, but not against configuration errors (which an ifdown is).

> 
> ----------------------------------------------------------------------------------
> 
> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
> To: users at clusterlabs.org 
> Date: 17 December 2020 7:48:46 CET
> Subject: [ClusterLabs] Antw: Re: Antw: [EXT] delaying start of a resource
> 
> 
>>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 16.12.2020 at 15:56 in
> message <386755316.773.1608130588146 at www>:
>> Thanks, here are the logs; they show how it tried to start resources on 
>> the nodes.
>> Keep in mind that node1 was already running the resources, and I simulated a 
>> problem by turning down the HA interface.
> 
> Please note that "turning down" an interface is NOT a realistic test; 
> realistic would be to unplug the cable.
> 
>> 
>> Gabriele
>> 
>> ----------------------------------------------------------------------------------
>> 
>> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
>> To: users at clusterlabs.org 
>> Date: 16 December 2020 15:45:36 CET
>> Subject: [ClusterLabs] Antw: [EXT] delaying start of a resource
>> 
>> 
>>>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 16.12.2020 at 15:32 in
>> message <1523391015.734.1608129155836 at www>:
>>> Hi, I now have a two-node cluster using stonith with different 
>>> pcmk_delay_base values, so that node 1 has priority to stonith node 2 in 
>>> case of problems.
>>> 
>>> However, there is still one problem: while node 2 delays its stonith action 
>>> for 10 seconds and node 1 for just 1, node 2 does not delay the start of 
>>> resources. So while it has not yet been powered off by node 1 (and is still 
>>> waiting out its own delay before powering off node 1), it actually starts 
>>> the resources, causing a window of a few seconds in which both the NFS IP 
>>> and the ZFS pool (!!!!!) are active on both nodes!
>> 
>> AFAIK pacemaker will not start resources on a node that is scheduled for 
>> stonith. Even more: Pacemaker will try to stop resources on a node scheduled 
>> for stonith in order to start them elsewhere.
>> 
>>> How can I delay node 2 resource start until the delayed stonith action is 
>>> done? Or how can I just delay the resource start so that the delay is 
>>> larger than its pcmk_delay_base?
>> 
>> We probably need to see logs and configs to understand.
>> 
>>> 
>>> Also, it was suggested that I set "stonith-enabled=true", but I don't know 
>>> where to set this flag (cib-bootstrap-options is not happy with it...).
>> 
>> I think it's on by default, so you must have set it to false.
>> In crm shell it is "configure# property stonith-enabled=...".
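
For reference, a sketch of how that property could be set and checked from a regular shell prompt. It assumes that crmsh (the `crm` command) or Pacemaker's own `crm_attribute` tool is installed; the DRY_RUN wrapper is only an illustration and by default just prints the commands.

```shell
#!/bin/sh
# Sketch: enable fencing cluster-wide via the stonith-enabled property.
# DRY_RUN=1 (default, hypothetical switch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# With crmsh, as a one-shot command instead of the interactive prompt.
enable_stonith_crmsh() {
    run crm configure property stonith-enabled=true
}

# With Pacemaker's low-level tool, writing the cluster option directly.
enable_stonith_crm_attribute() {
    run crm_attribute --type crm_config --name stonith-enabled --update true
}

# Query the current value of the property.
show_stonith_enabled() {
    run crm_attribute --type crm_config --name stonith-enabled --query
}
```

Either of the two enable functions should be enough on its own; which one applies depends on the tools shipped with the installation.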
>> 
>> Regards,
>> Ulrich
>> 
>> 
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>> 
>> ClusterLabs home: https://www.clusterlabs.org/ 
> 

