[ClusterLabs] Antw: Re: Antw: [EXT] delaying start of a resource

Gabriele Bulfon gbulfon at sonicle.com
Thu Dec 17 03:14:48 EST 2020


I see, but then I have two issues:
 
1. It is a dual-node server and the HA interface is internal, so I have no way to unplug it; that's why I tried turning it down.
 
2. Even if I could test it by unplugging it, there is still the possibility that someone turns the interface down, causing a bad situation for the zpool... so I would like to understand why xstha2 decided to bring up the IP and the zpool when the stonith of xstha1 was not yet done...
 
 
Sonicle S.r.l. : http://www.sonicle.com
Music: http://www.gabrielebulfon.com
eXoplanets : https://gabrielebulfon.bandcamp.com/album/exoplanets
 




----------------------------------------------------------------------------------

From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
To: users at clusterlabs.org
Date: 17 December 2020 7:48:46 CET
Subject: [ClusterLabs] Antw: Re: Antw: [EXT] delaying start of a resource


>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 16.12.2020 at 15:56 in
message <386755316.773.1608130588146 at www>:
> Thanks, here are the logs; they show how it tried to start
> resources on the nodes.
> Keep in mind that node1 was already running the resources, and I simulated a
> problem by turning down the HA interface.

Please note that "turning down" an interface is NOT a realistic test; realistic would be to unplug the cable.

> 
> Gabriele
> 
> 
> ----------------------------------------------------------------------------------
> 
> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
> To: users at clusterlabs.org
> Date: 16 December 2020 15:45:36 CET
> Subject: [ClusterLabs] Antw: [EXT] delaying start of a resource
> 
> 
>>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 16.12.2020 at 15:32 in
> message <1523391015.734.1608129155836 at www>:
>> Hi, I now have a two-node cluster using stonith with different
>> pcmk_delay_base values, so that node 1 has priority to stonith node 2 in
>> case of problems.
>> 
>> Though, there is still one problem: node 2 delays its stonith action
>> for 10 seconds and node 1 for just 1, but node 2 does not delay the start
>> of resources. So while it has not yet been powered off by node 1 (and is
>> still waiting out its delay before powering off node 1), it actually starts
>> resources, causing a few seconds where both the NFS IP and the ZFS pool
>> (!!!!!) are active on both nodes!
> 
> AFAIK Pacemaker will not start resources on a node that is scheduled for
> stonith. Even more: Pacemaker will try to stop resources on a node scheduled
> for stonith in order to start them elsewhere.
> 
>> How can I delay node 2 resource start until the delayed stonith action is
>> done? Or how can I just delay the resource start so I can make it larger
>> than its pcmk_delay_base?
> 
> We probably need to see logs and configs to understand.
> 
>> 
>> Also, it was suggested that I set "stonith-enabled=true", but I don't know
>> where to set this flag (cib-bootstrap-options is not happy with it...).
> 
> I think it's on by default, so you must have set it to false.
> In crm shell it is "configure# property stonith-enabled=...".
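> 
> A minimal sketch of that check from the command line, assuming crmsh is
> installed and the commands are run as root on one cluster node (nothing here
> is taken from the poster's actual configuration):
> 
> ```shell
> # Show the current cluster configuration; look for a stonith-enabled
> # entry among the cluster properties (cib-bootstrap-options).
> crm configure show
> 
> # Turn fencing on explicitly (Pacemaker treats it as true when unset).
> crm configure property stonith-enabled=true
> ```
> 
> If the property shows up as false in the first command's output, it was set
> that way at some point and the second command reverts it.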
> 
> Regards,
> Ulrich
> 
> 
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 





