[ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Jul 29 03:26:39 EDT 2020
>>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 29.07.2020 at 08:01 in
message <603366395.379.1596002482554 at www>:
> That one was taken from a specific implementation on Solaris 11.
> The situation is a dual node server with shared storage controller: both
> nodes see the same disks concurrently.
You mean you have a dual-controller setup (one controller on each node, both
connected to the same bus)? If so, use sbd!
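For readers following along, a minimal sbd setup on a shared disk might look like the sketch below. This is a hedged, Linux-centric illustration (sbd may not be packaged for the poster's Solaris/illumos platform), and the device path and resource name are placeholders, not taken from the thread:

```shell
# Initialize the sbd message slots on the shared LUN (placeholder path).
sbd -d /dev/disk/by-id/shared-lun create

# Point the sbd daemon at the device, e.g. in /etc/sysconfig/sbd:
#   SBD_DEVICE="/dev/disk/by-id/shared-lun"

# Register a fencing resource for it (pcs syntax; crmsh equivalents exist):
pcs stonith create sbd-fence fence_sbd devices=/dev/disk/by-id/shared-lun
```

Because both nodes see the same disks, sbd gives poison-pill fencing through the storage itself, with no dependency on the network path that an ssh-based agent needs.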
> Here we must be sure that the two nodes are never going to import/mount the
> same zpool at the same time, or we will encounter data corruption: node 1
> will be preferred for pool 1, node 2 for pool 2. Only when one of the nodes
> goes down or is taken offline should the resources first be freed by the
> leaving node and then taken over by the other node.
>
> Would you suggest one of the available stonith in this case?
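The per-pool node preference described above is normally expressed with location constraints rather than inside the fence agent. A hedged pcs sketch, where the resource names `zpool1`/`zpool2` are hypothetical (e.g. managed by a ZFS resource agent), might be:

```shell
# Prefer node 1 for pool 1 and node 2 for pool 2 (scores are illustrative):
pcs constraint location zpool1 prefers xstha1=100
pcs constraint location zpool2 prefers xstha2=100
```

Note that constraints only express preference; it is fencing that actually guarantees a pool is never imported on both nodes at once.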
>
> Thanks!
> Gabriele
>
>
>
> Sonicle S.r.l.
> :
> http://www.sonicle.com
> Music:
> http://www.gabrielebulfon.com
> Quantum Mechanics :
> http://www.cdbaby.com/cd/gabrielebulfon
>
> ------------------------------------------------------------------------------
> From: Strahil Nikolov
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Gabriele Bulfon
> Date: 29 July 2020 6.39.08 CEST
> Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
> Do you have a reason not to use any of the stonith agents already available?
> Best Regards,
> Strahil Nikolov
> On 28 July 2020 at 13:26:52 GMT+03:00, Gabriele Bulfon
> wrote:
> Thanks, I'm attaching the script here.
> It basically runs ssh on the other node without a password (must be
> preconfigured via authorized keys) to issue commands.
> This was taken from a script by OpenIndiana (I think).
> As stated in the comments, we don't want to halt or boot via ssh,
> only reboot.
> Maybe this is the problem; we should at least have it shut down when
> asked to.
>
> Actually, if I stop corosync on node 2, I don't want it to shut down the
> system; node 1 should just keep control of all resources.
> Similarly, if I shut down node 2 manually,
> node 1 should keep control of all resources and release them back on
> reboot.
> Instead, when I stopped corosync on node 2, the log showed an
> attempt to stonith node 2: why?
>
> Thanks!
> Gabriele
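One hedged note on the question above: stopping corosync out from under a still-running pacemaker generally looks like a node failure to the surviving node, which is exactly the situation fencing exists for, so a stonith attempt is expected. A clean stop of the whole stack avoids it. Command names vary by platform and are illustrative only:

```shell
# Stop the whole cluster stack cleanly on node 2 instead of killing corosync:
pcs cluster stop        # pcs-based distributions
crm cluster stop        # crmsh-based distributions
# or stop pacemaker before corosync:
systemctl stop pacemaker && systemctl stop corosync
```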
>
>
>
> From:
> Reid Wahl
> To:
> Cluster Labs - All topics related to open-source clustering welcomed
> Date:
> 28 July 2020 12.03.46 CEST
> Subject:
> Re: [ClusterLabs] Antw: [EXT] Stonith failing
> Gabriele,
>
> "No route to host" is a somewhat generic error message when we can't
> find anyone to fence the node. It doesn't mean there's necessarily a
> network routing issue at fault; no need to focus on that error message.
>
> I agree with Ulrich about needing to know what the script does. But
> based on your initial message, it sounds like your custom fence agent
> returns 1 in response to "on" and "off" actions. Am I understanding
> correctly? If so, why does it behave that way? Pacemaker is trying to
> run a poweroff action based on the logs, so it needs your script to
> support an off action.
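A minimal sketch of the action dispatch Reid describes is below. This is not the poster's actual script; the peer hostname, the argv-based action interface (real agents usually read `action=...` from stdin), and the `RUN` dry-run knob are all illustrative assumptions:

```shell
#!/bin/sh
# Hedged sketch of an ssh-based fence agent that supports the off and
# reboot actions Pacemaker asks for, instead of returning 1 for everything.
PEER="${PEER:-xstha2}"   # hypothetical: the node to fence
RUN="${RUN:-ssh}"        # hypothetical knob: set RUN=echo to dry-run

fence() {
    case "$1" in
        reboot)  "$RUN" "root@$PEER" reboot ;;
        off)     "$RUN" "root@$PEER" poweroff ;;
        on)      # a halted node cannot be powered on over ssh, so this
                 # action honestly reports failure
                 return 1 ;;
        monitor|status)
                 "$RUN" "root@$PEER" true ;;
        *)       return 1 ;;
    esac
}
# a real agent would dispatch here, e.g.: fence "$1"
```

The key point from the thread: if "off" always returns 1, Pacemaker's poweroff request can never succeed, which matches the failure being discussed.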
> On Tue, Jul 28, 2020 at 2:47 AM Ulrich Windl
> Ulrich.Windl at rz.uni-regensburg.de
> wrote:
> Gabriele Bulfon
> gbulfon at sonicle.com
> wrote on 28.07.2020 at 10:56 in
> message
> :
> Hi, now I have my two nodes (xstha1 and xstha2) with IPs configured by
> Corosync.
> To check how stonith would work, I turned off the Corosync service on the
> second node.
> The first node attempts to stonith the 2nd node and take over its
> resources, but this fails.
> The stonith action is configured to run a custom script that issues ssh
> commands.
> I think you should explain what that script does exactly.
> [...]
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> ClusterLabs home:
> https://www.clusterlabs.org/
> --
> Regards,
> Reid Wahl, RHCA
> Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA