[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing

Reid Wahl nwahl at redhat.com
Thu Jul 30 05:04:09 EDT 2020


That appears to support IPMI, so fence_ipmilan is likely an option.
Further, it probably has a watchdog device. If so, then sbd is an option.
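
As a rough illustration of the IPMI option, here is a minimal Python sketch that only assembles a fence_ipmilan command line for a manual test; it does not run anything. The BMC address and credentials are placeholders, and the helper names (build_fence_ipmilan_cmd, watchdog_present) are made up for this example. The /dev/watchdog path is the usual Linux location and is an assumption here; it may differ or not exist on other platforms.

```python
import os
from typing import List

# Actions this sketch allows; the real agent supports more than these.
ALLOWED_ACTIONS = {"on", "off", "reboot", "status", "monitor"}


def build_fence_ipmilan_cmd(bmc_ip: str, username: str, password: str,
                            action: str = "status",
                            lanplus: bool = True) -> List[str]:
    """Assemble (but do not execute) a fence_ipmilan command line."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action in this sketch: {action!r}")
    cmd = [
        "fence_ipmilan",
        f"--ip={bmc_ip}",            # address of the node's BMC (placeholder)
        f"--username={username}",    # IPMI user (placeholder)
        f"--password={password}",    # IPMI password (placeholder)
    ]
    if lanplus:
        cmd.append("--lanplus")      # use the IPMI v2.0 lanplus interface
    cmd.append(f"--action={action}")
    return cmd


def watchdog_present(path: str = "/dev/watchdog") -> bool:
    """Return True if a watchdog device node exists at the assumed path."""
    return os.path.exists(path)
```

The list returned by build_fence_ipmilan_cmd("192.0.2.10", "admin", "secret", "status") could be passed to subprocess.run() from the surviving node to confirm that the BMC answers before configuring a stonith resource. Putting the password on the command line is only acceptable for a quick test.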

On Thu, Jul 30, 2020 at 2:00 AM Gabriele Bulfon <gbulfon at sonicle.com> wrote:

> It is this system:
>
> https://www.supermicro.com/products/system/1u/1029/SYS-1029TP-DC0R.cfm
>
> It has a SAS3 backplane with hot-swap SAS disks that are visible to both
> nodes at the same time.
>
> Gabriele
>
>
>
> *Sonicle S.r.l. *: http://www.sonicle.com
> *Music: *http://www.gabrielebulfon.com
> *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
>
>
>
>
> ----------------------------------------------------------------------------------
>
> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
> To: users at clusterlabs.org
> Date: 29 July 2020 15:15:17 CEST
> Subject: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing
>
> >>> Gabriele Bulfon <gbulfon at sonicle.com> wrote on 29.07.2020 at 14:18 in
> message <479956351.444.1596025101064 at www>:
> > Hi, it's a single controller, shared to both nodes, SM server.
>
> Do you mean an external controller, such as a NAS or SAN? I thought you were
> talking about an internal controller such as SCSI...
> I don't know what an "SM server" is.
>
> Regards,
> Ulrich
>
> >
> > Thanks!
> > Gabriele
> >
> >
> >
> > ----------------------------------------------------------------------------------
> > From: Ulrich Windl
> > To: users at clusterlabs.org
> > Date: 29 July 2020 9:26:39 CEST
> > Subject: [ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing
> >
> > Gabriele Bulfon wrote on 29.07.2020 at 08:01:
> > That one was taken from a specific implementation on Solaris 11.
> > The situation is a dual-node server with a shared storage controller: both
> > nodes see the same disks concurrently.
> >
> > You mean you have a dual-controller setup (one controller on each node,
> > both connected to the same bus)? If so, use sbd!
> >
> > Here we must be sure that the two nodes never import/mount the same zpool
> > at the same time, or we will encounter data corruption. Node 1 will be
> > preferred for pool 1 and node 2 for pool 2. Only if one of the nodes goes
> > down or is taken offline should its resources first be freed by the
> > leaving node and then taken over by the other node.
> > Would you suggest one of the available stonith agents in this case?
> > Thanks!
> > Gabriele
> >
> > ----------------------------------------------------------------------------------
> > From: Strahil Nikolov
> > To: Cluster Labs - All topics related to open-source clustering welcomed,
> > Gabriele Bulfon
> > Date: 29 July 2020 6:39:08 CEST
> > Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
> > Do you have a reason not to use one of the stonith agents already available?
> > Best Regards,
> > Strahil Nikolov
> > On 28 July 2020 at 13:26:52 GMT+03:00, Gabriele Bulfon wrote:
> > Thanks, I attach the script here.
> > It basically runs commands over ssh on the other node with no password
> > (this must be preconfigured via authorized keys).
> > This was taken from a script by OpenIndiana (I think).
> > As stated in the comments, we don't want to halt or boot via ssh,
> > only reboot.
> > Maybe this is the problem: we should at least have it shut down when
> > asked to.
> > Actually, if I stop corosync on node 2, I don't want it to shut down the
> > system but just let node 1 keep control of all resources.
> > The same applies if I just shut down node 2 manually:
> > node 1 should keep control of all resources and release them back on
> > reboot.
> > Instead, when I stopped corosync on node 2, the log showed an
> > attempt to stonith node 2: why?
> > Thanks!
> > Gabriele
> > From: Reid Wahl
> > To: Cluster Labs - All topics related to open-source clustering welcomed
> > Date: 28 July 2020 12:03:46 CEST
> > Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
> > Gabriele,
> > "No route to host" is a somewhat generic error message when we can't
> > find anyone to fence the node. It doesn't mean there's necessarily a
> > network routing issue at fault; no need to focus on that error message.
> > I agree with Ulrich about needing to know what the script does. But
> > based on your initial message, it sounds like your custom fence agent
> > returns 1 in response to "on" and "off" actions. Am I understanding
> > correctly? If so, why does it behave that way? Pacemaker is trying to
> > run a poweroff action based on the logs, so it needs your script to
> > support an off action.
> > On Tue, Jul 28, 2020 at 2:47 AM Ulrich Windl
> > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> > Gabriele Bulfon <gbulfon at sonicle.com> wrote on 28.07.2020 at 10:56:
> > Hi, now I have my two nodes (xstha1 and xstha2) with IPs configured by
> > Corosync.
> > To check how stonith would work, I turned off the Corosync service on the
> > second node.
> > The first node tries to stonith the second node and take over its
> > resources, but this fails.
> > The stonith action is configured to run a custom script that runs ssh
> > commands,
> >
> > I think you should explain what that script does exactly.
> > [...]
> > _______________________________________________
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > ClusterLabs home:
> > https://www.clusterlabs.org/
> > --
> > Regards,
> > Reid Wahl, RHCA
> > Software Maintenance Engineer, Red Hat
> > CEE - Platform Support Delivery - ClusterHA
>
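
Regarding the point in the quoted thread that the custom agent needs to support an off action: here is a minimal Python sketch of the action dispatch a fence agent could use. It assumes the agent receives its options as name=value lines on stdin and returns 0 for success, 1 for failure, and 2 from status when the target is off (the convention followed by the standard fence agents, as far as I know). FakePower, make_handlers, parse_options, and dispatch are hypothetical names for illustration, and FakePower is only a stand-in for real power control. This is not the script from the thread.

```python
from typing import Callable, Dict, Iterable


def parse_options(lines: Iterable[str]) -> Dict[str, str]:
    """Parse name=value lines (assumed to arrive on stdin) into a dict."""
    opts: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and anything that is not name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


class FakePower:
    """Hypothetical stand-in for real power control (IPMI, PDU, ...)."""

    def __init__(self) -> None:
        self.is_on = True

    def set_power(self, on: bool) -> int:
        self.is_on = on
        return 0  # 0 means the request succeeded


def make_handlers(power: FakePower) -> Dict[str, Callable[[], int]]:
    """Map each supported action name to a function returning an exit code."""

    def reboot() -> int:
        # Power off, then on; report failure if either step fails.
        if power.set_power(False) != 0:
            return 1
        return power.set_power(True)

    return {
        "on": lambda: power.set_power(True),
        "off": lambda: power.set_power(False),
        "reboot": reboot,
        "status": lambda: 0 if power.is_on else 2,  # 2 = reachable but off
        "monitor": lambda: 0,  # the (fake) device is always reachable
    }


def dispatch(opts: Dict[str, str],
             handlers: Dict[str, Callable[[], int]]) -> int:
    """Run the requested action; unknown actions fail with exit code 1."""
    action = opts.get("action", "reboot")  # default is an assumption
    handler = handlers.get(action)
    if handler is None:
        return 1
    return handler()
```

A real agent would finish with something like sys.exit(dispatch(parse_options(sys.stdin), handlers)) and would also have to answer the metadata action, which is left out here. The key point is that "off" and "on" return 0 only when the power state actually changed, rather than always returning 1.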


-- 
Regards,

Reid Wahl, RHCA
Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA