[Pacemaker] how to mount drive on SAN with pacemaker resource agent?

Mike Diehn mike.diehn at ansys.com
Thu Jan 6 11:23:17 EST 2011


Ah, now those are the questions I'm asking myself!
I hope someone else will step in and offer some advice.

In the simple case, we want to prevent two nodes from using a shared file
system without coordination.  If they lose contact with each other and can't
coordinate their access, one must die.  (Even if you make it hard for them
to lose contact by providing multiple communication channels, you have to
assume it will happen anyway, so we plan for this case.)

The worst case is when each node thinks it is healthy but can't coordinate
with its neighbor.  It's really easy to accidentally implement gunfight
STONITH: whichever node can fence (STONITH) its neighbor fastest gets to
live.  In my lab, when I got this wrong, the machines would shoot each
other over and over and over as they rebooted.

So, what are folks doing?


On Thu, Jan 6, 2011 at 10:45 AM, Michael Hittesdorf <
michael.hittesdorf at chicagotrading.com> wrote:

>  This is great information. Thanks.  I was wondering what criteria are used
> to determine that a ‘sick’ node should be killed? If it can’t be contacted
> over the network for some length of time? If the resources can’t be
> restarted on the box? What I’m most worried about is the scenario where my
> backup loses contact with the primary due to a network failure and the
> backup takes over even though the master is still running.  This would cause
> both nodes to mount my SAN attached storage and potentially corrupt it.
> I’ve actually forced this to happen by disconnecting the master’s network
> adapter on my test cluster.  I wound up with a split brain situation where
> both nodes were actively running.  Would a STONITH device kill the master if
> the master could not be contacted over the network? Or would the STONITH
> device indicate that the master was ok and prevent the unwanted failover
> from occurring and thus prevent the split brain scenario I just described?
>
>
>
> Thanks for all your help. It is much appreciated!
>
>
>
> Mick
>
>
>  ------------------------------
>
> *From:* Mike Diehn [mailto:mike.diehn at ansys.com]
> *Sent:* Thursday, January 06, 2011 9:16 AM
>
> *To:* The Pacemaker cluster resource manager
> *Subject:* Re: [Pacemaker] how to mount drive on SAN with
> pacemakerresourceagent?
>
>
>
>
>
> You want a STONITH tool that will let your nodes positively kill one
> another without needing to rely on the "sick" node for anything.  So the
> ideal solution is, yes, a networked power device: something that will let
> you power off the sick node remotely.
>
>
>
> Lacking that, you could use IPMI if your servers have BMCs.  Almost all
> server-class machines do today: things like Sun ILOM, Dell DRAC, and HP
> iLO.
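>
> As a quick manual sanity check (a sketch that assumes the ipmitool CLI is
> installed; the address and credentials are the same test values used in the
> config further down), you can confirm the BMC answers over the LAN
> interface before trusting it for fencing:
>
>     ipmitool -I lan -H 10.1.1.59 -U stonith -P ShootMeInTheHead chassis power status
>
> If that prints the chassis power state, the external/ipmi plugin should be
> able to reach the same BMC.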
>
>
>
> The modules and scripts in /usr/lib64/stonith/plugins will give you an idea
> of what's available already.
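>
> For example (assuming the cluster-glue stonith command-line tool is
> installed), you can list the available plugin types and ask one which
> configuration parameters it expects:
>
>     stonith -L                     # list all available STONITH device types
>     stonith -t external/ipmi -n    # show the parameter names for one type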
>
>
>
> Do try to resist the temptation to use ssh to issue a shutdown command.
>  That's really just not useful, and if you implement it, you'll check off
> 'implement stonith' on your list and move happily on thinking your shared
> file system is now safe.  When it isn't.
>
>
>
> Does that help?  Oh, one more thing: it took me an embarrassingly long time
> to discover that there is a "stonith" command and a bunch of related
> tools.  On my SLES 11 SP1 systems, with the HA Extension add-on, the
> stonith pieces came in as part of the RPM package cluster-glue-1.0.5-0.5.1.
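>
> (On an RPM-based system you can confirm which package owns the command and
> what else ships with it, for example:
>
>     rpm -qf $(which stonith)
>     rpm -ql cluster-glue | grep -i stonith
>
> Paths and package names may differ slightly between distributions.)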
>
>
>
> Best,
>
> Mike
>
>
>
> On Thu, Jan 6, 2011 at 9:53 AM, Michael Hittesdorf <
> michael.hittesdorf at chicagotrading.com> wrote:
>
> Thanks for your reply. I now have the Filesystem resource working on my
> test cluster. I’ve done some reading on STONITH as you suggested and am now
> wondering how I determine what STONITH devices are actually available on my
> servers and which one I should choose?  The recommendation I’ve read
> suggests the use of an external UPS that can be monitored over the network.
> Is this the best approach? Are there other STONITH devices that are commonly
> used? Why choose one over the other?
>
>
>
> Thanks in advance.  Mick
>
>
>  ------------------------------
>
> *From:* Mike Diehn [mailto:mike.diehn at ansys.com]
> *Sent:* Tuesday, January 04, 2011 2:54 PM
> *To:* The Pacemaker cluster resource manager
> *Subject:* Re: [Pacemaker] how to mount drive on SAN with pacemaker
> resourceagent?
>
>
>
>
>
> To make sure the failed server is actually dead, you want to use STONITH.
>  So read about that.  Here are examples from our testing cluster.  These are
> broken, so don't use them as they are.  That's why they are set to "Stopped"
> right now.  I probably have some timing stuff very wrong:
>
>
>
>
>
> primitive ShootLebekmfs1 stonith:external/ipmi \
>         meta target-role="Stopped" \
>         params hostname="lebekmfs1" ipaddr="10.1.1.59" userid="stonith" passwd="ShootMeInTheHead" interface="lan"
>
> primitive ShootLebekmfs2 stonith:external/ipmi \
>         meta target-role="Stopped" \
>         params hostname="lebekmfs2" ipaddr="10.1.1.61" userid="stonith" passwd="ShootMeInTheHead" interface="lan"
>
>
>
> You can use the ocf:heartbeat:Filesystem resource to mount any file system
> you can mount manually.  Here's one from a config in our test cluster.  This
> works:
>
>
>
> primitive lvTest ocf:heartbeat:Filesystem \
>         params device="/dev/EkmCluVG/lvTest" directory="/srv/test1" fstype="ocfs2" \
>         op monitor interval="10s" timeout="10s"
>
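> In an active-passive setup you'd typically tie the mount to the rest of the
> service as well; here's a minimal sketch (the IP address and resource names
> are made up for illustration, and IPaddr2 is just one common companion
> resource):
>
>     primitive vipTest ocf:heartbeat:IPaddr2 params ip="10.1.1.100" cidr_netmask="24"
>     group grpTest lvTest vipTest
>
> Resources in a group start in order and always run on the same node.
>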
>
>
> Make sure you remove the file system from your /etc/fstab if you're going
> to do it this way.  During testing, for my own convenience, I leave it in
> but add the noauto option to prevent it from being mounted at boot.
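>
> For example, a noauto entry for the test mount above might look like this
> (adjust the device and mount point to your own setup):
>
>     /dev/EkmCluVG/lvTest  /srv/test1  ocfs2  noauto  0 0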
>
>
>
> Best,
>
> Mike
>
>
>
> On Tue, Jan 4, 2011 at 2:05 PM, Michael Hittesdorf <
> michael.hittesdorf at chicagotrading.com> wrote:
>
> Can I use the Filesystem resource agent to mount a SAN drive in the event
> of a failover? How do I ensure that the failed server no longer has the
> drive mounted so as to prevent storage corruption? Having read several of
> the tutorials, I’m aware of DRBD and the clustered file systems GFS2 and
> OCFS2.  However, I don’t need simultaneous access to the disk from both of
> my cluster nodes. I just want to make the shared SAN storage available to
> the primary, active server only as my cluster is active-passive.  Is there a
> recommended way to accomplish this?
>
>
>
> Thanks for your help!
>
>
>
>
>
> --
> Mike Diehn
> Senior Systems Administrator
> ANSYS, Inc - Lebanon, NH Office
> mike.diehn at ansys.com, (603) 727-5492
>
>
>
>
>
> --
> Mike Diehn
> Senior Systems Administrator
> ANSYS, Inc - Lebanon, NH Office
> mike.diehn at ansys.com, (603) 727-5492
>
>
>


-- 
Mike Diehn
Senior Systems Administrator
ANSYS, Inc - Lebanon, NH Office
mike.diehn at ansys.com, (603) 727-5492