[ClusterLabs] How to fence cluster node when SAN filesystem fail

Wed May 3 02:20:06 UTC 2017

Hi Klaus,

Thank you for your quick reply.

Below is my crm_resource output:
# crm_resource --resource ora_fs --query-xml
ora_fs (ocf::heartbeat:Filesystem):    Started node2.albertlab.com
xml:
<primitive class="ocf" id="ora_fs" provider="heartbeat" type="Filesystem">
  <instance_attributes id="ora_fs-instance_attributes">
    <nvpair id="ora_fs-instance_attributes-device" name="device" value="/dev/mapper/volgroup-lvgroup"/>
    <nvpair id="ora_fs-instance_attributes-directory" name="directory" value="/oradata"/>
    <nvpair id="ora_fs-instance_attributes-fstype" name="fstype" value="ext4"/>
  </instance_attributes>
  <operations>
    <op id="ora_fs-start-interval-0s" interval="0s" name="start" timeout="60"/>
    <op id="ora_fs-stop-interval-0s" interval="0s" name="stop" timeout="60"/>
    <op id="ora_fs-monitor-interval-20" interval="20" name="monitor" timeout="40"/>
  </operations>
</primitive>

I checked the documentation on the 'on-fail' operation option, and you're
right: my Filesystem resource is behaving correctly. With the default
on-fail behavior (restart), it fails over to the other node and is restarted
there. So, if I want the resource to move to the other node and the failed
node itself to be fenced, is it correct to add the on-fail parameter as below?

# pcs resource op remove ora_fs monitor
# pcs resource op add ora_fs monitor interval=20 timeout=40 on-fail=fence
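
For reference, a rough sketch of how I would expect the operations block of
ora_fs to look in the CIB after such a change. This assumes pcs keeps the
existing op ids and only adds the on-fail attribute. The on-fail on the start
op is not set by the two commands above; it is shown only because Klaus
suggested it for start as well:

```xml
<operations>
  <!-- on-fail="fence" on start follows Klaus's suggestion; it would need its own pcs command -->
  <op id="ora_fs-start-interval-0s" interval="0s" name="start" on-fail="fence" timeout="60"/>
  <!-- stop is left unchanged: per Klaus, its default on-fail is already fence -->
  <op id="ora_fs-stop-interval-0s" interval="0s" name="stop" timeout="60"/>
  <!-- assumed id; pcs may generate a different one when the op is re-added -->
  <op id="ora_fs-monitor-interval-20" interval="20" name="monitor" on-fail="fence" timeout="40"/>
</operations>
```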

I'm curious why only my Filesystem resource does not trigger stonith,
whereas when the 'vip' resource fails, the host it is running on is
rebooted immediately (stonith: fence_ipmilan).

Here is my 'ora_vip' resource output:
# crm_resource --resource ora_vip --query-xml
ora_vip        (ocf::heartbeat:IPaddr2):       Started node2.albertlab.com
xml:
<primitive class="ocf" id="ora_vip" provider="heartbeat" type="IPaddr2">
  <instance_attributes id="ora_vip-instance_attributes">
    <nvpair id="ora_vip-instance_attributes-ip" name="ip" value="192.168.11.10"/>
    <nvpair id="ora_vip-instance_attributes-cidr_netmask" name="cidr_netmask" value="24"/>
  </instance_attributes>
  <operations>
    <op id="ora_vip-start-interval-0s" interval="0s" name="start" timeout="20s"/>
    <op id="ora_vip-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
    <op id="ora_vip-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
  </operations>
</primitive>

Thanks a lot.

On Tue, May 2, 2017 at 9:17 PM, Klaus Wenninger <kwenning at redhat.com> wrote:

> On 05/02/2017 02:57 PM, Ken Gaillot wrote:
> > Hi,
> >
> > Upstream documentation on fencing in Pacemaker is available at:
> >
> > http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139683949958512
> >
> > Higher-level tools such as crm shell and pcs make it easier; see their
> > man pages and other documentation for details.
> >
> >
> > On 05/01/2017 10:35 PM, Albert Weng wrote:
> >> Hi All,
> >>
> >> My environment :
> >> (1) two node (active/passive) pacemaker cluster
> >> (2) SAN storage attached, add resource type "filesystem"
> >> (3) OS : RHEL 7.2
> >>
> >> In older versions of RHEL cluster, when the attached SAN storage path was
> >> lost (e.g. a filesystem failure), the active node would trigger the fence
> >> device to reboot itself.
> >>
> >> But when I use Pacemaker on RHEL and remove the fiber cable on the
> >> active node, all resources fail over to the passive node normally, but the
> >> active node doesn't reboot.
>
> That is the default on-fail behavior of Pacemaker operations: restart
> (either on the node itself or on another node), except for stop, where
> the default is fence.
> Using on-fail=fence for start and monitor as well should give you the
> desired behavior, as I understand it from your description.
>
> Regards,
> Klaus
>
> >>
> >> How can I trigger a fence (reboot) action when the SAN filesystem is lost?
> >>
> >> Thanks a lot.
> >>
> >>
> >> --
> >> Kind regards,
> >> Albert Weng
> >>
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > http://lists.clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>

-- 
Kind regards,
Albert Weng