[ClusterLabs] Unfencing causes resource restarts

Pavel Levshin lpk at 581.spb.su
Tue Oct 11 12:06:07 UTC 2016


Hi!


In continuation of previous mails, I now have a more complex setup. Our 
hardware is capable of two STONITH methods: ILO and SCSI persistent 
reservations on shared storage. The first method works fine; nevertheless, 
in the past we have occasionally faced problems with inaccessible ILO 
devices. So we would like to have SCSI fencing as an additional method.
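
For reference, the SCSI devices could be wired in as a second fencing level 
with something like this (a sketch only, not what is currently configured; 
the level number 20 is an arbitrary choice, and the storage.* devices are 
the fence_mpath resources shown in the config below):

pcs stonith level add 20 bvnode1 storage.bvnode1
pcs stonith level add 20 bvnode2 storage.bvnode2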

The problem: when node 2 recovers, some resources are simply stopped and 
restarted on node 1. As far as I understand, primitive resources are 
affected, but clone instances are not.

In the example below, when bvnode2 recovers, vm_smartbv1 is restarted on 
bvnode1, while vm_smartbv2 is live-migrated to bvnode2 without interruption. 
All other resources are clones running on bvnode1, and they are unaffected.

If I set "meta requires=fencing" for vm resources, they are not 
restarted anymore. But why unfencing of bvnode2 affects resources 
running on bvnode1?
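
For clarity, that workaround amounts to roughly the following pcs commands 
(a sketch; resource names as in the config below):

pcs resource meta vm_smartbv1 requires=fencing
pcs resource meta vm_smartbv2 requires=fencing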


====

Current cluster status:

Online: [ bvnode1 bvnode2 ]

ilo.bvnode2     (stonith:fence_ilo4):   Started bvnode1
ilo.bvnode1     (stonith:fence_ilo4):   Stopped
Clone Set: dlm-clone [dlm]
      Started: [ bvnode1 ]
      Stopped: [ bvnode2 ]
Clone Set: clvmd-clone [clvmd]
      Started: [ bvnode1 ]
      Stopped: [ bvnode2 ]
Clone Set: cluster-config-clone [cluster-config]
      Started: [ bvnode1 ]
      Stopped: [ bvnode2 ]
vm_smartbv1     (ocf::heartbeat:VirtualDomain): Started bvnode1
vm_smartbv2     (ocf::heartbeat:VirtualDomain): Started bvnode1
Clone Set: libvirtd-clone [libvirtd]
      Started: [ bvnode1 ]
      Stopped: [ bvnode2 ]
storage.bvnode1 (stonith:fence_mpath):  Started bvnode1
storage.bvnode2 (stonith:fence_mpath):  Started bvnode1

Transition Summary:
* Start   ilo.bvnode1          (bvnode2)
* Start   dlm:1                (bvnode2)
* Start   clvmd:1              (bvnode2)
* Start   cluster-config:1     (bvnode2)
* Restart vm_smartbv1          (Started bvnode1)
* Migrate vm_smartbv2          (Started bvnode1 -> bvnode2)
* Start   libvirtd:1           (bvnode2)
* Move    storage.bvnode2      (Started bvnode1 -> bvnode2)
====
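
For reference, a status and transition summary in this format can be 
obtained by running crm_simulate against the live CIB on one of the 
cluster nodes, e.g.:

crm_simulate --simulate --live-check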

Cluster config:

====
Cluster Name: smartbvcluster
Corosync Nodes:
 bvnode1 bvnode2
Pacemaker Nodes:
 bvnode1 bvnode2

Resources:
Clone: dlm-clone
   Meta Attrs: interleave=true ordered=true
   Resource: dlm (class=ocf provider=pacemaker type=controld)
    Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
                stop interval=0s timeout=100 (dlm-stop-interval-0s)
                monitor interval=30s (dlm-monitor-interval-30s)
Clone: clvmd-clone
   Meta Attrs: interleave=true ordered=true
   Resource: clvmd (class=ocf provider=heartbeat type=clvm)
    Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
                stop interval=0s timeout=90 (clvmd-stop-interval-0s)
                monitor interval=30s (clvmd-monitor-interval-30s)
Clone: cluster-config-clone
   Meta Attrs: interleave=true
   Resource: cluster-config (class=ocf provider=heartbeat type=Filesystem)
    Attributes: device=/dev/vg_bv_shared/cluster-config directory=/opt/cluster-config fstype=gfs2 options=noatime
    Operations: start interval=0s timeout=60 (cluster-config-start-interval-0s)
                stop interval=0s timeout=60 (cluster-config-stop-interval-0s)
                monitor interval=10s on-fail=fence OCF_CHECK_LEVEL=20 (cluster-config-monitor-interval-10s)
Resource: vm_smartbv1 (class=ocf provider=heartbeat type=VirtualDomain)
   Attributes: config=/opt/cluster-config/libvirt/qemu/smartbv1.xml hypervisor=qemu:///system migration_transport=tcp
   Meta Attrs: allow-migrate=true
   Operations: start interval=0s timeout=90 (vm_smartbv1-start-interval-0s)
               stop interval=0s timeout=90 (vm_smartbv1-stop-interval-0s)
               monitor interval=10 timeout=30 (vm_smartbv1-monitor-interval-10)
Resource: vm_smartbv2 (class=ocf provider=heartbeat type=VirtualDomain)
   Attributes: config=/opt/cluster-config/libvirt/qemu/smartbv2.xml hypervisor=qemu:///system migration_transport=tcp
   Meta Attrs: target-role=started allow-migrate=true
   Operations: start interval=0s timeout=90 (vm_smartbv2-start-interval-0s)
               stop interval=0s timeout=90 (vm_smartbv2-stop-interval-0s)
               monitor interval=10 timeout=30 (vm_smartbv2-monitor-interval-10)
Clone: libvirtd-clone
   Meta Attrs: interleave=true
   Resource: libvirtd (class=systemd type=libvirtd)
    Operations: monitor interval=60s (libvirtd-monitor-interval-60s)

Stonith Devices:
Resource: ilo.bvnode2 (class=stonith type=fence_ilo4)
   Attributes: ipaddr=ilo.bvnode2 login=hacluster passwd=s pcmk_host_list=bvnode2 privlvl=operator
   Operations: monitor interval=60s (ilo.bvnode2-monitor-interval-60s)
Resource: ilo.bvnode1 (class=stonith type=fence_ilo4)
   Attributes: ipaddr=ilo.bvnode1 login=hacluster passwd=s pcmk_host_list=bvnode1 privlvl=operator
   Operations: monitor interval=60s (ilo.bvnode1-monitor-interval-60s)
Resource: storage.bvnode1 (class=stonith type=fence_mpath)
   Attributes: key=ab2ee06 pcmk_reboot_action=off devices=/dev/mapper/mpatha pcmk_host_check=static-list pcmk_host_list=bvnode1
   Meta Attrs: provides=unfencing
   Operations: monitor interval=60s (storage.bvnode1-monitor-interval-60s)
Resource: storage.bvnode2 (class=stonith type=fence_mpath)
   Attributes: key=ab2ee07 pcmk_reboot_action=off devices=/dev/mapper/mpatha pcmk_host_check=static-list pcmk_host_list=bvnode2
   Meta Attrs: provides=unfencing
   Operations: monitor interval=60s (storage.bvnode2-monitor-interval-60s)
Fencing Levels:
Node: bvnode1
   Level 10 - ilo.bvnode1
Node: bvnode2
   Level 10 - ilo.bvnode2

Location Constraints:
   Resource: ilo.bvnode1
     Disabled on: bvnode1 (score:-INFINITY) (id:location-ilo.bvnode1-bvnode1--INFINITY)
   Resource: ilo.bvnode2
     Disabled on: bvnode2 (score:-INFINITY) (id:location-ilo.bvnode2-bvnode2--INFINITY)
Ordering Constraints:
   start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
   start clvmd-clone then start cluster-config-clone (kind:Mandatory) (id:order-clvmd-clone-cluster-config-clone-mandatory)
   start cluster-config-clone then start libvirtd-clone (kind:Mandatory) (id:order-cluster-config-clone-libvirtd-clone-mandatory)
   stop vm_smartbv2 then stop libvirtd-clone (kind:Mandatory) (non-symmetrical) (id:order-vm_smartbv2-libvirtd-clone-mandatory)
   stop vm_smartbv1 then stop libvirtd-clone (kind:Mandatory) (non-symmetrical) (id:order-vm_smartbv1-libvirtd-clone-mandatory)
   start libvirtd-clone then start vm_smartbv2 (kind:Optional) (non-symmetrical) (id:order-libvirtd-clone-vm_smartbv2-Optional)
   start libvirtd-clone then start vm_smartbv1 (kind:Optional) (non-symmetrical) (id:order-libvirtd-clone-vm_smartbv1-Optional)
Colocation Constraints:
   clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
   cluster-config-clone with clvmd-clone (score:INFINITY) (id:colocation-cluster-config-clone-clvmd-clone-INFINITY)
   libvirtd-clone with cluster-config-clone (score:INFINITY) (id:colocation-libvirtd-clone-cluster-config-clone-INFINITY)
   vm_smartbv1 with libvirtd-clone (score:INFINITY) (id:colocation-vm_smartbv1-libvirtd-clone-INFINITY)
   vm_smartbv2 with libvirtd-clone (score:INFINITY) (id:colocation-vm_smartbv2-libvirtd-clone-INFINITY)

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: smartbvcluster
 dc-version: 1.1.13-10.el7_2.4-44eb2dd
 have-watchdog: false
 last-lrm-refresh: 1476099872
 maintenance-mode: false
 no-quorum-policy: freeze
 start-failure-is-fatal: false
 stonith-enabled: true

====
