[ClusterLabs] Unfencing cause resource restarts

Ken Gaillot kgaillot at redhat.com
Tue Oct 11 10:40:52 EDT 2016


On 10/11/2016 07:06 AM, Pavel Levshin wrote:
> Hi!
> 
> 
> In continuation of previous mails, I now have a more complex setup. Our
> hardware is capable of two STONITH methods: ILO and SCSI persistent
> reservations on shared storage. The first method works fine; nevertheless,
> in the past we sometimes faced problems with inaccessible ILO devices or
> something similar. So we would like to have SCSI fencing as an additional
> method.
> 
> The problem: when node 2 recovers, some resources are stopped and
> restarted on node 1. As far as I understand, primitive resources are
> affected, but clone instances are not.
> 
> In the example below, when bvnode2 recovers, vm_smartbv1 is restarted on
> bvnode1, and vm_smartbv2 is live-migrated without interruption to
> bvnode2. All other resources are clones working on bvnode1 and they are
> unaffected.
> 
> If I set "meta requires=fencing" for vm resources, they are not
> restarted anymore. But why does unfencing of bvnode2 affect resources
> running on bvnode1?

That does seem odd.

Something I notice in the config below is that only the ILO devices are
listed in the fence topology, and the only fence level is "10". Valid
indexes are 1 to 9, so this should have produced a log error about "Bad
topology".

If you want the storage fencing as a fallback in case ILO fails, you
want the devices in two levels, e.g. level 1 = ILO, level 2 = storage.
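
As a rough sketch of what that two-level layout could look like with pcs,
using the device and node names from the config below: the script prints
the commands by default (a dry run) rather than executing them. The RUN
variable and the plan_levels function are just illustrative names, and I'm
assuming your pcs version accepts "stonith level remove" for the existing
level 10 entries.

```shell
#!/bin/sh
# Dry-run sketch: by default the pcs commands are only printed.
# RUN and plan_levels are hypothetical names used for illustration.
# Run with RUN= (set but empty) to actually execute the commands.
RUN="${RUN-echo}"

plan_levels() {
    for node in bvnode1 bvnode2; do
        # Drop the out-of-range level 10 entry for this node
        $RUN pcs stonith level remove 10 "$node" "ilo.$node"
        # Level 1: try ILO first
        $RUN pcs stonith level add 1 "$node" "ilo.$node"
        # Level 2: fall back to storage (fence_mpath) fencing
        $RUN pcs stonith level add 2 "$node" "storage.$node"
    done
}

plan_levels
```

With levels 1 and 2 defined this way, the storage device is only tried for
a node if the ILO device at level 1 fails.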

> 
> ====
> 
> Current cluster status:
> Online: [ bvnode1 bvnode2 ]
>
> ilo.bvnode2    (stonith:fence_ilo4):   Started bvnode1
> ilo.bvnode1    (stonith:fence_ilo4):   Stopped
> Clone Set: dlm-clone [dlm]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
> Clone Set: clvmd-clone [clvmd]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
> Clone Set: cluster-config-clone [cluster-config]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
> vm_smartbv1    (ocf::heartbeat:VirtualDomain): Started bvnode1
> vm_smartbv2    (ocf::heartbeat:VirtualDomain): Started bvnode1
> Clone Set: libvirtd-clone [libvirtd]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
> storage.bvnode1        (stonith:fence_mpath):  Started bvnode1
> storage.bvnode2        (stonith:fence_mpath):  Started bvnode1
>
> Transition Summary:
> * Start   ilo.bvnode1  (bvnode2)
> * Start   dlm:1        (bvnode2)
> * Start   clvmd:1      (bvnode2)
> * Start   cluster-config:1     (bvnode2)
> * Restart vm_smartbv1  (Started bvnode1)
> * Migrate vm_smartbv2  (Started bvnode1 -> bvnode2)
> * Start   libvirtd:1   (bvnode2)
> * Move    storage.bvnode2      (Started bvnode1 -> bvnode2)
> ====
> 
> Cluster config:
> 
> ====
> Cluster Name: smartbvcluster
> Corosync Nodes:
> bvnode1 bvnode2
> Pacemaker Nodes:
> bvnode1 bvnode2
>
> Resources:
> Clone: dlm-clone
>   Meta Attrs: interleave=true ordered=true
>   Resource: dlm (class=ocf provider=pacemaker type=controld)
>    Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
>                stop interval=0s timeout=100 (dlm-stop-interval-0s)
>                monitor interval=30s (dlm-monitor-interval-30s)
> Clone: clvmd-clone
>   Meta Attrs: interleave=true ordered=true
>   Resource: clvmd (class=ocf provider=heartbeat type=clvm)
>    Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
>                stop interval=0s timeout=90 (clvmd-stop-interval-0s)
>                monitor interval=30s (clvmd-monitor-interval-30s)
> Clone: cluster-config-clone
>   Meta Attrs: interleave=true
>   Resource: cluster-config (class=ocf provider=heartbeat type=Filesystem)
>    Attributes: device=/dev/vg_bv_shared/cluster-config directory=/opt/cluster-config fstype=gfs2 options=noatime
>    Operations: start interval=0s timeout=60 (cluster-config-start-interval-0s)
>                stop interval=0s timeout=60 (cluster-config-stop-interval-0s)
>                monitor interval=10s on-fail=fence OCF_CHECK_LEVEL=20 (cluster-config-monitor-interval-10s)
> Resource: vm_smartbv1 (class=ocf provider=heartbeat type=VirtualDomain)
>   Attributes: config=/opt/cluster-config/libvirt/qemu/smartbv1.xml hypervisor=qemu:///system migration_transport=tcp
>   Meta Attrs: allow-migrate=true
>   Operations: start interval=0s timeout=90 (vm_smartbv1-start-interval-0s)
>               stop interval=0s timeout=90 (vm_smartbv1-stop-interval-0s)
>               monitor interval=10 timeout=30 (vm_smartbv1-monitor-interval-10)
> Resource: vm_smartbv2 (class=ocf provider=heartbeat type=VirtualDomain)
>   Attributes: config=/opt/cluster-config/libvirt/qemu/smartbv2.xml hypervisor=qemu:///system migration_transport=tcp
>   Meta Attrs: target-role=started allow-migrate=true
>   Operations: start interval=0s timeout=90 (vm_smartbv2-start-interval-0s)
>               stop interval=0s timeout=90 (vm_smartbv2-stop-interval-0s)
>               monitor interval=10 timeout=30 (vm_smartbv2-monitor-interval-10)
> Clone: libvirtd-clone
>   Meta Attrs: interleave=true
>   Resource: libvirtd (class=systemd type=libvirtd)
>    Operations: monitor interval=60s (libvirtd-monitor-interval-60s)
>
> Stonith Devices:
> Resource: ilo.bvnode2 (class=stonith type=fence_ilo4)
>   Attributes: ipaddr=ilo.bvnode2 login=hacluster passwd=s pcmk_host_list=bvnode2 privlvl=operator
>   Operations: monitor interval=60s (ilo.bvnode2-monitor-interval-60s)
> Resource: ilo.bvnode1 (class=stonith type=fence_ilo4)
>   Attributes: ipaddr=ilo.bvnode1 login=hacluster passwd=s pcmk_host_list=bvnode1 privlvl=operator
>   Operations: monitor interval=60s (ilo.bvnode1-monitor-interval-60s)
> Resource: storage.bvnode1 (class=stonith type=fence_mpath)
>   Attributes: key=ab2ee06 pcmk_reboot_action=off devices=/dev/mapper/mpatha pcmk_host_check=static-list pcmk_host_list=bvnode1
>   Meta Attrs: provides=unfencing
>   Operations: monitor interval=60s (storage.bvnode1-monitor-interval-60s)
> Resource: storage.bvnode2 (class=stonith type=fence_mpath)
>   Attributes: key=ab2ee07 pcmk_reboot_action=off devices=/dev/mapper/mpatha pcmk_host_check=static-list pcmk_host_list=bvnode2
>   Meta Attrs: provides=unfencing
>   Operations: monitor interval=60s (storage.bvnode2-monitor-interval-60s)
> Fencing Levels:
>
> Node: bvnode1
>   Level 10 - ilo.bvnode1
> Node: bvnode2
>   Level 10 - ilo.bvnode2
> 
> Location Constraints:
>   Resource: ilo.bvnode1
>     Disabled on: bvnode1 (score:-INFINITY) (id:location-ilo.bvnode1-bvnode1--INFINITY)
>   Resource: ilo.bvnode2
>     Disabled on: bvnode2 (score:-INFINITY) (id:location-ilo.bvnode2-bvnode2--INFINITY)
> Ordering Constraints:
>   start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
>   start clvmd-clone then start cluster-config-clone (kind:Mandatory) (id:order-clvmd-clone-cluster-config-clone-mandatory)
>   start cluster-config-clone then start libvirtd-clone (kind:Mandatory) (id:order-cluster-config-clone-libvirtd-clone-mandatory)
>   stop vm_smartbv2 then stop libvirtd-clone (kind:Mandatory) (non-symmetrical) (id:order-vm_smartbv2-libvirtd-clone-mandatory)
>   stop vm_smartbv1 then stop libvirtd-clone (kind:Mandatory) (non-symmetrical) (id:order-vm_smartbv1-libvirtd-clone-mandatory)
>   start libvirtd-clone then start vm_smartbv2 (kind:Optional) (non-symmetrical) (id:order-libvirtd-clone-vm_smartbv2-Optional)
>   start libvirtd-clone then start vm_smartbv1 (kind:Optional) (non-symmetrical) (id:order-libvirtd-clone-vm_smartbv1-Optional)
> Colocation Constraints:
>   clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
>   cluster-config-clone with clvmd-clone (score:INFINITY) (id:colocation-cluster-config-clone-clvmd-clone-INFINITY)
>   libvirtd-clone with cluster-config-clone (score:INFINITY) (id:colocation-libvirtd-clone-cluster-config-clone-INFINITY)
>   vm_smartbv1 with libvirtd-clone (score:INFINITY) (id:colocation-vm_smartbv1-libvirtd-clone-INFINITY)
>   vm_smartbv2 with libvirtd-clone (score:INFINITY) (id:colocation-vm_smartbv2-libvirtd-clone-INFINITY)
>
> Resources Defaults:
> No defaults set
> Operations Defaults:
> No defaults set
>
> Cluster Properties:
> cluster-infrastructure: corosync
> cluster-name: smartbvcluster
> dc-version: 1.1.13-10.el7_2.4-44eb2dd
> have-watchdog: false
> last-lrm-refresh: 1476099872
> maintenance-mode: false
> no-quorum-policy: freeze
> start-failure-is-fatal: false
> stonith-enabled: true



