[ClusterLabs] resource on remote node not failing over

Simon Lawrence simon.lawrence at navaho.co.uk
Wed Mar 11 11:13:31 UTC 2015


Hi

I have a two-node cluster (CentOS 6.6 / Pacemaker 1.1.12 / CMAN / Corosync 1.4.7).

It is an asymmetric cluster (symmetric-cluster: false).

There are two VMs (running CentOS) in the cluster acting as remote nodes (lx16mx & lx17mx). These are on DRBD backing storage. Other VMs (running other services) will be added in due course.

The BIND named service is installed and configured on both remote nodes, starts up cleanly on either one, and is managed by an RA (ocf:heartbeat:named).
I have location constraints allowing the resource to run on these nodes, and I can move the resource between them.

However, if I break the named service on the live node (by renaming the main config file and shutting down the service), the resource does not fail over to the other remote node.
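
For reference, this is more or less what I run on the live remote node to break it (assuming the stock CentOS path of /etc/named.conf; adjust if yours differs):

     mv /etc/named.conf /etc/named.conf.bak   # named can no longer start
     service named stop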

As an aside, is there any difference in the effect of these two commands?
     pcs constraint location add location_drbd-lx16-node1 drbd-lx16 ukdenavhost1 500
     pcs constraint location drbd-lx16 rule score=500 \#uname eq "ukdenavhost1"

pcs config and status attached. I can send logs if required.
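
If it would help with the failover question, I can also capture the allocation scores and the failcount for the dns resource, e.g.:

     crm_simulate -sL                  # scores from the live cluster state
     pcs resource failcount show dns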

Thanks.
-------------- next part --------------
Cluster Name: cluster1
Corosync Nodes:
 ukdenavhost1 ukdenavhost2 
Pacemaker Nodes:
 ukdenavhost1 ukdenavhost2 

Resources: 
 Master: drbd-lx16-clone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true 
  Resource: drbd-lx16 (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=lx16 
   Operations: start interval=0s timeout=240 (drbd-lx16-start-timeout-240)
               promote interval=0s timeout=90 (drbd-lx16-promote-timeout-90)
               demote interval=0s timeout=90 (drbd-lx16-demote-timeout-90)
               stop interval=0s timeout=100 (drbd-lx16-stop-timeout-100)
               monitor interval=60s (drbd-lx16-monitor-interval-60s)
 Resource: vm-lx16mx (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system config=/shared/xml/lx16mx.xml 
  Meta Attrs: allow-migration=true remote-node=lx16mx 
  Operations: start interval=0s timeout=90 (vm-lx16mx-start-timeout-90)
              stop interval=0s timeout=90 (vm-lx16mx-stop-timeout-90)
              monitor interval=30 (vm-lx16mx-monitor-interval-30)
 Master: drbd-lx17-clone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true 
  Resource: drbd-lx17 (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=lx17 
   Operations: start interval=0s timeout=240 (drbd-lx17-start-timeout-240)
               promote interval=0s timeout=90 (drbd-lx17-promote-timeout-90)
               demote interval=0s timeout=90 (drbd-lx17-demote-timeout-90)
               stop interval=0s timeout=100 (drbd-lx17-stop-timeout-100)
               monitor interval=60s (drbd-lx17-monitor-interval-60s)
 Resource: vm-lx17mx (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system config=/shared/xml/lx17mx.xml 
  Meta Attrs: remote-node=lx17mx allow-migration=true 
  Operations: start interval=0s timeout=90 (vm-lx17mx-start-timeout-90)
              stop interval=0s timeout=90 (vm-lx17mx-stop-timeout-90)
              monitor interval=30 (vm-lx17mx-monitor-interval-30)
 Resource: dns (class=ocf provider=heartbeat type=named)
  Attributes: monitor_request=localhost monitor_response=127.0.0.1 
  Meta Attrs: migration-threshold=3 failure-timeout=60 
  Operations: start interval=0s timeout=60 (dns-start-timeout-60)
              stop interval=0s timeout=60 (dns-stop-timeout-60)
              monitor on-fail=restart interval=60s (dns-monitor-on-fail-restart)
 Resource: IP-mx (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.0.64.140 cidr_netmask=24 
  Operations: start interval=0s timeout=20s (IP-mx-start-timeout-20s)
              stop interval=0s timeout=20s (IP-mx-stop-timeout-20s)
              monitor on-fail=restart interval=30s (IP-mx-monitor-interval-30s)

Stonith Devices: 
Fencing Levels: 

Location Constraints:
  Resource: IP-mx
    Enabled on: lx16mx (score:50) (id:loc_1_IP-mx)
    Enabled on: lx17mx (score:50) (id:loc_2_IP-mx)
  Resource: dns
    Constraint: location-dns-1
      Rule: score=500  (id:location-dns-1-rule) 
        Expression: #uname eq lx17mx  (id:location-dns-1-rule-expr-1) 
    Constraint: location-dns
      Rule: score=500  (id:location-dns-rule) 
        Expression: #uname eq lx16mx  (id:location-dns-rule-expr-1) 
  Resource: drbd-lx16
    Constraint: location-drbd-lx16-1
      Rule: score=500  (id:location-drbd-lx16-1-rule) 
        Expression: #uname eq ukdenavhost2  (id:location-drbd-lx16-1-rule-expr-1) 
    Constraint: location-drbd-lx16
      Rule: score=500  (id:location-drbd-lx16-rule) 
        Expression: #uname eq ukdenavhost1  (id:location-drbd-lx16-rule-expr-1) 
  Resource: drbd-lx17
    Constraint: location-drbd-lx17
      Rule: score=500  (id:location-drbd-lx17-rule) 
        Expression: #uname eq ukdenavhost1  (id:location-drbd-lx17-rule-expr-1) 
    Constraint: location-drbd-lx17-1
      Rule: score=500  (id:location-drbd-lx17-1-rule) 
        Expression: #uname eq ukdenavhost2  (id:location-drbd-lx17-1-rule-expr-1) 
  Resource: vm-lx16mx
    Enabled on: ukdenavhost1 (score:50) (id:location-vm-lx16mx-ukdenavhost1-50)
    Enabled on: ukdenavhost2 (score:50) (id:loc-lx16mx)
  Resource: vm-lx17mx
    Enabled on: ukdenavhost1 (score:50) (id:location-vm-lx17mx-ukdenavhost1-50)
    Enabled on: ukdenavhost2 (score:50) (id:loc-lx17mx)
Ordering Constraints:
  promote drbd-lx16-clone then start vm-lx16mx (kind:Mandatory) (id:order-drbd-lx16-clone-vm-lx16mx-mandatory)
  promote drbd-lx17-clone then start vm-lx17mx (kind:Mandatory) (id:order-drbd-lx17-clone-vm-lx17mx-mandatory)
Colocation Constraints:
  vm-lx16mx with drbd-lx16-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-vm-lx16mx-drbd-lx16-clone-INFINITY)
  vm-lx17mx with drbd-lx17-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-vm-lx17mx-drbd-lx17-clone-INFINITY)
  IP-mx with dns (score:INFINITY) (id:colocation-IP-mx-dns-INFINITY)

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.11-97629de
 last-lrm-refresh: 1426069927
 maintenance-mode: false
 no-quorum-policy: ignore
 stonith-enabled: false
 symmetric-cluster: false
-------------- next part --------------
Cluster name: cluster1
Last updated: Wed Mar 11 11:03:39 2015
Last change: Wed Mar 11 10:49:36 2015
Stack: cman
Current DC: ukdenavhost2 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
10 Resources configured


Online: [ ukdenavhost1 ukdenavhost2 ]
Containers: [ lx16mx:vm-lx16mx lx17mx:vm-lx17mx ]

Full list of resources:

 Master/Slave Set: drbd-lx16-clone [drbd-lx16]
     Masters: [ ukdenavhost2 ]
     Slaves: [ ukdenavhost1 ]
 vm-lx16mx	(ocf::heartbeat:VirtualDomain):	Started ukdenavhost2 
 Master/Slave Set: drbd-lx17-clone [drbd-lx17]
     Masters: [ ukdenavhost2 ]
     Slaves: [ ukdenavhost1 ]
 vm-lx17mx	(ocf::heartbeat:VirtualDomain):	Started ukdenavhost2 
 dns	(ocf::heartbeat:named):	FAILED lx16mx 
 IP-mx	(ocf::heartbeat:IPaddr2):	Started lx16mx 

Failed actions:
    dns_start_0 on lx16mx 'unknown error' (1): call=45547, status=complete, last-rc-change='Wed Mar 11 11:03:21 2015', queued=0ms, exec=74ms



