[ClusterLabs] Auto-failback prevention

Thu Feb 20 11:15:38 EST 2020

Hi folks,

My test system consists of two nodes running CentOS 7 and DRBD. The
Pacemaker configuration below results in a system with a preferred
master, bd3c7, and a slave, bd4c7. A set of resources fails over from
one node to the other and later fails back, within about two seconds
either way, depending on the availability of the master. So far so good.

However, I would now like to prevent those automatic failbacks. The
two nodes are completely equal anyway, and each failback just doubles
the time that the service is unavailable. Besides, if bd3c7 starts
suffering from intermittent connectivity problems, failbacks could
make things worse.

I have tried setting the meta option "resource-stickiness" to 100, but  
this seems to have no effect. I suspect it is due to the presence of  
"cli-prefer-drbd-master", but without this location constraint (which  
is created by the "pcs resource move" command) the resources just stop  
and never fail over when the node goes down.
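
That suspicion is consistent with how scores compare. Pacemaker treats
INFINITY as 1,000,000, so a stickiness of 100 on the node currently
running the resources cannot outweigh an INFINITY location preference
for bd3c7. The sketch below only illustrates that comparison for two
nodes. It is not Pacemaker's allocation algorithm: it ignores the
promotion scores set by the DRBD agent and the effect of colocated
resources, and the function name pick_node is hypothetical.

```shell
#!/bin/sh
# Simplified illustration only; not Pacemaker's actual scoring code.
INFINITY=1000000   # Pacemaker defines INFINITY as 1,000,000

# pick_node CURRENT_NODE STICKINESS PREFERRED_NODE LOCATION_SCORE
# Prints the node that wins a plain comparison between the stickiness
# of the current node and the location score of the preferred node.
pick_node() {
    current=$1; stickiness=$2; preferred=$3; location=$4
    if [ "$current" = "$preferred" ]; then
        # Already on the preferred node: nothing to compare.
        echo "$current"
        return
    fi
    if [ "$location" -gt "$stickiness" ]; then
        echo "$preferred"   # location preference wins: failback
    else
        echo "$current"     # stickiness wins: resources stay put
    fi
}

pick_node bd4c7 100 bd3c7 "$INFINITY"   # prints bd3c7
pick_node bd4c7 100 bd3c7 50            # prints bd4c7
```

Under these assumptions, stickiness only holds the resources on bd4c7
when the location score for bd3c7 is finite and lower than it.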

So, how should the configuration below be altered to result only in  
failovers, while preventing failbacks from ever happening?

Thanks,

Jaap Winius

############# config ###################

[root@bd4c7 ~]# pcs resource create drbd ocf:linbit:drbd drbd_resource=r0 op monitor interval=60s ; \
     pcs resource master drbd master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true ; \
     pcs resource create mount Filesystem device="/dev/drbd0" directory="/data" fstype="ext4" ; \
     pcs constraint colocation add mount with drbd-master INFINITY with-rsc-role=Master ; \
     pcs constraint order promote drbd-master then mount ; \
     pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.2.73 cidr_netmask=24 op monitor interval=30s ; \
     pcs constraint colocation add vip with drbd-master INFINITY with-rsc-role=Master ; \
     pcs constraint order mount then vip ; \
     pcs resource create nfsd nfsserver nfs_shared_infodir=/data ; \
     pcs resource create nfscfg exportfs clientspec="192.168.2.55" options=rw,no_subtree_check,no_root_squash directory=/data fsid=0 ; \
     pcs constraint colocation add nfsd with vip ; \
     pcs constraint colocation add nfscfg with nfsd ; \
     pcs constraint order vip then nfsd ; \
     pcs constraint order nfsd then nfscfg ; \
     pcs resource move drbd-master --master bd3c7

[root@bd4c7 ~]# pcs config
Cluster Name: fscluster
Corosync Nodes:
   bd3c7 bd4c7
Pacemaker Nodes:
   bd3c7 bd4c7

Resources:
   Master: drbd-master
    Meta Attrs: clone-max=2 clone-node-max=1 master-max=1 master-node-max=1 notify=true
    Resource: drbd (class=ocf provider=linbit type=drbd)
     Attributes: drbd_resource=r0
     Operations: demote interval=0s timeout=90 (drbd-demote-interval-0s)
                 monitor interval=60s (drbd-monitor-interval-60s)
                 notify interval=0s timeout=90 (drbd-notify-interval-0s)
                 promote interval=0s timeout=90 (drbd-promote-interval-0s)
                 reload interval=0s timeout=30 (drbd-reload-interval-0s)
                 start interval=0s timeout=240 (drbd-start-interval-0s)
                 stop interval=0s timeout=100 (drbd-stop-interval-0s)
   Resource: mount (class=ocf provider=heartbeat type=Filesystem)
    Attributes: device=/dev/drbd0 directory=/data fstype=ext4
    Operations: monitor interval=20s timeout=40s (mount-monitor-interval-20s)
                notify interval=0s timeout=60s (mount-notify-interval-0s)
                start interval=0s timeout=60s (mount-start-interval-0s)
                stop interval=0s timeout=60s (mount-stop-interval-0s)
   Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
    Attributes: cidr_netmask=24 ip=192.168.2.73
    Operations: monitor interval=30s (vip-monitor-interval-30s)
                start interval=0s timeout=20s (vip-start-interval-0s)
                stop interval=0s timeout=20s (vip-stop-interval-0s)
   Resource: nfsd (class=ocf provider=heartbeat type=nfsserver)
    Attributes: nfs_shared_infodir=/data
    Operations: monitor interval=10s timeout=20s (nfsd-monitor-interval-10s)
                start interval=0s timeout=40s (nfsd-start-interval-0s)
                stop interval=0s timeout=20s (nfsd-stop-interval-0s)
   Resource: nfscfg (class=ocf provider=heartbeat type=exportfs)
    Attributes: clientspec=192.168.2.55 directory=/data fsid=0 options=rw,no_subtree_check,no_root_squash
    Operations: monitor interval=10s timeout=20s (nfscfg-monitor-interval-10s)
                start interval=0s timeout=40s (nfscfg-start-interval-0s)
                stop interval=0s timeout=120s (nfscfg-stop-interval-0s)

Stonith Devices:
Fencing Levels:

Location Constraints:
    Resource: drbd-master
      Enabled on: bd3c7 (score:INFINITY) (role: Master) (id:cli-prefer-drbd-master)
Ordering Constraints:
    promote drbd-master then start mount (kind:Mandatory) (id:order-drbd-master-mount-mandatory)
    start mount then start vip (kind:Mandatory) (id:order-mount-vip-mandatory)
    start vip then start nfsd (kind:Mandatory) (id:order-vip-nfsd-mandatory)
    start nfsd then start nfscfg (kind:Mandatory) (id:order-nfsd-nfscfg-mandatory)
Colocation Constraints:
    mount with drbd-master (score:INFINITY) (with-rsc-role:Master) (id:colocation-mount-drbd-master-INFINITY)
    vip with drbd-master (score:INFINITY) (with-rsc-role:Master) (id:colocation-vip-drbd-master-INFINITY)
    nfsd with vip (score:INFINITY) (id:colocation-nfsd-vip-INFINITY)
    nfscfg with nfsd (score:INFINITY) (id:colocation-nfscfg-nfsd-INFINITY)
Ticket Constraints:

Alerts:
   No alerts defined

Resources Defaults:
   No defaults set
Operations Defaults:
   No defaults set

Cluster Properties:
   cluster-infrastructure: corosync
   cluster-name: fscluster
   dc-version: 1.1.20-5.el7_7.2-3c4c782f70
   have-watchdog: false
   no-quorum-policy: ignore
   stonith-enabled: false

Quorum:
    Options:

[root@bd4c7 ~]# pcs status
Cluster name: fscluster
Stack: corosync
Current DC: bd3c7 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
Last updated: Thu Feb 20 15:22:46 2020
Last change: Thu Feb 20 15:14:33 2020 by root via crm_resource on bd4c7

2 nodes configured
6 resources configured

Online: [ bd3c7 bd4c7 ]

Full list of resources:

   Master/Slave Set: drbd-master [drbd]
       Masters: [ bd3c7 ]
       Slaves: [ bd4c7 ]
   mount	(ocf::heartbeat:Filesystem):	Started bd3c7
   vip	(ocf::heartbeat:IPaddr2):	Started bd3c7
   nfsd	(ocf::heartbeat:nfsserver):	Started bd3c7
   nfscfg	(ocf::heartbeat:exportfs):	Started bd3c7

Failed Resource Actions:
* mount_monitor_20000 on bd4c7 'unknown error' (1): call=27, status=Error, exitreason='',
      last-rc-change='Thu Feb 20 15:14:25 2020', queued=0ms, exec=0ms

Daemon Status:
    corosync: active/disabled
    pacemaker: active/disabled
    pcsd: active/disabled

[root@bd4c7 ~]#

########################################
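
Because the cli-prefer-drbd-master constraint in the configuration above
carries an INFINITY score, one experiment is to drop it and set a default
stickiness instead. The sketch below only prints the pcs commands for
that experiment unless DRY_RUN=0 is set. The run wrapper and the
drop_move_constraint function are hypothetical helpers, and the sketch
assumes the pcs 0.9 syntax shipped with CentOS 7. It is not a confirmed
fix: as described above, the resources stopped failing over without the
constraint, so that behaviour would still need to be investigated
separately.

```shell
#!/bin/sh
# Sketch only: print (default) or execute (DRY_RUN=0) the pcs commands.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: echo the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

drop_move_constraint() {
    # List all constraints with their ids, including cli-prefer-drbd-master.
    run pcs constraint --full
    # Remove the constraint left behind by "pcs resource move".
    run pcs resource clear drbd-master
    # Prefer keeping resources where they currently run.
    run pcs resource defaults resource-stickiness=100
}

drop_move_constraint
```

Running it as-is changes nothing on the cluster; it lists the three
commands so they can be reviewed before being run for real.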