[ClusterLabs] Resources always return to original node

Sat Sep 26 09:26:52 EDT 2020

The resource stickiness of a group is the sum of its members' stickiness values: for example, 5 resources x 100 (the default stickiness you set) = a score of 500.
If a location constraint has a higher score than that, it wins :)
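
To make the arithmetic above concrete, here is a rough sketch in Python. It is only a simplified model, not how Pacemaker actually computes placement (the real scheduler also weighs colocation, promotion scores, INFINITY and more). The function names and the location scores 1000 and 200 are made up for illustration.

```python
# Simplified model: stickiness is credited only to the node where the
# group currently runs; a location constraint adds its score to its node.

def group_stickiness(member_stickiness):
    """Effective stickiness of a group: the sum over its members."""
    return sum(member_stickiness)


def preferred_node(current_node, other_node, stickiness, location_scores):
    """Return the node with the higher total score (ties stay put).

    location_scores maps node name -> location constraint score
    (hypothetical values, not taken from the cluster in this thread).
    """
    current_total = location_scores.get(current_node, 0) + stickiness
    other_total = location_scores.get(other_node, 0)
    return other_node if other_total > current_total else current_node


stickiness = group_stickiness([100] * 5)  # 5 members x 100 = 500
print(stickiness)

# Location preference for node1 larger than the stickiness: group moves back.
print(preferred_node("node2", "node1", stickiness, {"node1": 1000}))

# Location preference smaller than the stickiness: group stays on node2.
print(preferred_node("node2", "node1", stickiness, {"node1": 200}))
```

So in this model the group only stays where it is if no location constraint for the recovered node outweighs the summed stickiness.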

Best Regards,
Strahil Nikolov

On Saturday, 26 September 2020, 12:22:32 GMT+3, Michael Ivanov <ivans at logit-ag.de> wrote:

Hello,

I have a strange problem: when I reset the node on which my resources are running, they are correctly migrated to the other node. But when I bring the failed node back, all resources move back to it as soon as it is up. I have set the resource-stickiness default to 100. When this did not help, I also set the resource-stickiness meta attribute to 100 on all my resources. Still, when the failed node recovers, the resources are migrated back to it! Where should I look to understand this situation?

Here's the configuration of my cluster:

root@node1# pcs status
Cluster name: gcluster
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.0.4-2deceaa3ae) - partition with quorum
  * Last updated: Sat Sep 26 11:12:34 2020
  * Last change:  Sat Sep 26 10:39:16 2020 by root via cibadmin on node1
  * 2 nodes configured
  * 14 resource instances configured (1 DISABLED)

Node List:
  * Online: [ node1 node2 ]

Full List of Resources:
  * ilo5_node1    (stonith:fence_ilo5_ssh):     Started node2
  * ilo5_node2    (stonith:fence_ilo5_ssh):     Started node1
  * Resource Group: VirtIP:
    * PrimaryIP    (ocf::heartbeat:IPaddr2):     Started node2
    * PrimaryIP6    (ocf::heartbeat:IPv6addr):     Started node2
    * AliasIP    (ocf::heartbeat:IPaddr2):     Started node2
  * BackupFS    (ocf::redhat:netfs.sh):     Started node2
  * Clone Set: MailVolume-clone [MailVolume] (promotable):
    * Masters: [ node2 ]
    * Slaves: [ node1 ]
  * MailFS    (ocf::heartbeat:Filesystem):     Started node2
  * apache    (ocf::heartbeat:apache):     Started node2
  * postfix    (ocf::heartbeat:postfix):     Started node2
  * amavis    (service:amavis):     Started node2
  * dovecot    (service:dovecot):     Started node2
  * openvpn    (service:openvpn):     Stopped (disabled)

And resources:

root@node1# pcs resource config
 Group: VirtIP
  Meta Attrs: resource-stickiness=100
  Resource: PrimaryIP (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=16 ip=xx.xx.xx.20 nic=br0
   Meta Attrs: resource-stickiness=100
   Operations: monitor interval=30s (PrimaryIP-monitor-interval-30s)
               start interval=0s timeout=20s (PrimaryIP-start-interval-0s)
               stop interval=0s timeout=20s (PrimaryIP-stop-interval-0s)
  Resource: PrimaryIP6 (class=ocf provider=heartbeat type=IPv6addr)
   Attributes: cidr_netmask=64 ipv6addr=xxxx:xxxx:xxxx:xxxx:0:0:0:20 nic=br0
   Meta Attrs: resource-stickiness=100
   Operations: monitor interval=30s (PrimaryIP6-monitor-interval-30s)
               start interval=0s timeout=15s (PrimaryIP6-start-interval-0s)
               stop interval=0s timeout=15s (PrimaryIP6-stop-interval-0s)
  Resource: AliasIP (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=16 ip=xx.xx.yy.20 nic=br0
   Meta Attrs: resource-stickiness=100
   Operations: monitor interval=30s (AliasIP-monitor-interval-30s)
               start interval=0s timeout=20s (AliasIP-start-interval-0s)
               stop interval=0s timeout=20s (AliasIP-stop-interval-0s)
 Resource: BackupFS (class=ocf provider=redhat type=netfs.sh)
  Attributes: export=/Backup/Gateway fstype=nfs host=atlas mountpoint=/Backup options=noatime,async
  Meta Attrs: resource-stickiness=100
  Operations: monitor interval=1m timeout=10 (BackupFS-monitor-interval-1m)
              monitor interval=5m timeout=30 OCF_CHECK_LEVEL=10 (BackupFS-monitor-interval-5m)
              monitor interval=10m timeout=30 OCF_CHECK_LEVEL=20 (BackupFS-monitor-interval-10m)
              start interval=0s timeout=900 (BackupFS-start-interval-0s)
              stop interval=0s timeout=30 (BackupFS-stop-interval-0s)
 Clone: MailVolume-clone
  Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 resource-stickiness=100
  Resource: MailVolume (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=mail
   Meta Attrs: resource-stickiness=100
   Operations: demote interval=0s timeout=90 (MailVolume-demote-interval-0s)
               monitor interval=60s (MailVolume-monitor-interval-60s)
               notify interval=0s timeout=90 (MailVolume-notify-interval-0s)
               promote interval=0s timeout=90 (MailVolume-promote-interval-0s)
               reload interval=0s timeout=30 (MailVolume-reload-interval-0s)
               start interval=0s timeout=240 (MailVolume-start-interval-0s)
               stop interval=0s timeout=100 (MailVolume-stop-interval-0s)
 Resource: MailFS (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd0 directory=/var/mail fstype=btrfs
  Meta Attrs: resource-stickiness=100
  Operations: monitor interval=20s timeout=40s (MailFS-monitor-interval-20s)
              start interval=0s timeout=60s (MailFS-start-interval-0s)
              stop interval=0s timeout=60s (MailFS-stop-interval-0s)
 Resource: apache (class=ocf provider=heartbeat type=apache)
  Attributes: client=wget statusurl=https://localhost/server-status
  Meta Attrs: resource-stickiness=100
  Operations: monitor interval=1min (apache-monitor-interval-1min)
              start interval=0s timeout=40s (apache-start-interval-0s)
              stop interval=0s timeout=60s (apache-stop-interval-0s)
 Resource: postfix (class=ocf provider=heartbeat type=postfix)
  Meta Attrs: resource-stickiness=100
  Operations: monitor interval=60s timeout=20s (postfix-monitor-interval-60s)
              reload interval=0s timeout=20s (postfix-reload-interval-0s)
              start interval=0s timeout=20s (postfix-start-interval-0s)
              stop interval=0s timeout=20s (postfix-stop-interval-0s)
 Resource: amavis (class=service type=amavis)
  Meta Attrs: resource-stickiness=100
  Operations: force-reload interval=0s timeout=15 (amavis-force-reload-interval-0s)
              monitor interval=15 timeout=15 (amavis-monitor-interval-15)
              restart interval=0s timeout=15 (amavis-restart-interval-0s)
              start interval=0s timeout=15 (amavis-start-interval-0s)
              stop interval=0s timeout=15 (amavis-stop-interval-0s)
 Resource: dovecot (class=service type=dovecot)
  Meta Attrs: resource-stickiness=100
  Operations: force-reload interval=0s timeout=15 (dovecot-force-reload-interval-0s)
              monitor interval=15 timeout=15 (dovecot-monitor-interval-15)
              restart interval=0s timeout=15 (dovecot-restart-interval-0s)
              start interval=0s timeout=15 (dovecot-start-interval-0s)
              stop interval=0s timeout=15 (dovecot-stop-interval-0s)
 Resource: openvpn (class=service type=openvpn)
  Meta Attrs: resource-stickiness=100 target-role=Stopped
  Operations: force-reload interval=0s timeout=15 (openvpn-force-reload-interval-0s)
              monitor interval=15 timeout=15 (openvpn-monitor-interval-15)
              restart interval=0s timeout=15 (openvpn-restart-interval-0s)
              start interval=0s timeout=15 (openvpn-start-interval-0s)
              stop interval=0s timeout=15 (openvpn-stop-interval-0s)

The DRBD resource is configured as follows:

root@node1# cat /etc/drbd.d/mail.res
resource mail {
  protocol  B;
  device    /dev/drbd0;
  disk      /dev/sys/mail;
  meta-disk internal;

  net {
    csums-alg sha1;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }

  handlers {
    fence-peer            "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target   "/usr/lib/drbd/crm-unfence-peer.sh";
    split-brain           "/usr/lib/drbd/notify-split-brain.sh admin at logit-ag.de";
  }

  on node1 {
    address 192.168.0.102:7789;
  }
  on node2 {
    address 192.168.0.103:7789;
  }
}

Best regards,

-- 
\   / |                       |
(OvO) |  Mikhail Iwanow                   |
(^^^) |                                   |
  \^/  |      E-mail:  ivans at logit-ag.de   |
  ^ ^  |                                   |
