[ClusterLabs] Resources always return to original node

Sat Sep 26 08:44:18 EDT 2020

26.09.2020 12:22, Michael Ivanov wrote:
> Hello,
> 
> I have a strange problem: when I reset the node on which my resources are running,
> they are correctly migrated to the other node. But when I bring the failed node
> back up, all resources move back to it as soon as it is online. I set the
> default resource-stickiness to 100. When that did not help, I also set the
> resource-stickiness meta attribute to 100 on all my resources. Still, when the
> failed node recovers, the resources are migrated back to it! Where should I look
> to understand what is happening?
> 

The first things to check are the location and colocation constraints.
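
As a rough sketch of that check (not a diagnosis of this particular cluster): a location constraint with a score above the stickiness, for example INFINITY, outweighs resource-stickiness=100. Constraints left behind by an earlier "pcs resource move" or "pcs resource ban" have ids beginning with cli-prefer- or cli-ban-. The helper name find_cli_constraints is made up for illustration, and VirtIP is just one resource name taken from the status output; the cluster commands are commented out because they only make sense on a live node.

```shell
#!/bin/sh
# Hypothetical helper: read a constraint listing on stdin and print only the
# lines that mention a cli-prefer-* or cli-ban-* constraint id. These ids are
# created by "pcs resource move" / "pcs resource ban" (crm_resource --move/--ban).
find_cli_constraints() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E 'cli-(prefer|ban)-' || true
}

# On a live cluster node (assumes pcs 0.10 and Pacemaker 2.0 as in the post):
# pcs constraint --full | find_cli_constraints   # leftover move/ban constraints
# crm_simulate -sL                               # allocation scores on the live CIB
# pcs resource clear VirtIP                      # remove a leftover constraint for VirtIP
```

If the filter prints nothing, the scores shown by crm_simulate -sL are the next place to look, since they show which node wins for each resource and by how much.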

> Here's the configuration of my cluster:
> 
> root@node1# pcs status
> Cluster name: gcluster
> Cluster Summary:
>    * Stack: corosync
>    * Current DC: node1 (version 2.0.4-2deceaa3ae) - partition with quorum
>    * Last updated: Sat Sep 26 11:12:34 2020
>    * Last change:  Sat Sep 26 10:39:16 2020 by root via cibadmin on node1
>    * 2 nodes configured
>    * 14 resource instances configured (1 DISABLED)
> 
> Node List:
>    * Online: [ node1 node2 ]
> 
> Full List of Resources:
>    * ilo5_node1    (stonith:fence_ilo5_ssh):     Started node2
>    * ilo5_node2    (stonith:fence_ilo5_ssh):     Started node1
>    * Resource Group: VirtIP:
>      * PrimaryIP    (ocf::heartbeat:IPaddr2):     Started node2
>      * PrimaryIP6    (ocf::heartbeat:IPv6addr):     Started node2
>      * AliasIP    (ocf::heartbeat:IPaddr2):     Started node2
>    * BackupFS    (ocf::redhat:netfs.sh):     Started node2
>    * Clone Set: MailVolume-clone [MailVolume] (promotable):
>      * Masters: [ node2 ]
>      * Slaves: [ node1 ]
>    * MailFS    (ocf::heartbeat:Filesystem):     Started node2
>    * apache    (ocf::heartbeat:apache):     Started node2
>    * postfix    (ocf::heartbeat:postfix):     Started node2
>    * amavis    (service:amavis):     Started node2
>    * dovecot    (service:dovecot):     Started node2
>    * openvpn    (service:openvpn):     Stopped (disabled)
> 
> And resources:
> 
> root@node1# pcs resource config
>   Group: VirtIP
>    Meta Attrs: resource-stickiness=100
>    Resource: PrimaryIP (class=ocf provider=heartbeat type=IPaddr2)
>     Attributes: cidr_netmask=16 ip=xx.xx.xx.20 nic=br0
>     Meta Attrs: resource-stickiness=100
>     Operations: monitor interval=30s (PrimaryIP-monitor-interval-30s)
>                 start interval=0s timeout=20s (PrimaryIP-start-interval-0s)
>                 stop interval=0s timeout=20s (PrimaryIP-stop-interval-0s)
>    Resource: PrimaryIP6 (class=ocf provider=heartbeat type=IPv6addr)
>     Attributes: cidr_netmask=64 ipv6addr=xxxx:xxxx:xxxx:xxxx:0:0:0:20 nic=br0
>     Meta Attrs: resource-stickiness=100
>     Operations: monitor interval=30s (PrimaryIP6-monitor-interval-30s)
>                 start interval=0s timeout=15s (PrimaryIP6-start-interval-0s)
>                 stop interval=0s timeout=15s (PrimaryIP6-stop-interval-0s)
>    Resource: AliasIP (class=ocf provider=heartbeat type=IPaddr2)
>     Attributes: cidr_netmask=16 ip=xx.xx.yy.20 nic=br0
>     Meta Attrs: resource-stickiness=100
>     Operations: monitor interval=30s (AliasIP-monitor-interval-30s)
>                 start interval=0s timeout=20s (AliasIP-start-interval-0s)
>                 stop interval=0s timeout=20s (AliasIP-stop-interval-0s)
>   Resource: BackupFS (class=ocf provider=redhat type=netfs.sh)
>    Attributes: export=/Backup/Gateway fstype=nfs host=atlas mountpoint=/Backup options=noatime,async
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=1m timeout=10 (BackupFS-monitor-interval-1m)
>                monitor interval=5m timeout=30 OCF_CHECK_LEVEL=10 (BackupFS-monitor-interval-5m)
>                monitor interval=10m timeout=30 OCF_CHECK_LEVEL=20 (BackupFS-monitor-interval-10m)
>                start interval=0s timeout=900 (BackupFS-start-interval-0s)
>                stop interval=0s timeout=30 (BackupFS-stop-interval-0s)
>   Clone: MailVolume-clone
>    Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 resource-stickiness=100
>    Resource: MailVolume (class=ocf provider=linbit type=drbd)
>     Attributes: drbd_resource=mail
>     Meta Attrs: resource-stickiness=100
>     Operations: demote interval=0s timeout=90 (MailVolume-demote-interval-0s)
>                 monitor interval=60s (MailVolume-monitor-interval-60s)
>                 notify interval=0s timeout=90 (MailVolume-notify-interval-0s)
>                 promote interval=0s timeout=90 (MailVolume-promote-interval-0s)
>                 reload interval=0s timeout=30 (MailVolume-reload-interval-0s)
>                 start interval=0s timeout=240 (MailVolume-start-interval-0s)
>                 stop interval=0s timeout=100 (MailVolume-stop-interval-0s)
>   Resource: MailFS (class=ocf provider=heartbeat type=Filesystem)
>    Attributes: device=/dev/drbd0 directory=/var/mail fstype=btrfs
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=20s timeout=40s (MailFS-monitor-interval-20s)
>                start interval=0s timeout=60s (MailFS-start-interval-0s)
>                stop interval=0s timeout=60s (MailFS-stop-interval-0s)
>   Resource: apache (class=ocf provider=heartbeat type=apache)
>    Attributes: client=wget statusurl=https://localhost/server-status
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=1min (apache-monitor-interval-1min)
>                start interval=0s timeout=40s (apache-start-interval-0s)
>                stop interval=0s timeout=60s (apache-stop-interval-0s)
>   Resource: postfix (class=ocf provider=heartbeat type=postfix)
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=60s timeout=20s (postfix-monitor-interval-60s)
>                reload interval=0s timeout=20s (postfix-reload-interval-0s)
>                start interval=0s timeout=20s (postfix-start-interval-0s)
>                stop interval=0s timeout=20s (postfix-stop-interval-0s)
>   Resource: amavis (class=service type=amavis)
>    Meta Attrs: resource-stickiness=100
>    Operations: force-reload interval=0s timeout=15 (amavis-force-reload-interval-0s)
>                monitor interval=15 timeout=15 (amavis-monitor-interval-15)
>                restart interval=0s timeout=15 (amavis-restart-interval-0s)
>                start interval=0s timeout=15 (amavis-start-interval-0s)
>                stop interval=0s timeout=15 (amavis-stop-interval-0s)
>   Resource: dovecot (class=service type=dovecot)
>    Meta Attrs: resource-stickiness=100
>    Operations: force-reload interval=0s timeout=15 (dovecot-force-reload-interval-0s)
>                monitor interval=15 timeout=15 (dovecot-monitor-interval-15)
>                restart interval=0s timeout=15 (dovecot-restart-interval-0s)
>                start interval=0s timeout=15 (dovecot-start-interval-0s)
>                stop interval=0s timeout=15 (dovecot-stop-interval-0s)
>   Resource: openvpn (class=service type=openvpn)
>    Meta Attrs: resource-stickiness=100 target-role=Stopped
>    Operations: force-reload interval=0s timeout=15 (openvpn-force-reload-interval-0s)
>                monitor interval=15 timeout=15 (openvpn-monitor-interval-15)
>                restart interval=0s timeout=15 (openvpn-restart-interval-0s)
>                start interval=0s timeout=15 (openvpn-start-interval-0s)
>                stop interval=0s timeout=15 (openvpn-stop-interval-0s)
> 
> drbd resource is configured as follows:
> 
> root@node1# cat /etc/drbd.d/mail.res
> resource mail {
>    protocol  B;
>    device    /dev/drbd0;
>    disk      /dev/sys/mail;
>    meta-disk internal;
> 
>    net {
>      csums-alg sha1;
>      after-sb-0pri discard-zero-changes;
>      after-sb-1pri discard-secondary;
>      after-sb-2pri disconnect;
>      rr-conflict disconnect;
>    }
> 
>    handlers {
>      fence-peer            "/usr/lib/drbd/crm-fence-peer.sh";
>      after-resync-target   "/usr/lib/drbd/crm-unfence-peer.sh";
>      split-brain           "/usr/lib/drbd/notify-split-brain.sh admin at logit-ag.de";
>    }
> 
>    on node1 {
>      address 192.168.0.102:7789;
>    }
>    on node2 {
>      address 192.168.0.103:7789;
>    }
> }
> 
> Best regards,
> 
> -- 
>   \   / |			           |
>   (OvO) |  Mikhail Iwanow                   |
>   (^^^) |                                   |
>    \^/  |      E-mail:ivans at logit-ag.de    |
>    ^ ^  |                                   |
> 