[ClusterLabs] Active/Passive Cluster restarting resources on healthy node and DRBD issues
TEG AMJG
tegamjg at gmail.com
Fri Jul 22 21:07:16 UTC 2016
Hi
I am having a problem with a very simple Active/Passive cluster using DRBD.
This is my configuration:
Cluster Name: kamcluster
Corosync Nodes:
 kam1vs3 kam2vs3
Pacemaker Nodes:
 kam1vs3 kam2vs3
Resources:
 Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.0.1.206 cidr_netmask=32
  Operations: start interval=0s timeout=20s (ClusterIP-start-interval-0s)
              stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
              monitor interval=10s (ClusterIP-monitor-interval-10s)
 Resource: ClusterIP2 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.0.1.207 cidr_netmask=32
  Operations: start interval=0s timeout=20s (ClusterIP2-start-interval-0s)
              stop interval=0s timeout=20s (ClusterIP2-stop-interval-0s)
              monitor interval=10s (ClusterIP2-monitor-interval-10s)
 Resource: rtpproxycluster (class=systemd type=rtpproxy)
  Operations: monitor interval=10s (rtpproxycluster-monitor-interval-10s)
              stop interval=0s on-fail=fence (rtpproxycluster-stop-interval-0s)
 Resource: kamailioetcfs (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd1 directory=/etc/kamailio fstype=ext4
  Operations: start interval=0s timeout=60 (kamailioetcfs-start-interval-0s)
              monitor interval=10s on-fail=fence (kamailioetcfs-monitor-interval-10s)
              stop interval=0s on-fail=fence (kamailioetcfs-stop-interval-0s)
 Clone: fence_kam2_xvm-clone
  Meta Attrs: interleave=true clone-max=2 clone-node-max=1
  Resource: fence_kam2_xvm (class=stonith type=fence_xvm)
   Attributes: port=tegamjg_kam2 pcmk_host_list=kam2vs3
   Operations: monitor interval=60s (fence_kam2_xvm-monitor-interval-60s)
 Master: kamailioetcclone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: kamailioetc (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=kamailioetc
   Operations: start interval=0s timeout=240 (kamailioetc-start-interval-0s)
               promote interval=0s timeout=90 (kamailioetc-promote-interval-0s)
               demote interval=0s timeout=90 (kamailioetc-demote-interval-0s)
               stop interval=0s timeout=100 (kamailioetc-stop-interval-0s)
               monitor interval=10s (kamailioetc-monitor-interval-10s)
 Resource: kamailiocluster (class=ocf provider=heartbeat type=kamailio)
  Attributes: listen_address=10.0.1.206 conffile=/etc/kamailio/kamailio.cfg pidfile=/var/run/kamailio.pid monitoring_ip=10.0.1.206 monitoring_ip2=10.0.1.207 port=5060 proto=udp kamctlrc=/etc/kamailio/kamctlrc
  Operations: start interval=0s timeout=60 (kamailiocluster-start-interval-0s)
              stop interval=0s on-fail=fence (kamailiocluster-stop-interval-0s)
              monitor interval=5s (kamailiocluster-monitor-interval-5s)
 Clone: fence_kam1_xvm-clone
  Meta Attrs: interleave=true clone-max=2 clone-node-max=1
  Resource: fence_kam1_xvm (class=stonith type=fence_xvm)
   Attributes: port=tegamjg_kam1 pcmk_host_list=kam1vs3
   Operations: monitor interval=60s (fence_kam1_xvm-monitor-interval-60s)
Stonith Devices:
Fencing Levels:
Location Constraints:
  Resource: kamailiocluster
    Enabled on: kam1vs3 (score:INFINITY) (role: Started) (id:cli-prefer-kamailiocluster)
Ordering Constraints:
  start ClusterIP then start ClusterIP2 (kind:Mandatory) (id:order-ClusterIP-ClusterIP2-mandatory)
  start ClusterIP2 then start rtpproxycluster (kind:Mandatory) (id:order-ClusterIP2-rtpproxycluster-mandatory)
  start fence_kam2_xvm-clone then promote kamailioetcclone (kind:Mandatory) (id:order-fence_kam2_xvm-clone-kamailioetcclone-mandatory)
  promote kamailioetcclone then start kamailioetcfs (kind:Mandatory) (id:order-kamailioetcclone-kamailioetcfs-mandatory)
  start kamailioetcfs then start ClusterIP (kind:Mandatory) (id:order-kamailioetcfs-ClusterIP-mandatory)
  start rtpproxycluster then start kamailiocluster (kind:Mandatory) (id:order-rtpproxycluster-kamailiocluster-mandatory)
  start fence_kam1_xvm-clone then start fence_kam2_xvm-clone (kind:Mandatory) (id:order-fence_kam1_xvm-clone-fence_kam2_xvm-clone-mandatory)
Colocation Constraints:
  rtpproxycluster with ClusterIP2 (score:INFINITY) (id:colocation-rtpproxycluster-ClusterIP2-INFINITY)
  ClusterIP2 with ClusterIP (score:INFINITY) (id:colocation-ClusterIP2-ClusterIP-INFINITY)
  ClusterIP with kamailioetcfs (score:INFINITY) (id:colocation-ClusterIP-kamailioetcfs-INFINITY)
  kamailioetcfs with kamailioetcclone (score:INFINITY) (with-rsc-role:Master) (id:colocation-kamailioetcfs-kamailioetcclone-INFINITY)
  kamailioetcclone with fence_kam2_xvm-clone (score:INFINITY) (id:colocation-kamailioetcclone-fence_kam2_xvm-clone-INFINITY)
  kamailiocluster with rtpproxycluster (score:INFINITY) (id:colocation-kamailiocluster-rtpproxycluster-INFINITY)
  fence_kam2_xvm-clone with fence_kam1_xvm-clone (score:INFINITY) (id:colocation-fence_kam2_xvm-clone-fence_kam1_xvm-clone-INFINITY)
Resources Defaults:
 migration-threshold: 2
 failure-timeout: 10m
 resource-stickiness: 200
Operations Defaults:
 No defaults set
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: kamcluster
 dc-version: 1.1.13-10.el7_2.2-44eb2dd
 have-watchdog: false
 last-lrm-refresh: 1469123600
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-action: reboot
 stonith-enabled: true
The problem is that when I have only one node online in corosync and start
the other node so that it rejoins the cluster, all my resources restart and
sometimes even migrate to the other node (beginning with a change in which
node is promoted to master and which is slave), even though the first node is
healthy and I use resource-stickiness=200 as a default for all resources
in the cluster.
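One thing that may be worth checking against the symptom described above is how a stickiness of 200 compares with the INFINITY location preference (cli-prefer-kamailiocluster) listed in the configuration. The following is a minimal Python sketch of Pacemaker's score arithmetic (INFINITY is 1,000,000, and -INFINITY wins over everything). It is not the policy engine itself: the helper names add_scores and node_scores are hypothetical, and colocation and promotion scores are ignored entirely.

```python
INFINITY = 1_000_000  # Pacemaker treats +/-1,000,000 as infinite


def add_scores(a, b):
    """Add two scores using Pacemaker's infinity rules."""
    # -INFINITY beats everything, including +INFINITY.
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Finite results are clamped to the infinite range.
    return max(-INFINITY, min(INFINITY, a + b))


# Values taken from the configuration in this post.
STICKINESS = 200
LOCATION_PREF = {"kam1vs3": INFINITY, "kam2vs3": 0}  # cli-prefer-kamailiocluster


def node_scores(current_node):
    """Simplified per-node score: location preference, plus stickiness
    on the node where the resource is currently running."""
    return {
        node: add_scores(pref, STICKINESS if node == current_node else 0)
        for node, pref in LOCATION_PREF.items()
    }


# Assumed scenario: the resource is running on kam2vs3 when kam1vs3 rejoins.
scores = node_scores("kam2vs3")
preferred = max(scores, key=scores.get)
print(scores, "->", preferred)
```

Under these simplifying assumptions, a resource running on kam2vs3 still scores higher on kam1vs3, so stickiness alone would not be expected to hold it in place. Whether this is what actually happens here would need to be confirmed against the cluster's own allocation scores, for example from crm_simulate -sL.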
I believe it has something to do with the promotion constraint on the
DRBD master/slave resource.
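To make that suspicion easier to examine, the ordering constraints from the configuration can be walked as a graph. The Python sketch below only restates the (first, then) pairs listed in this post and follows them from the one resource that nothing else precedes; the function name start_chain is hypothetical, and it assumes the constraints form a single linear chain, which holds for this configuration.

```python
# Ordering constraints copied from this post as (first, then) pairs.
ORDER = [
    ("ClusterIP", "ClusterIP2"),
    ("ClusterIP2", "rtpproxycluster"),
    ("fence_kam2_xvm-clone", "kamailioetcclone"),
    ("kamailioetcclone", "kamailioetcfs"),
    ("kamailioetcfs", "ClusterIP"),
    ("rtpproxycluster", "kamailiocluster"),
    ("fence_kam1_xvm-clone", "fence_kam2_xvm-clone"),
]


def start_chain(pairs):
    """Follow the constraints from the resource that no constraint
    orders after another one, returning the full start sequence."""
    nxt = dict(pairs)
    followers = set(nxt.values())
    heads = [first for first in nxt if first not in followers]
    if len(heads) != 1:
        raise ValueError("constraints do not form a single chain")
    node = heads[0]
    chain = [node]
    while node in nxt:
        node = nxt[node]
        chain.append(node)
    return chain


chain = start_chain(ORDER)
print(" -> ".join(chain))
```

The result places both fencing clones at the head of the chain, ahead of the DRBD promotion and every service resource. Whether starting a fencing clone instance on the rejoining node is what triggers the restarts is only a hypothesis here, not a confirmed diagnosis.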
Thank you very much in advance.
Regards.
Alejandro